MMLU
15,908 multiple-choice questions across 57 subjects, ranging from elementary to professional level.
Benchmark Stats
Models: 6
Papers: 6
Metrics: 1
accuracy
Higher is better
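The scores in the table appear to be accuracy percentages. As a minimal sketch, assuming each question has exactly one correct choice and that the score is the plain share of questions answered correctly, accuracy could be computed as shown here. The page does not state whether results are averaged per subject or which prompting setup was used, so this is an illustration and not the leaderboard's actual scoring code. The function name `accuracy` and the toy data are hypothetical.

```python
def accuracy(predictions, answers):
    """Return the percentage of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Count exact matches between predicted and reference choices.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical toy data: four questions with answer choices A-D.
predicted = ["A", "C", "B", "D"]
reference = ["A", "C", "D", "D"]
print(f"{accuracy(predicted, reference):.1f}")  # 3 of 4 correct -> 75.0
```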
| Rank | Model | Code | Score | Paper / Source |
|---|---|---|---|---|
| 1 | o1-preview | - | 92.3 | openai-blog |
| 2 | gpt-4o | - | 88.7 | openai-blog |
| 3 | claude-35-sonnet | - | 88.7 | anthropic-blog |
| 4 | deepseek-v3 | - | 88.5 | deepseek-blog |
| 5 | gemini-15-pro | - | 85.9 | google-blog |
| 6 | llama-3-70b | HF | 82 | meta-blog |