olmOCR-Bench
Allen Institute for AI
7,010 unit tests across 1,402 PDF documents. Tests parsing of tables, math, multi-column layouts, old scans, and more.
Benchmark Stats
Models: 16
Papers: 28
Metrics: 9
Pass Rate
Percentage of unit tests passed
Higher is better
| Rank | Model | Code | Score | Paper / Source |
|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 (#1 overall on olmOCR-Bench) | HF | 83.1% | AlphaXiv |
| 2 | infinity-parser-7b | - | 82.5% | AlphaXiv |
| 3 | olmocr-v0.4.0 | | 82.4% | AlphaXiv |
| 4 | paddleocr-vl | | 80% | AlphaXiv |
| 5 | dots-ocr-3b | | 79.1% | GitHub |
| 6 | mistral-ocr-3 (estimated from a 74% win rate vs. OCR 2) | - | 78% | mistral-announcement |
| 7 | marker-1.10.0 | | 76.5% | GitHub |
| 8 | marker-1.10.1 | | 76.1% | AlphaXiv |
| 9 | deepseek-ocr | - | 75.7% | AlphaXiv |
| 10 | deepseek-ocr (Chandra outperforms by 7.7 points) | - | 75.4% | GitHub |
| 11 | mineru-2.5 | | 75.2% | AlphaXiv |
| 12 | mistral-ocr-api | - | 72% | AlphaXiv |
| 13 | gpt-4o-anchored (GPT-4o with anchored prompting) | - | 69.9% | GitHub |
| 14 | nanonets-ocr2-3b | - | 69.5% | AlphaXiv |
| 15 | gemini-flash-2 | - | 63.8% | GitHub |
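As a rough illustration of how a pass-rate score and the point gaps quoted in the table can be computed, here is a minimal Python sketch. It assumes the score is a plain ratio of passed tests to total tests; this page does not say whether the benchmark aggregates per category first. The function name `pass_rate` and the passed-test count are hypothetical and used only for illustration.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the percentage of unit tests passed, rounded to one decimal.

    Assumption: a simple passed/total ratio. The benchmark's own
    aggregation may differ.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return round(100.0 * passed / total, 1)


# Hypothetical count of passed tests out of the 7,010 in the benchmark.
example_score = pass_rate(3505, 7010)
print(example_score)

# Point gap between two leaderboard entries, using scores from the table:
# chandra-ocr-0.1.0 (83.1%) and the rank-10 deepseek-ocr entry (75.4%).
gap = round(83.1 - 75.4, 1)
print(gap)
```

Rounding the difference to one decimal avoids floating-point noise and matches the precision at which the leaderboard reports scores.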
Other metrics (higher is better for each):
tables
old-scans-math
long-tiny-text
base
headers-footers
multi-column
arxiv
old-scans