olmOCR-Bench

Allen Institute for AI

7,010 unit tests across 1,402 PDF documents. Tests parsing of tables, math, multi-column layouts, old scans, and more.

Benchmark Stats

Models: 17
Papers: 28
Metrics: 9

base

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | Editorial | 99.9 | 2025 | Parsing of clean base documents; near-perfect score. |

headers-footers

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | olmocr-v0.3.0 | Editorial | 95.1 | 2025 | Top score for header/footer extraction. |
| 2 | chandra-ocr-0.1.0 | Editorial | 90.8 | 2025 | Header/footer extraction. |

long-tiny-text

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | Editorial | 92.3 | 2025 | Long documents with tiny text; top score in this category. |

tables

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | dots-ocr-3b | Editorial | 88.3 | 2025 | Top score for table recognition. |
| 2 | chandra-ocr-0.1.0 | Editorial | 88.0 | 2025 | Table recognition; 0.3 points behind dots-ocr-3b (88.3). |

arxiv

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | marker-1.10.0 | Editorial | 83.8 | 2025 | Top score for arXiv paper parsing. |
| 2 | chandra-ocr-0.1.0 | Editorial | 82.2 | 2025 | arXiv paper parsing; 1.6 points behind marker-1.10.0 (83.8). |

Pass Rate

Percentage of unit tests passed

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | Editorial | 83.1 | 2025 | #1 overall on olmOCR-Bench. |
| 2 | infinity-parser-7b | Editorial | 82.5 | 2025 | |
| 3 | olmocr-v0.4.0 | Editorial | 82.4 | 2025 | |
| 4 | paddleocr-vl | Editorial | 80.0 | 2025 | |
| 5 | Qianfan-OCR | Editorial | 79.8 | 2026 | Baidu Qianfan-OCR 4B (Qwen3-4B + Qianfan-ViT); Apache 2.0; 192 languages; Layout-as-Thought. |
| 6 | dots-ocr-3b | Editorial | 79.1 | 2025 | |
| 7 | mistral-ocr-3 | Editorial | 78.0 | 2025 | Estimated, not measured: based on a 74% win rate vs. OCR 2. |
| 8 | marker-1.10.0 | Editorial | 76.5 | 2025 | |
| 9 | marker-1.10.1 | Editorial | 76.1 | 2025 | |
| 10 | deepseek-ocr | Editorial | 75.7 | 2025 | |
| 11 | mineru-2.5 | Editorial | 75.2 | 2025 | |
| 12 | mistral-ocr-api | Editorial | 72.0 | 2025 | |
| 13 | gpt-4o-anchored | Editorial | 69.9 | 2025 | GPT-4o with anchored prompting. |
| 14 | nanonets-ocr2-3b | Editorial | 69.5 | 2025 | |
| 15 | gemini-flash-2 | Editorial | 63.8 | 2025 | |
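
The Pass Rate metric is described above as the percentage of unit tests passed. As a rough illustration, the sketch below computes that percentage overall and per category. The `UnitTestResult` record, the function names, and the sample outcomes are all assumptions made for this example; they are not the benchmark's actual data format or tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitTestResult:
    """One unit test outcome (hypothetical record shape, not the benchmark's)."""
    category: str  # e.g. "tables", "arxiv", "old-scans"
    passed: bool


def pass_rate(results):
    """Return the percentage of unit tests that passed, on a 0-100 scale."""
    if not results:
        raise ValueError("pass rate is undefined for an empty result set")
    passed = sum(1 for r in results if r.passed)
    return 100.0 * passed / len(results)


def pass_rate_by_category(results):
    """Group results by category and compute the pass rate of each group."""
    grouped = defaultdict(list)
    for r in results:
        grouped[r.category].append(r)
    return {category: pass_rate(group) for category, group in grouped.items()}


# Made-up outcomes for illustration only; these are not real benchmark results.
sample = [
    UnitTestResult("tables", True),
    UnitTestResult("tables", True),
    UnitTestResult("tables", False),
    UnitTestResult("arxiv", True),
]

print(f"overall: {pass_rate(sample):.1f}")
for category, rate in pass_rate_by_category(sample).items():
    print(f"{category}: {rate:.1f}")
```

Note that pooling all tests and averaging the per-category rates can give different numbers when categories differ in size. This page does not say which aggregation the leaderboard's overall score uses, so the pooled version here is only one plausible reading.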

multi-column

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | Editorial | 81.2 | 2025 | Multi-column document parsing. |

old-scans-math

Higher is better

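| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | Editorial | 80.3 | 2025 | Mathematical notation in old scans; leads the runner-up listed here (olmocr-v0.3.0, 79.9) by 0.4 points. |
| 2 | olmocr-v0.3.0 | Editorial | 79.9 | 2025 | Second on math in old scans. |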

old-scans

Higher is better

| Rank | Model | Source | Score | Year | Notes |
|---|---|---|---|---|---|
| 1 | chandra-ocr-0.1.0 | Editorial | 50.4 | 2025 | Old-scan recognition; leads gpt-4o (40.7) by 9.7 points. |
| 2 | gpt-4o | Editorial | 40.7 | 2025 | Second on old scans; 9.7 points behind chandra-ocr-0.1.0. |
