Document Parsing

Parsing document structure and content

Datasets: 2
Results: 75
Canonical metric: composite

Canonical Benchmark

OmniDocBench

981 annotated PDF pages across 9 document categories. Tests end-to-end document parsing including text, tables, and formulas.

Primary metric: composite

Top 10

Leading models on OmniDocBench.

Rank  Model              layout-map  Year  Source
1     mineru-2.5         97.5        2025  paper
2     GLM-OCR            94.6        2026  paper
3     PaddleOCR-VL-1.5   94.5        2026  paper
4     paddleocr-vl       93.5        2025  paper
5     Qianfan-OCR        93.1        2026  paper
6     paddleocr-vl       92.9        2025  paper
7     paddleocr-vl-0.9b  92.6        2025  paper
8     Qianfan-OCR        92.4        2026  paper
9     mistral-ocr-3      91.6        2025  paper
10    Qianfan-OCR        91.0        2026  paper

All datasets

2 datasets tracked for this task.

Related tasks

Other tasks in Computer Vision.