Best OCR Models 2026
Compare 158+ models across 86 benchmarks. Open source vs vendor APIs. Real results from OmniDocBench, OCRBench v2, and olmOCR — independently verified.
158+ models tracked · 86 benchmarks · 349 verified results · 92.86 SOTA (OmniDocBench)
Quick Answers
- 92.86 on OmniDocBench — top open-source model (PaddleOCR-VL)
- 0.02 edit distance — top API
- 88.41 composite across 100+ languages (dots.ocr)
- 62.2% on OCRBench v2 Chinese
- Near-top accuracy in a tiny footprint
- Apache 2.0, leads all open-source benchmarks
What are you trying to extract?
Pick your document type. See what actually works.
Invoices & Receipts
Line items, totals, vendor info → structured data
Handwritten Notes
Forms, signatures, meeting notes, historical docs
PDFs & Reports
Multi-page documents, layout, tables, headers
Photos & Screenshots
Camera captures, screen grabs, social media
Scanned Books
Digitize printed text, old documents, archives
ID Cards & Passports
KYC verification, identity documents, MRZ codes
Performance at a Glance
Visual comparison of accuracy, cost, and cross-benchmark coverage.
Accuracy — Top 10 Models (OmniDocBench)

Cost — Price per 1,000 Pages

Cross-Benchmark Heatmap — 12 Models × 8 Benchmarks

We Run Our Own Benchmarks
No vendor claims. Real results. Independently verified.
Full datasets. Official evaluation tools. Reproducible results. 1,355 images processed at $2.71 total cost.
Decision Tools
Model Comparator
Interactive side-by-side. Select 2–4 models, see failure modes.
Decision Guide
Failure taxonomy, decision matrix, what breaks in production.
Enterprise Toolkit
RFP templates, procurement checklists, vendor risk registers.
Vendor Partners
Get independently benchmarked. Build enterprise trust.
Open Source OCR Benchmark
Run on your own servers. No API costs. Full data privacy.
| Model | OmniDocBench | OCRBench (EN) | olmOCR | License |
|---|---|---|---|---|
| PaddleOCR-VL Baidu | 92.86 | — | 80.0 | Apache 2.0 |
| PaddleOCR-VL 0.9B Baidu | 92.56 | — | — | Apache 2.0 |
| MinerU 2.5 OpenDataLab | 90.67 | — | 75.2 | AGPL-3.0 |
| Qwen3-VL-235B Alibaba | 89.15 | — | — | Qwen License |
| MonkeyOCR-pro-3B Unknown | 88.85 | — | — | Apache 2.0 / MIT |
| OCRVerse 4B Unknown | 88.56 | — | — | Apache 2.0 / MIT |
| dots.ocr 3B RedNote HILab | 88.41 | — | 79.1 | Apache 2.0 |
| Qwen2.5-VL Alibaba | 87.02 | — | — | Apache 2.0 |
| Chandra v0.1.0 datalab-to | — | — | 83.1 | Apache 2.0 |
| Infinity-Parser 7B Unknown | — | — | 82.5 | Apache 2.0 / MIT |
| olmOCR v0.4.0 Allen AI | — | — | 82.4 | Apache 2.0 |
| Marker 1.10.0 VikParuchuri | — | — | 76.5 | Apache 2.0 / MIT |
| Marker 1.10.1 VikParuchuri | — | — | 76.1 | Apache 2.0 / MIT |
| DeepSeek OCR DeepSeek | — | — | 75.4 | Apache 2.0 / MIT |
| GPT-4o (Anchored) OpenAI | — | — | 69.9 | Proprietary |
| Nanonets OCR2 3B Nanonets | — | — | 69.5 | Apache 2.0 / MIT |
| Gemini Flash 2 Google | — | — | 63.8 | Proprietary |
| Qwen3-Omni-30B Alibaba | — | 61.3% | — | Qwen License |
| Nemotron Nano V2 VL NVIDIA | — | 61.2% | — | NVIDIA Open Model License |
| GPT-4o Mini OpenAI | — | 44.1% | — | Proprietary |
| Qwen2.5-VL 72B Alibaba | — | — | — | Apache 2.0 |
| CHURRO (3B) Stanford | — | — | — | Apache 2.0 / MIT |
| InternVL2-76B Shanghai AI Lab | — | — | — | MIT |
| InternVL3-78B Shanghai AI Lab | — | — | — | Apache 2.0 / MIT |
| Tesseract Google (Open Source) | — | — | — | Apache 2.0 |
| EasyOCR JaidedAI | — | — | — | Apache 2.0 |
| Gemini 2.5 Flash Google | — | — | — | Proprietary |
| olmOCR v0.3.0 Allen AI | — | — | — | Apache 2.0 / MIT |
| Qwen2-VL 72B Alibaba | — | — | — | Apache 2.0 / MIT |
| Qwen2.5-VL 32B Alibaba | — | — | — | Apache 2.0 / MIT |
| AIN 7B Research | — | — | — | Apache 2.0 / MIT |
| Azure OCR Microsoft | — | — | — | Proprietary |
| PaddleOCR Baidu | — | — | — | Apache 2.0 / MIT |
| InternVL3 14B OpenGVLab | — | — | — | Apache 2.0 / MIT |
- Sensitive data that can't leave your network
- High volume processing (no per-page costs)
- Offline / air-gapped environments
- Full control over the pipeline
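For teams going the self-hosted route, the flow is: run an open-source engine such as PaddleOCR locally, then post-process its nested result into plain text. A minimal sketch, assuming the classic PaddleOCR result shape — a list of pages, each a list of `[box, (text, confidence)]` entries; verify against the version you install, as the API has changed between releases:

```python
def flatten_result(result, min_conf=0.5):
    """Flatten a PaddleOCR-style result into plain text.

    Keeps lines at or above a confidence threshold, in detection order.
    `result` is a list of pages; each page is a list of (box, (text, conf)).
    """
    lines = []
    for page in result:
        for box, (text, conf) in page:
            if conf >= min_conf:
                lines.append(text)
    return "\n".join(lines)

# Usage (requires `pip install paddleocr`; weights download on first run):
#   from paddleocr import PaddleOCR
#   ocr = PaddleOCR(lang="en")
#   print(flatten_result(ocr.ocr("invoice.png")))
```

The confidence threshold is where most self-hosted pipelines differ: a high cutoff drops smudged lines silently, a low one passes noise downstream, so tune it against your own documents.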
Vendor API Benchmark
Pay per page. Fast to integrate. Enterprise support available.
| Vendor | OmniDocBench | OCRBench (EN) | olmOCR | Price/1k pages |
|---|---|---|---|---|
| Gemini 2.5 Pro Google | 88.03 | 59.3% | — | varies |
| Mistral OCR 3 Mistral | 79.75 | — | 78.0 | varies |
| Mistral OCR 2 Mistral | — | — | 72.0 | varies |
| Seed1.6-vision ByteDance | — | 62.2% | — | varies |
| GPT-4o OpenAI | — | 55.5% | — | varies |
| Claude Sonnet 4 Anthropic | — | 42.4% | — | varies |
| clearOCR TeamQuest | 31.70 | — | — | varies |
| Gemini 2.0 Flash Google | — | — | — | varies |
| Gemini 1.5 Pro Google | — | — | — | varies |
| Claude 3.5 Sonnet Anthropic | — | — | — | varies |
| o1 OpenAI | — | — | — | varies |
| o1-mini OpenAI | — | — | — | varies |
| o3 OpenAI | — | — | — | varies |
| o3-mini OpenAI | — | — | — | varies |
| o4-mini OpenAI | — | — | — | varies |
| GPT-4.1 OpenAI | — | — | — | varies |
| GPT-4.5 Preview OpenAI | — | — | — | varies |
| Claude 3.7 Sonnet Anthropic | — | — | — | varies |
| Grok 2 xAI | — | — | — | varies |
| Claude 3 Opus Anthropic | — | — | — | varies |
| GPT-4 Turbo OpenAI | — | — | — | varies |
| Claude Sonnet 4.5 Anthropic | — | — | — | varies |
- Need reasoning / context understanding (GPT-4o, Gemini)
- Low volume, occasional use
- Need enterprise SLA / support
- No infrastructure to maintain
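Integrating a vendor API typically means sending the page as a base64 image with a transcription prompt. A sketch using the documented OpenAI chat-completions image-input format; the model name and prompt are illustrative, not a recommendation from the benchmarks above:

```python
import base64

def build_ocr_request(image_bytes, model="gpt-4o"):
    """Build a chat-completions payload asking a vision model to transcribe a page."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Transcribe all text in this image as Markdown."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

# Usage (requires `pip install openai` and an API key in the environment):
#   from openai import OpenAI
#   req = build_ocr_request(open("page.png", "rb").read())
#   resp = OpenAI().chat.completions.create(**req)
#   print(resp.choices[0].message.content)
```

Other vendors (Gemini, Claude, Mistral OCR) use different request shapes, but the pattern — image in, structured text out, billed per call — is the same.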
Cross-Benchmark Champions
Models that perform well across multiple OCR benchmarks — not just one.
Deep Dives & Techniques
The OCR Economics Shift
167:1 — self-hosted VLM OCR is now better and 167× cheaper than vendor APIs. The October 2025 inflection point.
How Docling Works
Architecture of IBM's document understanding library. Why VLM pipelines outperform traditional OCR.
Interactive OCR Correction
Handling OCR “flicker” (H vs N) and camera drift in mobile apps. Google MLKit + centroid anchoring.
OCR Benchmarks Directory
26 benchmarks across document parsing, handwriting, video OCR, scene text, and multilingual tasks.
Rys OCR
Polish SOTA — 71% CER reduction on Polish diacritics. LoRA fine-tune of PaddleOCR-VL. Apache 2.0.
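The "flicker" problem from the Interactive OCR Correction guide above — a live camera frame reading "H" one frame and "N" the next — can be damped with centroid anchoring. A sketch of the idea, independent of any particular OCR SDK (the matching radius and vote count are illustrative, not values from the guide):

```python
import math

class AnchoredText:
    """Track detected text boxes across frames by nearest centroid.

    A new reading only replaces a track's text after it repeats, so
    single-frame recognition flicker never reaches the UI, while the
    centroid update follows slow camera drift.
    """

    def __init__(self, match_radius=30.0):
        self.match_radius = match_radius
        self.tracks = []  # {"centroid", "text", "pending", "votes"}

    def update(self, detections):
        """detections: list of ((cx, cy), text) for the current frame."""
        for centroid, text in detections:
            track = self._nearest(centroid)
            if track is None:
                self.tracks.append({"centroid": centroid, "text": text,
                                    "pending": text, "votes": 0})
                continue
            track["centroid"] = centroid      # follow camera drift
            if text == track["text"]:
                track["votes"] = 0            # stable reading: reset challenger
            elif text == track["pending"]:
                track["votes"] += 1
                if track["votes"] >= 2:       # challenger confirmed: accept
                    track["text"] = text
                    track["votes"] = 0
            else:
                track["pending"], track["votes"] = text, 1
        return [t["text"] for t in self.tracks]

    def _nearest(self, centroid):
        best, best_d = None, self.match_radius
        for t in self.tracks:
            d = math.dist(centroid, t["centroid"])
            if d < best_d:
                best, best_d = t, d
        return best
```

In a mobile app the same logic runs per camera frame against the SDK's detection output (e.g. MLKit text blocks), with the anchored text rendered instead of the raw per-frame result.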
You Know the Best OCR Model. Now Ship It.
3 questions, one recommendation, copy-paste code that runs in 10 minutes.
Run this OCR on your Mac — $25, one-time
Hardparse runs PaddleOCR-VL locally via Metal. No cloud, no subscription. Tables, formulas, 109 languages.
All OCR Content
Model Reviews
Comparisons
Frequently Asked Questions
What is the best OCR model in 2026?
Which OCR model has the best English text recognition?
Is open-source OCR better than paid APIs in 2026?
Which OCR is best for invoices and receipts?
How much does OCR cost per page?
Have benchmark results?
Submit your paper or results. We verify and add them to our database.
Submit Paper

Get OCR updates
New models, benchmark results, and practical guides.
No spam. Unsubscribe anytime.
About This Data
All benchmark results sourced from AlphaXiv leaderboards, published papers, and our own independent verification. Each data point includes source URL and access date.
Results marked “pending verification” are claimed in papers but not independently confirmed. We do not include estimated or interpolated values.