- Best for document parsing (invoices, forms):
- PaddleOCR-VL - 92.86 on OmniDocBench, open source
- Best for pure text extraction:
- GPT-4o - 0.02 edit distance, API-based
- Best open source all-rounder:
- Qwen2.5-VL - strong across all benchmarks, runs locally
- Best for Chinese documents:
- Gemini 2.5 Pro - 62.2% on OCRBench v2 Chinese
- Best lightweight option:
- PaddleOCR-VL 0.9B - near top performance, smaller footprint
- Best free OCR library:
- PaddleOCR - Apache 2.0 license, leads open-source benchmarks
Updated December 2025. All results independently verified. See the methodology.
For teams choosing OCR for production documents (EU languages, real scans): which OCR stack minimizes manual review cost for your documents?
100% independent. No vendor investment. We run our own benchmarks.
Read the Decision Guide →
What are you trying to extract?
Pick your document type. See what actually works.
Invoices & Receipts
Extract line items, totals, vendor info into structured data
Handwritten Notes
Forms, signatures, meeting notes, historical documents
PDFs & Reports
Multi-page documents, preserve layout, tables, headers
Photos & Screenshots
Camera captures, screen grabs, social media images
Scanned Books
Digitize printed text, old documents, archives
ID Cards & Passports
KYC verification, identity documents, MRZ codes
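The MRZ codes mentioned above use a fixed check-digit scheme (ICAO Doc 9303): digits keep their value, letters A–Z map to 10–35, the filler `<` counts as 0, each value is multiplied by the repeating weights 7, 3, 1, and the sum modulo 10 is the check digit. A minimal stdlib sketch of a post-OCR validation step (the function name is ours, not from any library):

```python
def mrz_check_digit(field: str) -> int:
    """ICAO 9303 check digit: weights 7,3,1 repeating, sum mod 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch.isdigit():
            value = int(ch)
        elif ch.isalpha():
            value = ord(ch.upper()) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10

# 7*7 + 4*3 + 0*1 + 8*7 + 1*3 + 2*1 = 122 -> check digit 2
print(mrz_check_digit("740812"))
```

Comparing the computed digit against the one OCR actually read is a cheap way to catch misrecognized characters before a document reaches manual review.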
We Run Our Own Benchmarks
No vendor claims. Real results. Independently verified.
While others copy numbers from marketing pages, we run the actual benchmarks ourselves. Full datasets. Official evaluation tools. Reproducible results.
Do you have constraints?
Model Comparator
Interactive side-by-side comparison. Select 2-4 models, see failure modes.
Decision Guide
Failure taxonomy, decision matrix, what actually breaks in production.
Enterprise Toolkit
RFP templates, procurement checklists, risk registers for vendor selection.
Vendor Partners
For OCR vendors: get independently benchmarked. Build trust with enterprises.
Deep Dives & Techniques
The OCR Economics Shift
167:1: Self-hosted VLM-OCR is now better and 167x cheaper than vendor APIs. Interactive deep dive into the October 2025 inflection point.
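The 167:1 figure is this site's own claim; the break-even arithmetic behind any such ratio is simple to reproduce. A hedged sketch — the GPU price, throughput, and API price below are illustrative assumptions, not measurements:

```python
# Illustrative break-even: self-hosted VLM-OCR vs a metered vendor API.
# Every number here is an assumption for the sake of the arithmetic.
GPU_COST_PER_HOUR = 2.00        # e.g. one rented datacenter GPU
PAGES_PER_SECOND = 1.5          # batched VLM-OCR throughput
API_PRICE_PER_1K_PAGES = 10.00  # metered vendor pricing

pages_per_hour = PAGES_PER_SECOND * 3600
self_hosted_per_1k = GPU_COST_PER_HOUR / pages_per_hour * 1000
ratio = API_PRICE_PER_1K_PAGES / self_hosted_per_1k

print(f"self-hosted: ${self_hosted_per_1k:.3f} per 1k pages")
print(f"API:         ${API_PRICE_PER_1K_PAGES:.2f} per 1k pages")
print(f"ratio:       {ratio:.0f}x")
```

Plug in your own measured throughput and prices; depending on batch size, GPU pricing, and the vendor tier, ratios anywhere from ~10x to well over 100x fall out of the same three inputs.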
How Docling Works
Understand the architecture of IBM's document understanding library. Why VLM pipelines outperform traditional OCR for complex layouts.
Interactive OCR Correction
A case study on handling OCR "flicker" (H vs N) and camera drift in mobile apps using Google MLKit and centroid anchoring.
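The flicker problem described here ("H" read one frame, "N" the next) is typically solved by tracking: match each detection to an existing track whose bounding-box centroid is nearby, then majority-vote the recognized text over the track's recent history. A minimal sketch under those assumptions — the class, threshold, and history length are illustrative, not MLKit API:

```python
from collections import Counter

class CentroidAnchor:
    """Stabilize per-frame OCR results by anchoring on box centroids."""

    def __init__(self, max_dist: float = 20.0, history: int = 10):
        self.max_dist = max_dist      # max centroid jump to stay on a track
        self.history = history        # how many frames to vote over
        self.tracks = []              # each: {"centroid": (x, y), "texts": [...]}

    def update(self, detections):
        """detections: list of (text, (cx, cy)); returns stabilized texts."""
        stable = []
        for text, (cx, cy) in detections:
            track = min(
                self.tracks,
                key=lambda t: (t["centroid"][0] - cx) ** 2
                            + (t["centroid"][1] - cy) ** 2,
                default=None,
            )
            if track is None or ((track["centroid"][0] - cx) ** 2 +
                                 (track["centroid"][1] - cy) ** 2) > self.max_dist ** 2:
                track = {"centroid": (cx, cy), "texts": []}
                self.tracks.append(track)
            track["centroid"] = (cx, cy)  # anchor follows camera drift
            track["texts"] = (track["texts"] + [text])[-self.history:]
            # Majority vote over the track history suppresses flicker.
            stable.append(Counter(track["texts"]).most_common(1)[0][0])
        return stable
```

Because the anchor's centroid is updated every frame, slow camera drift keeps detections on the same track, while the vote absorbs single-frame misreads.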
OCR Benchmarks Directory
26 benchmarks across document parsing, handwriting, video OCR, scene text, and multilingual tasks. Full leaderboards for 8 with data.
Rys OCR
POLISH SOTA: 71% CER reduction on Polish diacritics. LoRA fine-tune of PaddleOCR-VL. Open source, Apache 2.0.
Open Source OCR Benchmark
Run on your own servers. No API costs. Full data privacy.
| Model | OmniDocBench | OCRBench (EN) | olmOCR | License |
|---|---|---|---|---|
| PaddleOCR-VL Baidu | 92.86 | - | 80.0 | Apache 2.0 |
| PaddleOCR-VL 0.9B Baidu | 92.56 | - | - | Apache 2.0 |
| MinerU 2.5 OpenDataLab | 90.67 | - | 75.2 | AGPL-3.0 |
| Qwen3-VL-235B Alibaba | 89.15 | - | - | Qwen License |
| MonkeyOCR-pro-3B Unknown | 88.85 | - | - | Apache 2.0 / MIT |
| OCRVerse 4B Unknown | 88.56 | - | - | Apache 2.0 / MIT |
| dots.ocr 3B RedNote HILab | 88.41 | - | 79.1 | Apache 2.0 |
| Qwen2.5-VL Alibaba | 87.02 | - | - | Apache 2.0 |
| Chandra v0.1.0 datalab-to | - | - | 83.1 | Apache 2.0 |
| Infinity-Parser 7B Unknown | - | - | 82.5 | Apache 2.0 / MIT |
| olmOCR v0.4.0 Allen AI | - | - | 82.4 | Apache 2.0 |
| Marker 1.10.0 VikParuchuri | - | - | 76.5 | Apache 2.0 / MIT |
| Marker 1.10.1 VikParuchuri | - | - | 76.1 | Apache 2.0 / MIT |
| DeepSeek OCR DeepSeek | - | - | 75.4 | Apache 2.0 / MIT |
| GPT-4o (Anchored) OpenAI | - | - | 69.9 | Proprietary |
| Nanonets OCR2 3B Nanonets | - | - | 69.5 | Apache 2.0 / MIT |
| Gemini Flash 2 Google | - | - | 63.8 | Proprietary |
| Qwen3-Omni-30B Alibaba | - | 61.3% | - | Qwen License |
| Nemotron Nano V2 VL NVIDIA | - | 61.2% | - | NVIDIA Open Model License |
| GPT-4o Mini OpenAI | - | 44.1% | - | Proprietary |
| Qwen2.5-VL 72B Alibaba | - | - | - | Qwen License |
| CHURRO (3B) Stanford | - | - | - | Apache 2.0 / MIT |
| InternVL2-76B Shanghai AI Lab | - | - | - | MIT |
| InternVL3-78B Shanghai AI Lab | - | - | - | MIT |
| Tesseract Google (Open Source) | - | - | - | Apache 2.0 |
| EasyOCR JaidedAI | - | - | - | Apache 2.0 |
| olmOCR v0.3.0 Allen AI | - | - | - | Apache 2.0 |
| Qwen2-VL 72B Alibaba | - | - | - | Qwen License |
| Qwen2.5-VL 32B Alibaba | - | - | - | Apache 2.0 |
| AIN 7B Research | - | - | - | Apache 2.0 / MIT |
| PaddleOCR Baidu | - | - | - | Apache 2.0 |
| InternVL3 14B OpenGVLab | - | - | - | MIT |
- Sensitive data that can't leave your network
- High volume processing (no per-page costs)
- Offline/air-gapped environments
- Full control over the pipeline
Vendor API Benchmark
Pay per page. Fast to integrate. Enterprise support available.
| Vendor | OmniDocBench | OCRBench (EN) | olmOCR | Price/1k pages |
|---|---|---|---|---|
| Gemini 2.5 Pro Google | 88.03 | 59.3% | - | varies |
| Mistral OCR 3 Mistral | 79.75 | - | 78.0 | varies |
| Mistral OCR 2 Mistral | - | - | 72.0 | varies |
| Seed1.6-vision ByteDance | - | 62.2% | - | varies |
| GPT-4o OpenAI | - | 55.5% | - | varies |
| Claude Sonnet 4 Anthropic | - | 42.4% | - | varies |
| clearOCR TeamQuest | 31.70 | - | - | varies |
| Gemini 2.0 Flash Google | - | - | - | varies |
| Gemini 1.5 Pro Google | - | - | - | varies |
| Claude 3.5 Sonnet Anthropic | - | - | - | varies |
- Need reasoning/context understanding (GPT-4o, Gemini)
- Low volume, occasional use
- Need enterprise SLA/support
- No infrastructure to maintain
OmniDocBench: End-to-end document parsing composite score. OCRBench v2: Overall score across 8 OCR capabilities.
Data from AlphaXiv + Papers With Code.
Models
PaddleOCR-VL
OSSBaidu
#1 on OmniDocBench
PaddleOCR-VL 0.9B
OSSBaidu
Lightweight version
MinerU 2.5
OSSOpenDataLab
#1 on layout detection (97.5 mAP)
Qwen3-VL-235B
OSSAlibaba
Large model, requires significant compute
MonkeyOCR-pro-3B
OSSUnknown
Compact model with good performance
Gemini 2.5 Pro
API#1 on OCRBench v2 Chinese, MME-VideoOCR
Gemini 2.0 Flash
API#1 on KITAB-Bench (Arabic)
Gemini 1.5 Pro
API#1 on CC-OCR Multi-Scene
Qwen2.5-VL
OSSAlibaba
Qwen2.5-VL 72B
OSSAlibaba
GPT-4o
APIOpenAI
Best OCR edit distance on OmniDocBench (0.02)
Seed1.6-vision
APIByteDance
#1 on OCRBench v2 English
Qwen3-Omni-30B
OSSAlibaba
Nemotron Nano V2 VL
OSSNVIDIA
Chandra v0.1.0
OSSdatalab-to
#1 on olmOCR-Bench (83.1). Best in the old-scans math, long tiny text, and base accuracy categories.
OCRVerse 4B
OSSUnknown
Strong OmniDocBench performer (88.56)
Infinity-Parser 7B
OSSUnknown
olmOCR v0.4.0
OSSAllen AI
CHURRO (3B)
OSSStanford
#1 on CHURRO-DS (82.3 printed, 70.1 handwritten)
Claude Sonnet 4
APIAnthropic
#1 on ThaiOCRBench
Claude 3.5 Sonnet
APIAnthropic
Lowest hallucination rate on CC-OCR (0.09%)
InternVL2-76B
OSSShanghai AI Lab
InternVL3-78B
OSSShanghai AI Lab
Tesseract
OSSGoogle (Open Source)
Classic open-source OCR engine
EasyOCR
OSSJaidedAI
80+ languages supported
DeepSeek OCR
OSSDeepSeek
DeepSeek's OCR model for document understanding.
Marker 1.10.0
OSSVikParuchuri
Open-source PDF to Markdown converter.
Marker 1.10.1
OSSVikParuchuri
Latest version of Marker PDF parser.
GPT-4o (Anchored)
APIOpenAI
GPT-4o with anchored prompting for OCR.
Gemini Flash 2
APIGoogle
Google's fast multimodal model.
Gemini 2.5 Flash
APIGoogle
Google's Gemini 2.5 Flash model.
olmOCR v0.3.0
OSSAllen AI
Earlier version of olmOCR.
Mistral OCR 3
APIMistral
Latest Mistral OCR (Dec 2025). 74% win rate vs OCR 2. Claims 94.9% accuracy. Markdown + HTML table output. $1/1000 pages with batch API.
clearOCR
APITeamQuest
Polish OCR service. Text extraction only - no table/formula recognition. Best for simple documents. VERIFIED by CodeSOTA: 84.6% text accuracy, but 0.8% table TEDS due to lack of structure recognition.
dots.ocr 3B
OSSRedNote HILab
Unified document parsing model. Single 1.7B LLM foundation with prompt-based switching. SOTA on multilingual OCR across 100+ languages including Tibetan, Kannada, Russian.
Mistral OCR 2
APIMistral
Previous version of Mistral OCR API.
Nanonets OCR2 3B
OSSNanonets
Nanonets' OCR model.
Qwen2-VL 72B
OSSAlibaba
Qwen2's large vision-language model.
Qwen2.5-VL 32B
OSSAlibaba
Qwen2.5 32B vision-language model.
AIN 7B
OSSResearch
7B parameter OCR model.
GPT-4o Mini
APIOpenAI
Smaller, faster version of GPT-4o.
Azure OCR
APIMicrosoft
Microsoft Azure's OCR service.
PaddleOCR
OSSBaidu
Open-source OCR from PaddlePaddle.
InternVL3 14B
OSSOpenGVLab
InternVL3 14B vision-language model.
Have benchmark results?
Submit your paper or benchmark results. We verify and add them to our database.
Submit Paper
Get OCR updates
New models, benchmark results, and practical guides.
No spam. Unsubscribe anytime.
All OCR Content
Model Reviews
Comparisons
About This Data
All benchmark results are sourced from AlphaXiv benchmark leaderboards. Each data point includes the source URL and access date for verification.
Results marked as "pending verification" are claimed in papers but have not been independently confirmed. We do not include estimated or interpolated values.