Processing thousands of pages with equations or multilingual text? Need fast, cheap OCR via API without managing infrastructure? Mistral OCR outputs Markdown at $0.001/page.
Tested on December 17, 2025: a 9-page PDF was processed in 9.04 seconds and produced 34,656 characters of Markdown output.
| Category | Mistral OCR | GPT-5.4 | Google Doc AI | Azure OCR |
|---|---|---|---|---|
| Overall | 94.9% | ~85% | 83.4% | 89.5% |
| Scanned Docs | 98.96% | ~95% | 96.15% | ~94% |
| Math/Equations | 94.29% | ~88% | ~75% | ~70% |
| Multilingual | 89.55% | 86.0% | ~82% | 87.52% |
Source: Mistral's internal benchmarks. Independent verification pending.
We recommend testing on your specific document types before production deployment.
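One way to run that kind of test is to transcribe a few representative pages by hand and score the OCR output against them. The sketch below uses the standard library's `difflib` to compute a character-level similarity. This is an assumption about a reasonable metric, not the method behind the benchmark figures above, and the helper names `normalize` and `char_similarity` are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse runs of whitespace so layout differences are not counted as errors."""
    return " ".join(text.split())


def char_similarity(reference: str, ocr_output: str) -> float:
    """Return a 0..1 similarity between a hand-checked reference and OCR output."""
    ref, out = normalize(reference), normalize(ocr_output)
    if not ref and not out:
        return 1.0
    # autojunk=False turns off difflib's heuristic for long sequences,
    # which can otherwise distort scores on long documents.
    return SequenceMatcher(None, ref, out, autojunk=False).ratio()


if __name__ == "__main__":
    reference = "Total due: 1,250.00 EUR"
    ocr_output = "Total due:  1,250.00 EUR"  # differs only in whitespace
    print(f"similarity: {char_similarity(reference, ocr_output):.3f}")
```

`SequenceMatcher` can be slow on very long inputs, so scoring page by page and averaging is likely more practical than comparing whole documents at once.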
Install the SDK with `pip install mistralai`, then process a PDF from a URL:

    import os

    from mistralai import Mistral

    client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

    # Process a PDF from URL
    ocr_response = client.ocr.process(
        model="mistral-ocr-latest",
        document={
            "type": "document_url",
            "document_url": "https://arxiv.org/pdf/2201.04234",
        },
    )

    # Get markdown output, one page at a time
    for page in ocr_response.pages:
        print(page.markdown)

To process a local file, base64-encode it and pass it as a data URI:

    import base64
    import os

    from mistralai import Mistral


    def encode_file(file_path):
        """Read a file and return its contents as a base64 string."""
        with open(file_path, "rb") as f:
            return base64.b64encode(f.read()).decode("utf-8")


    client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

    # Process local PDF
    base64_pdf = encode_file("invoice.pdf")
    ocr_response = client.ocr.process(
        model="mistral-ocr-latest",
        document={
            "type": "document_url",
            "document_url": f"data:application/pdf;base64,{base64_pdf}",
        },
    )

| Service | Cost per 1,000 Pages | Type |
|---|---|---|
| Mistral OCR | $1.00 | API |
| Mistral OCR (batch) | $0.50 | API |
| GPT-5.4 Vision | ~$5-15 | API |
| Google Document AI | $1.50 | API |
| Docling | $0 (self-hosted) | Open Source |
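To turn these rates into a budget figure, a small estimator is enough. The sketch below copies the fixed prices from the table. The dictionary keys and the `estimate_cost` helper are hypothetical names for illustration, and GPT-5.4 Vision is left out because the table only gives an approximate range for it.

```python
# Prices in USD per 1,000 pages, copied from the table above.
# The keys are illustrative labels, not official product identifiers.
PRICE_PER_1000_PAGES = {
    "mistral-ocr": 1.00,
    "mistral-ocr-batch": 0.50,
    "google-document-ai": 1.50,
    "docling": 0.00,  # self-hosted: no API fee, local compute not included
}


def estimate_cost(pages: int, service: str) -> float:
    """Estimate the API cost in USD for OCR-ing `pages` pages with `service`."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / 1000 * PRICE_PER_1000_PAGES[service]


if __name__ == "__main__":
    for service in PRICE_PER_1000_PAGES:
        print(f"{service}: ${estimate_cost(250_000, service):,.2f} for 250,000 pages")
```

The Docling figure covers API fees only, so a fair comparison would also need an estimate of the hardware and operations cost of running it locally.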
Head-to-head on the same document: the Docling paper (arXiv:2408.09869), tested December 17, 2025.
| Metric | Mistral OCR | Docling |
|---|---|---|
| Processing Time | 9.04 seconds | 34.95 seconds |
| Output Size | 34,656 chars | 33,201 chars |
| Pages Processed | 9 pages | 10 pages |
| Cost (this test) | ~$0.009 | $0.00 |
| Data Privacy | Sent to Mistral | Fully local |
| Table Export | Markdown only | DataFrame/CSV |
| License | Proprietary API | MIT (open source) |
Mistral was about 3.9x faster in wall-clock time (9.04 s vs. 34.95 s) but charges per page. Docling is free but requires local compute.
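The two runs report different page counts (9 and 10 in the table above), so a per-page figure may be a fairer comparison than raw wall-clock time. The sketch below normalizes the timings from the table. The `seconds_per_page` helper is a hypothetical name, and a single run on one document is not a stable benchmark.

```python
def seconds_per_page(total_seconds: float, pages: int) -> float:
    """Average processing time per page for one run."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return total_seconds / pages


# Timings and page counts taken from the comparison table above.
mistral = seconds_per_page(9.04, 9)
docling = seconds_per_page(34.95, 10)

wall_clock_ratio = 34.95 / 9.04     # speedup on total elapsed time
per_page_ratio = docling / mistral  # speedup after normalizing by page count

if __name__ == "__main__":
    print(f"wall-clock speedup: {wall_clock_ratio:.2f}x")
    print(f"per-page speedup:   {per_page_ratio:.2f}x")
```

Normalized this way, the gap narrows somewhat, which is worth keeping in mind when extrapolating to large batches.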