hotpotqa

OCR benchmark

Total Results: 2
Models Tested: 2
Metrics: 1
Last Updated: 2025-12-19

f1

Higher is better

Rank  Model             Score  Source
1     gpt-4o            71.3   arxiv-paper
2     claude-35-sonnet  68.5   arxiv-paper

Multi-hop question answering requiring reasoning over Wikipedia.
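
This page does not define how f1 is computed. For extractive question answering, it is commonly a token-overlap F1 between the predicted and gold answer strings, averaged over questions and reported on a 0-100 scale. The following is a minimal sketch under that assumption; the function names `normalize` and `token_f1` and the normalization steps (lowercasing, dropping punctuation and English articles) are illustrative, not taken from this page or from the cited papers.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumed normalization; the exact rules behind the scores above are not stated.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall for one answer pair."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_f1_percent(pairs: list[tuple[str, str]]) -> float:
    """Average per-question F1 over (prediction, gold) pairs, scaled to 0-100."""
    return 100.0 * sum(token_f1(p, g) for p, g in pairs) / len(pairs)
```

Under this definition, a prediction that contains the gold answer plus extra tokens receives partial credit and not zero, which is why F1 scores usually sit above exact-match scores on the same predictions.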