Help Prioritize Research

234 benchmarks need research to determine if they're still relevant. Vote on which ones should be updated first.

How this works

1. Vote for benchmarks: click the upvote button on the benchmarks you want us to research.

2. We research the top-voted benchmarks, using Exa and manual review to find the latest SOTA results.

3. Benchmarks get updated: fresh data is added, or the benchmark is marked as saturated or legacy.
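
As a rough illustration of the steps above, the research queue could be built by sorting benchmarks by vote count. This is only a sketch under stated assumptions: the `Benchmark` record, the `research_queue` function, the tie-breaking rule (older or undated results first), and the vote counts in the usage note are all hypothetical and are not described on this page.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Benchmark:
    """Hypothetical record for one listed benchmark."""
    name: str
    votes: int               # upvotes received (assumed field)
    results: int             # number of results currently listed
    latest: Optional[date]   # None when the page shows "No paper dates"


def research_queue(benchmarks: List[Benchmark], top_n: int = 10) -> List[Benchmark]:
    """Return the top_n benchmarks to research next.

    Assumed ordering: most votes first; among ties, benchmarks with no
    paper dates or with older latest results come first.
    """
    def key(b: Benchmark):
        return (-b.votes, b.latest or date.min)

    return sorted(benchmarks, key=key)[:top_n]
```

For example, with made-up vote counts of 5 for OmniDocBench v1.5 and 3 each for coco-text and Open Graph Benchmark, `research_queue(..., top_n=2)` would return OmniDocBench v1.5 followed by Open Graph Benchmark, because the latter has no paper dates and so wins the tie.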

Top Voted

OmniDocBench v1.5 (Computer Vision/Document Parsing): 61 results, latest Mar 2026
57 results, latest Oct 2023
inverse-text (Computer Vision/Optical Character Recognition): 34 results, latest May 2023
X-Ray Weld Defect Detection Dataset (Industrial Inspection/Anomaly Detection): 1 result, no paper dates
Mathematics Aptitude Test of Heuristics (Reasoning/Mathematical Reasoning): 46 results, latest May 2026
coco-text (Computer Vision/Scene Text Detection): 33 results, latest May 2023
videodb's-ocr-benchmark-public-collection (Computer Vision/Optical Character Recognition): 15 results, latest Feb 2025
SuperGLUE (Natural Language Processing/Text classification): 8 results, latest Jul 2024
Cityscapes Dataset (Computer Vision/Semantic Segmentation): 3 results, latest Aug 2025
Open Graph Benchmark (Graphs/Node Classification): 0 results, no paper dates

Computer Vision (161)

OmniDocBench v1.5 (Document Parsing): 61 results, latest Mar 2026
inverse-text (Optical Character Recognition): 34 results, latest May 2023
coco-text (Scene Text Detection): 33 results, latest May 2023
videodb's-ocr-benchmark-public-collection (Optical Character Recognition): 15 results, latest Feb 2025
Cityscapes Dataset (Semantic Segmentation): 3 results, latest Aug 2025
Total-Text (Scene Text Detection): 126 results, latest Dec 2024
msra-td500 (Scene Text Detection): 79 results, latest Oct 2024
icdar-2017-mlt (Scene Text Detection): 54 results, latest May 2022
33 results, no paper dates
dart (Optical Character Recognition): 32 results, latest Oct 2023
icdar2015 (Optical Character Recognition): 30 results, latest Nov 2025
tabfact (Optical Character Recognition): 30 results, latest Dec 2024
Comprehensive Challenge OCR (General OCR Capabilities): 28 results, no paper dates
iiit5k (Scene Text Recognition): 21 results, latest Aug 2023
sun-rgb-d (Optical Character Recognition): 20 results, latest Jun 2021
cute80 (Scene Text Recognition): 20 results, latest Aug 2023
svtp (Scene Text Recognition): 19 results, latest Aug 2023
Curved Text in the Wild 1500 (Scene Text Detection): 18 results, latest Feb 2022
tobacco-3482 (Document Image Classification): 18 results, latest Jan 2023
codesearchnet---javascript (Optical Character Recognition): 14 results, latest Dec 2024

+141 more...

Multimodal (14)

MMMU (Image-Text-to-Text): 36 results, latest May 2026
Video-MME (Video Understanding): 24 results, latest Apr 2026
MMStar (Image-Text-to-Text): 21 results, latest May 2026
MVBench (Video Understanding): 20 results, latest Apr 2026
GenEval (Text-to-Image Generation): 8 results, latest May 2026
AudioBench (Audio-Text-to-Text): 0 results, no paper dates
MagicBrush (Image-Text-to-Image): 0 results, no paper dates
DPG-Bench (Text-to-Image Generation): 0 results, no paper dates
MMBench (Image-Text-to-Text): 0 results, no paper dates
InstructPix2Pix (Image-Text-to-Image): 0 results, no paper dates
VideoBench (Image-Text-to-Video): 0 results, no paper dates
MJHQ-30K FID (Text-to-Image Generation): 0 results, no paper dates
DEMON Bench (Any-to-Any): 0 results, no paper dates
ViDoRe (Cross-Modal Retrieval): 0 results, no paper dates

Natural Language Processing (12)

SuperGLUE (Text classification): 8 results, latest Jul 2024
MTEB Leaderboard (Feature Extraction): 44 results, latest May 2026
BEIR (Text Ranking): 5 results, latest Dec 2024
MS MARCO (Text Ranking): 4 results, latest Oct 2023
WMT'23 (Machine Translation): 4 results, no paper dates
STS Benchmark (Semantic Textual Similarity): 3 results, latest Jan 2024
WikiTableQuestions (Table Question Answering): 3 results, latest Apr 2020
XNLI (Zero-Shot Classification): 3 results, latest Jan 2023
GLUE (Fill-Mask): 3 results, latest Jan 2023
FLORES-200 (Machine Translation): 0 results, no paper dates
SQA (Table Question Answering): 0 results, no paper dates
WikiText Perplexity (Language Modeling): 0 results, no paper dates

Reasoning (9)

46 results, latest May 2026
HellaSwag (Commonsense Reasoning): 17 results, latest May 2026
WinoGrande (Commonsense Reasoning): 13 results, latest May 2026
11 results, latest Feb 2026
AI2 Reasoning Challenge (Commonsense Reasoning): 10 results, no paper dates
CommonsenseQA (Commonsense Reasoning): 5 results, latest Apr 2025
3 results, no paper dates
2 results, no paper dates
StrategyQA (Multi-step Reasoning): 2 results, no paper dates

Medical (8)

7 results, no paper dates
4 results, no paper dates
3 results, no paper dates
RSNA Pneumonia Detection Challenge (Disease Classification): 3 results, latest Jan 2024
2 results, no paper dates
2 results, no paper dates
COVID-19 Image Data Collection (Disease Classification): 2 results, no paper dates
1 result, no paper dates

Audio (8)

AudioCaps (Audio Captioning): 7 results, latest Jul 2025
The LJ Speech Dataset (Text-to-speech): 5 results, latest Jun 2024
4 results, latest Dec 2022
MusicCaps (Music Generation): 3 results, no paper dates
DIHARD (Voice Activity Detection): 0 results, no paper dates
0 results, no paper dates
DNS Challenge (Audio-to-Audio): 0 results, no paper dates
AVA-Speech (Voice Activity Detection): 0 results, no paper dates

Computer Code (7)

61 results, latest Apr 2026
22 results, latest May 2025
12 results, latest Apr 2026
MBPP+ Extended Version (Code Generation): 9 results, latest Apr 2026
0 results, no paper dates
0 results, no paper dates

Industrial Inspection (6)

1 result, no paper dates
6 results, latest Mar 2025
6 results, latest Aug 2024
Visual Anomaly Dataset (Anomaly Detection): 3 results, no paper dates
1 result, no paper dates
NEU Surface Defect Database (Anomaly Detection): 1 result, no paper dates

Graphs (2)

Open Graph Benchmark (Node Classification): 0 results, no paper dates
OGB (Open Graph Benchmark) (Graph Classification): 0 results, no paper dates

Time Series (2)

OpenML-CC18 (Tabular Classification): 5 results, latest Jun 2025
California Housing (Tabular Regression): 2 results, no paper dates

Other (2)

SIMPLER (Robotics): 0 results, no paper dates
RLBench (Robotics): 0 results, no paper dates

Speech (1)

Mozilla Common Voice (Speech Recognition): 4 results, latest Dec 2022

Agentic AI (1)

0 results, no paper dates

Methodology (1)

ImageNet Linear Probe (Self-Supervised Learning): 0 results, no paper dates