Recent studyBlind TTS Elo is live. Compare two anonymous voice samples, vote after listening, and help separate real preference signal from noise.Vote in the study ->
Codesota · Tasks · Table Question AnsweringHome/Tasks/Natural Language Processing/Table Question Answering
Natural Language Processing· table-question-answering

Table Question Answering.

Table question answering bridges natural language and structured data — asking "what was Q3 revenue?" over a spreadsheet and getting the right cell or computed answer. Google's TAPAS (2020) pioneered joint table-text pre-training, and TAPEX trained on synthetic SQL execution traces to teach models tabular reasoning. The field shifted dramatically when GPT-4 and Claude demonstrated they could reason over tables in-context without any table-specific fine-tuning, often matching or beating specialized models on WikiTableQuestions and SQA. The hard frontier is multi-step numerical reasoning over large tables with hundreds of rows — exactly the kind of task where tool-augmented LLMs that generate and execute code are pulling ahead of pure neural approaches.

2
Datasets
3
Results
accuracy
Canonical metric
§ 02 · Canonical benchmark

The reference dataset.

WikiTableQuestions

Question answering over Wikipedia tables requiring compositional reasoning

Primary metric: accuracy
View full leaderboard →
§ 03 · Top 10

Leading models.

Leading models on WikiTableQuestions.

#ModelaccuracyYearSource
GPT-475.32024paper ↗
2Claude 3.5 Sonnet73.02025paper ↗
3TAPAS-large48.72020paper ↗

What were you looking for on Table Question Answering?

Didn't find the model, metric, or dataset you needed? Tell us in one line. We read every message and reply within 48 hours.

§ 04 · All datasets

Tracked datasets.

2 datasets tracked for this task.

WikiTableQuestions
CANONICAL
3 results · accuracy
Top: GPT-4 75.3
SQA
0 results · accuracy
§ 05 · Related tasks

Other tasks in Natural Language Processing.

Feature ExtractionFill-MaskNamed Entity RecognitionNatural Language InferencePolish Conversation QualityPolish Cultural CompetencyPolish Emotional IntelligencePolish LLM General
Reply within 48 hours · No newsletter

Didn't find what you came for?

Still looking for something on Table Question Answering? A missing model, a stale score, a benchmark we should cover — drop it here and we'll handle it.

Real humans read every message. We track what people are asking for and prioritize accordingly.