
Table Question Answering

Table question answering bridges natural language and structured data — asking "what was Q3 revenue?" over a spreadsheet and getting the right cell or computed answer. Google's TAPAS (2020) pioneered joint table-text pre-training, and TAPEX trained on synthetic SQL execution traces to teach models tabular reasoning. The field shifted dramatically when GPT-4 and Claude demonstrated they could reason over tables in-context without any table-specific fine-tuning, often matching or beating specialized models on WikiTableQuestions and SQA. The hard frontier is multi-step numerical reasoning over large tables with hundreds of rows — exactly the kind of task where tool-augmented LLMs that generate and execute code are pulling ahead of pure neural approaches.
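The tool-augmented pattern mentioned above can be sketched in a few lines: rather than answering directly, the model emits code that is executed against the table. Everything here is an illustrative stand-in — the table, the question, and the "generated" snippet (which in a real system would come from the LLM, prompted with the question and the table schema).

```python
# Minimal sketch of code-generation table QA, with hypothetical data.
table = [
    {"quarter": "Q1", "revenue": 120.0},
    {"quarter": "Q2", "revenue": 135.5},
    {"quarter": "Q3", "revenue": 150.2},
]

# Stand-in for model output to the question "what was Q3 revenue?".
generated_code = (
    "answer = next(r['revenue'] for r in table if r['quarter'] == 'Q3')"
)

# Execute the generated snippet against the table.
# (A production system would sandbox this, not call exec() directly.)
scope = {"table": table}
exec(generated_code, scope)
print(scope["answer"])  # → 150.2
```

The appeal for large tables is that the arithmetic and filtering happen in the interpreter, not in the model's forward pass, so answer quality no longer degrades with row count.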

Datasets: 2 · Results: 3 · Canonical metric: accuracy

Canonical Benchmark

WikiTableQuestions

Question answering over Wikipedia tables requiring compositional reasoning

Primary metric: accuracy
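The accuracy reported on WikiTableQuestions is denotation accuracy: a prediction counts as correct if its answer set matches the gold answer set after normalization. The official evaluator does more careful number and date parsing; this is a simplified sketch with made-up predictions.

```python
def normalize(value):
    """Crude normalization: compare values as lowercased, stripped strings."""
    return str(value).strip().lower()

def denotation_match(pred, gold):
    """True if predicted and gold answer sets agree after normalization."""
    return {normalize(x) for x in pred} == {normalize(x) for x in gold}

# Hypothetical predictions vs. gold denotations.
preds = [["150.2"], ["Shanghai"]]
golds = [[150.2], ["Beijing"]]

acc = sum(denotation_match(p, g) for p, g in zip(preds, golds)) / len(golds)
print(acc)  # → 0.5
```

Treating answers as sets matters because many questions have multi-cell denotations where order is irrelevant.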

Top results

Leading models on WikiTableQuestions.

Rank  Model              Accuracy (%)  Year  Source
1     GPT-4              75.3          2024  paper
2     Claude 3.5 Sonnet  73.0          2025  paper
3     TAPAS-large        48.7          2020  paper

All datasets

2 datasets tracked for this task.


Run Inference

Looking to run a model? HuggingFace hosts inference for this task type.
