StrategyQA

Unknown

2,780 yes/no questions requiring implicit multi-step reasoning to answer.

Benchmark Stats

Models: 2
Papers: 2
Metrics: 1

accuracy

Higher is better
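As a rough illustration of the metric, accuracy on yes/no questions is the share of predictions that match the gold answers. The sketch below is not the benchmark's official scoring script: the function name `accuracy`, the boolean encoding of yes/no answers, and the percentage scale (assumed from the scores listed on this page) are all assumptions, and the sample labels are made up for the example.

```python
from typing import Sequence


def accuracy(predictions: Sequence[bool], gold: Sequence[bool]) -> float:
    """Return the percentage of yes/no predictions that match the gold labels.

    Hypothetical helper: True stands for "yes" and False for "no".
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted answer equals the gold answer.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Made-up labels for four questions; three of the four predictions match.
gold = [True, False, True, True]
predictions = [True, False, False, True]
print(f"{accuracy(predictions, gold):.1f}")  # prints 75.0
```

Because every question has exactly two possible answers, a model that always gives the same answer already scores the share of that answer in the gold labels, so scores are best read against that baseline.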

Rank  Model             Code  Score  Paper / Source
1     gpt-4o            -     82.1   arXiv Paper
      Strategy questions requiring implicit multi-step reasoning.
2     claude-35-sonnet  -     79.8   arXiv Paper