StrategyQA

Unknown

2,780 yes/no questions requiring implicit multi-step reasoning to answer.

Benchmark Stats

Models: 2
Papers: 2
Metrics: 1

SOTA History

Not enough data to show trend.

Only 2 models on this benchmark


Metric: accuracy (higher is better)
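As a rough illustration of the metric, the sketch below computes accuracy for yes/no answers as the percentage of predictions that exactly match the reference answer. This page does not describe the scoring procedure, so the exact-match rule, the percentage scale, the function name `accuracy`, and the toy data are all assumptions made for illustration.

```python
def accuracy(predictions, references):
    """Return the percentage of predictions that equal their reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count exact matches between predicted and reference yes/no labels.
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Toy labels invented for this example (True = "yes", False = "no");
# they are not taken from the benchmark.
gold = [True, False, True, True]
pred = [True, False, False, True]
print(f"{accuracy(pred, gold):.1f}")  # prints 75.0
```

Because every question has only two possible answers, always guessing the more common answer already gives a nontrivial accuracy, so scores are best compared against that baseline.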

Rank  Model             Source     Score  Year  Paper
1     gpt-4o            Editorial  82.1   2025  Source
2     claude-35-sonnet  Editorial  79.8   2025  Source

Task description: Strategy questions requiring implicit multi-step reasoning.
