WinoGrande

Name: WinoGrande Benchmark Results
Creator: Unknown
License: https://creativecommons.org/licenses/by/4.0/


44K Winograd-style problems requiring commonsense reasoning to resolve pronoun references.
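As a rough illustration of what such a problem looks like and how a leaderboard score might be computed, here is a minimal Python sketch. It assumes the fill-in-the-blank layout (a sentence with one "_" blank, two candidate fillers, and a 1-or-2 answer label) and assumes the score is accuracy as a percentage, which this page does not state. The `Item` class, the helper names, and the two example items are hypothetical; the items use the classic trophy/suitcase schema for illustration and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Item:
    """One Winograd-style problem (hypothetical container, assumed layout)."""
    sentence: str  # contains exactly one "_" blank
    option1: str
    option2: str
    answer: int    # 1 or 2, index of the correct option


def fill(item: Item, choice: int) -> str:
    """Return the sentence with the blank replaced by the chosen option."""
    option = item.option1 if choice == 1 else item.option2
    return item.sentence.replace("_", option, 1)


def accuracy(items: Sequence[Item], predictions: Sequence[int]) -> float:
    """Percentage of items whose predicted option matches the answer label."""
    if not items or len(items) != len(predictions):
        raise ValueError("items and predictions must be non-empty and equal in length")
    correct = sum(1 for item, pred in zip(items, predictions) if pred == item.answer)
    return 100.0 * correct / len(items)


# Illustrative items only: a "twin" pair where one changed word flips the answer.
EXAMPLE_ITEMS = [
    Item("The trophy did not fit in the suitcase because the _ was too large.",
         "trophy", "suitcase", 1),
    Item("The trophy did not fit in the suitcase because the _ was too small.",
         "trophy", "suitcase", 2),
]
```

A model that always picks the first option would get one of these two items right, so `accuracy(EXAMPLE_ITEMS, [1, 1])` evaluates to 50.0. The twin-sentence pairing is why surface cues alone tend not to be enough to solve such problems.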

Models: 3

Papers: 3

Metrics: 1



Higher scores are better.

Rank	Model	Source	Score	Year	Paper
1	gpt-4o	Editorial	87.5	2025	Source
2	claude-35-sonnet	Editorial	85.4	2025	Source
3	llama-3-70b	Editorial	85.3	2025	Source