Model card

Qwen2.5-72B-Instruct.

Alibaba · open-source · 72B params · Dense Transformer

Qwen2.5-72B-Instruct is the instruction-tuned version of the Qwen2.5-72B base model, released in September 2024. It is a strong open-source model and was a top open-source model on many reasoning benchmarks at release.

§ 01 · Benchmarks

Every benchmark Qwen2.5-72B-Instruct has a recorded score for.

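| #  | Benchmark | Area · Task                          | Metric   | Value | Rank   | Date | Source   |
|----|-----------|--------------------------------------|----------|-------|--------|------|----------|
| 01 | GSM8K     | Reasoning · Mathematical Reasoning   | accuracy | 95.8% | #18/32 |      | source ↗ |
| 02 | MATH      | Reasoning · Mathematical Reasoning   | accuracy | 83.1% | #24/34 |      | source ↗ |
| 03 | GPQA      | Reasoning · Multi-step Reasoning     | accuracy | 49.0% | #30/33 |      | source ↗ |
| 04 | MMLU      | Reasoning · Commonsense Reasoning    | accuracy | 86.1% | #33/41 |      | source ↗ |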
The Rank column shows this model's position against all other models scored on the same benchmark and metric; the number after the slash counts the competitors. A #1 shown in red marks the current SOTA. Rows are sorted by rank, then by newest result.

§ 02 · Strengths by area

Where Qwen2.5-72B-Instruct actually performs.

Reasoning · 4 benchmarks · avg rank #26.3
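
The average rank above appears to be the plain arithmetic mean of the four rank positions in the benchmark table. The following is a minimal sketch under that assumption; the page does not document how the figure is computed, and the `ranks` mapping and `average_rank` helper are hypothetical names used only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Ranks copied from the benchmark table: (rank, number after the slash).
ranks = {
    "GSM8K": (18, 32),
    "MATH": (24, 34),
    "GPQA": (30, 33),
    "MMLU": (33, 41),
}

def average_rank(table):
    """Arithmetic mean of the rank positions, rounded half-up to one decimal.

    Assumption: the site averages raw rank positions and ignores the
    number after the slash.
    """
    total = sum(rank for rank, _ in table.values())
    mean = Decimal(total) / Decimal(len(table))
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

print(average_rank(ranks))  # prints 26.3
```

The exact mean is 105 / 4 = 26.25. The sketch rounds half-up with `Decimal` because Python's built-in `round(26.25, 1)` rounds the tie to the even digit and returns 26.2, which would not match the displayed value.
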
§ 04 · Related models

Other Alibaba models scored on Codesota.

Qwen2-VL 72B · 4 results
Qwen2.5-Coder 32B · 32B params · 4 results
GOT-OCR2.0 · 3 results
Qwen 3 72B · 72B params · 2 results
Qwen2.5-VL 32B · 2 results
Qwen2.5-VL 72B · 72B params · 2 results
Qwen 3 14B · 14B params · 1 result
Qwen2-VL 7B · 7B params · 1 result
§ 05 · Sources & freshness

Where these numbers come from.

qwen25-tech-report · 4 results
4 of 4 rows marked verified.