Codesota · Models · Qwen3-235B-A22B · Alibaba · 10 results · 4 benchmarks
Model card

Qwen3-235B-A22B.

Alibaba · open-weights · 235B (22B active) params · MoE

Qwen3 flagship MoE model, May 2025

§ 01 · Benchmarks

Every benchmark Qwen3-235B-A22B has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank |
|---|---|---|---|---|---|
| 01 | LiveCodeBench Pro | Computer Code · Code Generation | elo | 1673.00 | #5/9 |
| 02 | LiveCodeBench | Computer Code · Code Generation | pass@1 | 70.7% | #8/30 |
| 03 | GPQA | Reasoning · Multi-step Reasoning | accuracy | 71.1% | #17/33 |
| 04 | PLCC | Natural Language Processing · Polish Cultural Competency | grammar | 66.0% | #61/165 |
| 05 | PLCC | Natural Language Processing · Polish Cultural Competency | geography | 69.0% | #93/165 |
| 06 | PLCC | Natural Language Processing · Polish Cultural Competency | history | 70.0% | #94/165 |
| 07 | PLCC | Natural Language Processing · Polish Cultural Competency | average | 55.0% | #103/165 |
| 08 | PLCC | Natural Language Processing · Polish Cultural Competency | vocabulary | 43.0% | #110/165 |
| 09 | PLCC | Natural Language Processing · Polish Cultural Competency | culture-and-tradition | 45.0% | #118/165 |
| 10 | PLCC | Natural Language Processing · Polish Cultural Competency | art-and-entertainment | 37.0% | #124/165 |

The Rank column shows this model's position among all models scored on the same benchmark and metric (competitors after the slash). A rank of #1 marks the current SOTA. Rows are sorted by rank, then by newest result.
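The rank notation described above (for example `#5/9`) can be split into its two numbers with a small parser. This is a minimal sketch, not Codesota code: the names `Rank`, `parse_rank`, and `is_sota` are hypothetical, and the number after the slash is taken at face value as the count shown on the page.

```python
import re
from typing import NamedTuple


class Rank(NamedTuple):
    """Hypothetical container for a rank cell such as '#5/9'."""
    position: int  # this model's position on the benchmark + metric
    out_of: int    # the number shown after the slash


# A '#', the position, a '/', then the count after the slash.
_RANK_PATTERN = re.compile(r"#(\d+)/(\d+)")


def parse_rank(cell: str) -> Rank:
    """Parse a rank cell like '#5/9' into its two integers."""
    match = _RANK_PATTERN.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"not a rank cell: {cell!r}")
    return Rank(int(match.group(1)), int(match.group(2)))


def is_sota(rank: Rank) -> bool:
    """A position of 1 is what the page labels current SOTA."""
    return rank.position == 1


if __name__ == "__main__":
    for cell in ("#5/9", "#8/30", "#17/33"):
        rank = parse_rank(cell)
        print(cell, rank.position, rank.out_of, is_sota(rank))
```

None of the ten rows on this card would be flagged by `is_sota`, since the best recorded position is #5.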
§ 02 · Strengths by area

Where Qwen3-235B-A22B actually performs.

Computer Code · 2 benchmarks · avg rank #6.5
Reasoning · 1 benchmark · avg rank #17.0
Natural Language Processing · 1 benchmark · avg rank #100.4
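The per-area figures above are consistent with a plain mean of the rank positions from § 01, grouped by area. The sketch below assumes that is how they are computed (the page does not state its method); `ROWS`, `average_rank_by_area`, and `benchmarks_by_area` are illustrative names, and the rank values are copied from the benchmark table.

```python
from collections import defaultdict

# (area, benchmark, rank position) copied from the § 01 table.
ROWS = [
    ("Computer Code", "LiveCodeBench Pro", 5),
    ("Computer Code", "LiveCodeBench", 8),
    ("Reasoning", "GPQA", 17),
    ("Natural Language Processing", "PLCC", 61),
    ("Natural Language Processing", "PLCC", 93),
    ("Natural Language Processing", "PLCC", 94),
    ("Natural Language Processing", "PLCC", 103),
    ("Natural Language Processing", "PLCC", 110),
    ("Natural Language Processing", "PLCC", 118),
    ("Natural Language Processing", "PLCC", 124),
]


def average_rank_by_area(rows):
    """Mean rank position per area, rounded to one decimal (assumed method)."""
    grouped = defaultdict(list)
    for area, _benchmark, position in rows:
        grouped[area].append(position)
    return {area: round(sum(p) / len(p), 1) for area, p in grouped.items()}


def benchmarks_by_area(rows):
    """Count distinct benchmark names per area (PLCC counts once)."""
    grouped = defaultdict(set)
    for area, benchmark, _position in rows:
        grouped[area].add(benchmark)
    return {area: len(names) for area, names in grouped.items()}


if __name__ == "__main__":
    averages = average_rank_by_area(ROWS)
    counts = benchmarks_by_area(ROWS)
    for area in averages:
        print(f"{area}: {counts[area]} benchmark(s), avg rank #{averages[area]}")
```

Under this assumption the Natural Language Processing average is taken over all seven PLCC metric rows (703 / 7 ≈ 100.4), even though they count as a single benchmark.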
§ 04 · Related models

Other Alibaba models scored on Codesota.

Qwen2-VL 72B · 4 results
Qwen2.5-72B-Instruct · 72B params · 4 results
Qwen2.5-Coder 32B · 32B params · 4 results
GOT-OCR2.0 · 3 results
Qwen 3 72B · 72B params · 2 results
Qwen2.5-VL 32B · 2 results
Qwen2.5-VL 72B · 72B params · 2 results
Qwen 3 14B · 14B params · 1 result
§ 05 · Sources & freshness

Where these numbers come from.

sdadas/PLCC · 7 results
livecodebench-pro-official · 1 result
arxiv-2505.09388 · 1 result
qwen-model-card · 1 result

9 of 10 rows marked verified.