Codesota · Models · Gemini 3 Pro · Google · 14 results · 12 benchmarks
Model card

Gemini 3 Pro.

Google · api · Undisclosed params · 2 current SOTA

Google's flagship model.

§ 01 · Benchmarks

Every benchmark Gemini 3 Pro has a recorded score for.

# | Benchmark | Area · Task | Metric | Value | Rank | Date | Source
01 | GPQA | Reasoning · Multi-step Reasoning | accuracy | 91.9% | #1/33 | - | source
02 | HLE | Reasoning · Multi-step Reasoning | accuracy | 38.3% | #1/13 | - | unverified
03 | LiveCodeBench Pro | Computer Code · Code Generation | elo | 2439.00 | #1/9 | - | source
04 | MMLU-Pro | Reasoning · Commonsense Reasoning | accuracy | 89.8% | #2/20 | 2026-04-20 | source
05 | SWE-bench Verified | Agentic AI · Autonomous Coding | pct_resolved | 78.8% | #2/3 | - | source
06 | MMMU-Pro | Multimodal · Visual Question Answering | accuracy | 80.0% | #3/5 | 2026-01-15 | source
07 | Tau2-Bench | Agentic AI · Tool Use | pass_rate | 69.0% | #3/8 | 2025-11-18 | source
08 | MMLU | Reasoning · Commonsense Reasoning | accuracy | 91.4% | #6/41 | 2026-01-01 | source
09 | OmniDocBench | Computer Vision · Document Parsing | composite | 90.3% | #7/33 | - | source
10 | SWE-Bench | Computer Code · Code Generation | resolve-rate-agentic | 77.4% | #8/25 | 2026-01-01 | source
11 | SWE-Bench | Computer Code · Code Generation | resolve-rate | 77.4% | #10/32 | 2026-01-01 | source
12 | SWE-Bench | Computer Code · Code Generation | resolve-rate-agentic | 76.2% | #12/25 | 2025-12-01 | unverified
13 | SWE-Bench Verified | Computer Code · Code Generation | resolve-rate | 76.2% | #12/39 | - | source
14 | SWE-bench Verified | Agentic AI · SWE-bench | resolve-rate | 76.2% | #19/81 | - | source
The Rank column shows this model’s position against all other models scored on the same benchmark and metric, with the number of competitors after the slash. A rank of #1 means current SOTA. Rows are sorted by rank, then by newest result.
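The rank notation and sort order described above can be sketched in a few lines of Python. This is only an illustration: the names `Result`, `parse_rank`, and `sort_key` are hypothetical, and placing undated rows after dated ones within the same rank is an assumption (it matches the order of rows 12 and 13, but the site does not state the rule).

```python
from datetime import date
from typing import NamedTuple, Optional


class Result(NamedTuple):
    benchmark: str
    rank: str              # as displayed, e.g. "#2/20"
    day: Optional[date]    # None when the row shows no date


def parse_rank(rank: str) -> tuple[int, int]:
    """Split '#2/20' into (position, number after the slash)."""
    position, after_slash = rank.lstrip("#").split("/")
    return int(position), int(after_slash)


def sort_key(row: Result) -> tuple[int, int]:
    position, _ = parse_rank(row.rank)
    # Newest first within a rank; undated rows sort last (assumption).
    recency = -row.day.toordinal() if row.day else 0
    return position, recency


rows = [
    Result("SWE-Bench Verified", "#12/39", None),
    Result("MMLU-Pro", "#2/20", date(2026, 4, 20)),
    Result("SWE-Bench", "#12/25", date(2025, 12, 1)),
    Result("GPQA", "#1/33", None),
]
ordered = sorted(rows, key=sort_key)
for row in ordered:
    print(row.benchmark, row.rank, row.day)
```

Because Python's `sorted` is stable, rows that tie on both rank and date keep their input order.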
§ 02 · Strengths by area

Where Gemini 3 Pro actually performs.

Reasoning · 4 benchmarks · avg rank #2.5 · 2 SOTA
Multimodal · 1 benchmark · avg rank #3.0
Computer Vision · 1 benchmark · avg rank #7.0
Agentic AI · 3 benchmarks · avg rank #8.0
Computer Code · 3 benchmarks · avg rank #8.6
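The page does not say how the average ranks are computed. One reading that reproduces every figure above is a plain mean of the rank positions over all result rows in an area (for Computer Code that is five rows, although the card counts three benchmarks). A minimal sketch under that assumption, with the rank positions copied from the § 01 table and `average_rank_by_area` as a hypothetical helper name:

```python
from collections import defaultdict

# (area, rank position) for each of the 14 result rows in § 01.
RANKS = [
    ("Reasoning", 1), ("Reasoning", 1), ("Computer Code", 1),
    ("Reasoning", 2), ("Agentic AI", 2), ("Multimodal", 3),
    ("Agentic AI", 3), ("Reasoning", 6), ("Computer Vision", 7),
    ("Computer Code", 8), ("Computer Code", 10), ("Computer Code", 12),
    ("Computer Code", 12), ("Agentic AI", 19),
]


def average_rank_by_area(ranks):
    """Mean rank position per area, taken over result rows (assumed method)."""
    by_area = defaultdict(list)
    for area, position in ranks:
        by_area[area].append(position)
    return {area: sum(p) / len(p) for area, p in by_area.items()}


averages = average_rank_by_area(RANKS)
for area, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{area}: avg rank #{avg:.1f}")
# Reasoning: avg rank #2.5
# Multimodal: avg rank #3.0
# Computer Vision: avg rank #7.0
# Agentic AI: avg rank #8.0
# Computer Code: avg rank #8.6
```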
§ 03 · Papers

1 paper with results for Gemini 3 Pro.

  1. 2023-10-10 · Computer Code · 1 result

    SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

    Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao et al.
§ 04 · Related models

Other Google models scored on Codesota.

Gemini 2.5 Pro · 16 results · 3 SOTA
Gemini 1.5 Pro · 12 results · 1 SOTA
Gemini 3.1 Pro · 3 results · 1 SOTA
ViT-H/14 · 632M params · 2 results · 1 SOTA
CoCa (finetuned) · 2.1B params · 1 result · 1 SOTA
Gemini 2.0 Flash · 1 result · 1 SOTA
Gemini 3.1 Pro Preview · 1 result · 1 SOTA
Noise2Music · Unknown params · 1 result · 1 SOTA
§ 05 · Sources & freshness

Where these numbers come from.

editorial · 4 results
google-blog · 2 results
livecodebench-pro-official · 1 result
pricepertoken · 1 result
artificialanalysis.ai · 1 result
codesota-shadow-mmlu · 1 result
paddleocr-paper · 1 result
live-swe-agent · 1 result
swebench-leaderboard · 1 result
google-internal · 1 result
6 of 14 rows marked verified · first result 2025-11-18, latest 2026-04-20.
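As a quick consistency check on the figures in this section, the per-source counts can be summed and compared with the 14 results reported for the model. The sketch below also derives the verified share and the span between the first and latest result; the variable names are illustrative, and the counts and dates are copied from this page.

```python
from datetime import date

# Results per source, as listed in § 05.
SOURCE_COUNTS = {
    "editorial": 4,
    "google-blog": 2,
    "livecodebench-pro-official": 1,
    "pricepertoken": 1,
    "artificialanalysis.ai": 1,
    "codesota-shadow-mmlu": 1,
    "paddleocr-paper": 1,
    "live-swe-agent": 1,
    "swebench-leaderboard": 1,
    "google-internal": 1,
}

total_results = sum(SOURCE_COUNTS.values())
verified_share = 6 / total_results          # 6 rows marked verified
span_days = (date(2026, 4, 20) - date(2025, 11, 18)).days

print(f"{total_results} results from {len(SOURCE_COUNTS)} sources")
print(f"verified share: {verified_share:.1%}")
print(f"span between first and latest result: {span_days} days")
```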