Kimi K2.5: Visual Agentic Intelligence

arXiv:2602.02276 | Submitted Feb 2, 2026 | 9 benchmark results

Authors pending

Results

9 results reproduced from this paper.

#	Model	Vendor	Benchmark	Value	SOTA	Date	Source
01	Kimi-K2.5	Moonshot.AI	AIME 2025	96.1%	—	—	source ↗
02	Kimi-K2.5	Moonshot.AI	OmniDocBench	88.8%	—	—	source ↗
03	Kimi-K2.5	Moonshot.AI	GPQA Diamond	87.6%	—	—	source ↗
04	Kimi-K2.5	Moonshot.AI	Video-MME	87.4%	—	—	source ↗
05	Kimi-K2.5	Moonshot.AI	MMMU-Pro	78.5%	—	—	source ↗
06	Kimi-K2.5	Moonshot.AI	SWE-Bench Verified	76.8%	—	—	source ↗
07	Kimi-K2.5	Moonshot.AI	BrowseComp	60.6%	—	—	source ↗
08	Kimi-K2.5	Moonshot.AI	HLE	30.1%	—	—	source ↗