Codesota · Papers · 2026-02-17
Paper

GLM-5: from Vibe Coding to Agentic Engineering

arXiv ↗ · Code ↗
§ 01 · Benchmark results

8 results reproduced from this paper.

Results: 8 · SOTA rows: 1 · Models: 2 · Datasets: 0
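| # | Model | Vendor | Benchmark | Value | SOTA | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | GLM-5 | Zhipu AI | Tau2-Bench | 89.7% | #1 | | source ↗ |
| 02 | GLM-5.1 | | GPQA Diamond | 86.2% | | | source ↗ |
| 03 | GLM-5 | Zhipu AI | GPQA Diamond | 86.0% | | | source ↗ |
| 04 | GLM-5 | Zhipu AI | SWE-Bench Verified | 77.8% | | | source ↗ |
| 05 | GLM-5.1 | | BrowseComp | 68.0% | | | source ↗ |
| 06 | GLM-5 | Zhipu AI | BrowseComp | 62.0% | | | source ↗ |
| 07 | GLM-5.1 | | HLE | 31.0% | | | source ↗ |
| 08 | GLM-5 | Zhipu AI | HLE | 30.5% | | | source ↗ |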
§ 02 · Models

2 models from this paper.

evaluates: GLM-5 (Zhipu AI)
evaluates: GLM-5.1