Codesota · Registry log · 11,851 rows · 2,707 new this month · Showing 200
Editorial · Registry log
Every score we've added, in order.
The append-only public ledger of every benchmark result on Codesota. Each row records when it was written, when the result itself is dated, which model it was, what value was claimed, and where the citation lives. New-SOTA rows are marked in colour; unverified rows are still shown, but labelled as such.
This is the audit trail. If a score is wrong, this is where the error will be visible; if a source is missing, this is where you'll see the gap.
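As a rough illustration of what one ledger row carries, here is a minimal Python sketch. The names `LedgerRow` and `Ledger`, the field names, and the placeholder URL are all assumptions made for this example; the page does not publish Codesota's actual schema or storage.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a written row cannot be mutated afterwards
class LedgerRow:
    written_at: datetime          # when the row was written to the log
    result_date: Optional[date]   # when the result itself is dated, if known
    model: str                    # who the model was
    dataset: str
    value: float                  # the value that was claimed
    source_url: str               # where the citation lives
    verified: bool                # unverified rows are kept, only labelled


class Ledger:
    """Append-only store: rows can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._rows: List[LedgerRow] = []

    def append(self, row: LedgerRow) -> None:
        self._rows.append(row)

    def rows(self) -> Tuple[LedgerRow, ...]:
        # Return an immutable snapshot so callers cannot alter the log.
        return tuple(self._rows)


ledger = Ledger()
ledger.append(LedgerRow(
    written_at=datetime(2026, 5, 26, 18, 43),  # time zone is not stated on the page
    result_date=None,                          # not shown in the listing
    model="GPT-4o",
    dataset="HLE",
    value=2.7,
    source_url="https://example.com/source",   # placeholder, not the real citation
    verified=True,
))
```

Freezing the row and exposing only `append` and a tuple snapshot is one simple way to model "append-only" in memory; a real ledger would more likely enforce this at the database level.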
2026-05-26 · 39 rows
- 18:43 · GPT-4o · HLE · 2.7% · -51.28 · source ↗ · verified
- 18:43 · Nova Lite · HLE · 3.6% · -50.36 · source ↗ · verified
- 18:43 · Claude 3.5 Sonnet · HLE · 4.1% · -49.92 · source ↗ · verified
- 18:43 · Nova Pro · HLE · 4.4% · -49.60 · source ↗ · verified
- 18:43 · Mistral-Medium-3 · HLE · 4.5% · -49.48 · source ↗ · verified
- 18:43 · Gemini 1.5 Pro · HLE · 4.6% · -49.40 · source ↗ · verified
- 18:43 · GPT-4.5 Preview · HLE · 5.4% · -48.56 · source ↗ · verified
- 18:43 · Gemini 2.0 Flash Thinking · HLE · 6.6% · -47.44 · source ↗ · verified
- 18:43 · o1 · HLE · 8.0% · -46.04 · source ↗ · verified
- 18:43 · Claude 3.7 Sonnet · HLE · 8.0% · -45.96 · source ↗ · verified
- 18:43 · o1 Pro · HLE · 8.1% · -45.88 · source ↗ · verified
- 18:43 · Muse Spark · HLE · 40.6% · -13.44 · source ↗ · verified
- 13:40 · GPT-4.1 · HLE · 5.4% · -48.60 · source ↗ · verified
- 13:40 · Llama 4 Maverick · HLE · 5.7% · -48.32 · source ↗ · verified
- 13:40 · Claude Sonnet 4 · HLE · 7.8% · -46.24 · source ↗ · verified
- 13:40 · GLM-4.5-Air · HLE · 8.1% · -45.88 · source ↗ · verified
- 13:40 · GLM-4.5 · HLE · 8.3% · -45.68 · source ↗ · verified
- 13:40 · Gemini 3.1 Flash-Lite · HLE · 8.6% · -45.36 · source ↗ · verified
- 13:40 · Claude Opus 4 · HLE · 10.7% · -43.28 · source ↗ · verified
- 13:40 · Claude Opus 4.1 · HLE · 11.5% · -42.48 · source ↗ · verified
- 13:40 · Gemini 2.5 Flash · HLE · 12.1% · -41.92 · source ↗ · verified
- 13:40 · Claude Sonnet 4.5 · HLE · 13.7% · -40.28 · source ↗ · verified
- 13:40 · o4-mini · HLE · 18.1% · -35.92 · source ↗ · verified
- 13:40 · GPT-5 mini · HLE · 19.4% · -34.56 · source ↗ · verified
- 13:40 · o3 · HLE · 20.3% · -33.68 · source ↗ · verified
- 13:40 · Gemini 2.5 Pro · HLE · 21.6% · -32.36 · source ↗ · verified
- 13:40 · GPT-5.1 · HLE · 23.7% · -30.32 · source ↗ · verified
- 13:40 · Kimi K2.5 · HLE · 24.4% · -29.63 · source ↗ · verified
- 13:40 · Claude Opus 4.5 · HLE · 25.2% · -28.80 · source ↗ · verified
- 13:40 · GPT-5 · HLE · 25.3% · -28.68 · source ↗ · verified
- 13:40 · GPT-5.2 · HLE · 27.8% · -26.20 · source ↗ · verified
- 13:40 · GPT-5 Pro · HLE · 31.6% · -22.36 · source ↗ · verified
- 13:40 · Claude Opus 4.6 · HLE · 34.4% · -19.56 · source ↗ · verified
- 13:40 · Claude Opus 4.7 · HLE · 36.2% · -17.80 · source ↗ · verified
- 13:40 · GPT-5.4 · HLE · 36.2% · -17.76 · source ↗ · verified
- 13:40 · Gemini 3 Pro Preview · HLE · 37.5% · -16.48 · source ↗ · verified
- 13:40 · GPT-5.4 Pro · HLE · 44.3% · -9.68 · source ↗ · verified
- 13:40 · Gemini 3.1 Pro · HLE · 46.4% · -7.56 · source ↗ · verified
- 13:27 · Gemini 3.1 Pro · LiveCodeBench Pro · 2887.00 · NEW SOTA · +448.00 · source ↗ · verified
2026-05-20 · 161 rows
- 16:00 · Gemini 2.5 Flash · HLE · 11.0% · -43.00 · source ↗ · unverified
- 16:00 · Gemini 2.5 Pro · HLE · 21.6% · -32.40 · source ↗ · unverified
- 16:00 · GLM-4.5 · HLE · 14.4% · -39.60 · source ↗ · unverified
- 16:00 · DeepSeek-V4-Flash Max · HLE · 34.8% · -19.20 · source ↗ · unverified
- 16:00 · DeepSeek-V4-Pro Max · HLE · 37.7% · -16.30 · source ↗ · unverified
- 16:00 · GLM-5.1 · HLE · 31.0% · -23.00 · source ↗ · unverified
- 16:00 · Kimi K2.6 · HLE · 54.0% · NEW SOTA · +6.00 · source ↗ · unverified
- 16:00 · MiMo-V2.5-Pro · HLE · 48.0% · NEW SOTA · +9.70 · source ↗ · unverified
- 16:00 · Qwen3.5-397B-A17B · HLE · 28.7% · -9.60 · source ↗ · unverified
- 16:00 · Gemma 4 31B · HLE · 26.5% · -11.80 · source ↗ · unverified
- 16:00 · Kimi-K2.5 · HLE · 30.1% · -8.20 · source ↗ · unverified
- 16:00 · MiniMax-M2.5 · HLE · 19.4% · -18.90 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · GigaSpeech · 13.9% · -28.52 · source ↗ · unverified
- 16:00 · Wav2vec2-base-960h · VoxPopuli · 32.5% · NEW SOTA · +2.39 · source ↗ · unverified
- 16:00 · Whisper-tiny.en · VoxPopuli · 12.0% · -18.09 · source ↗ · unverified
- 16:00 · Niagara-19m-batch.en · VoxPopuli · 9.9% · -20.17 · source ↗ · unverified
- 16:00 · Parakeet-rnnt-0.6b · VoxPopuli · 6.1% · -24.01 · source ↗ · unverified
- 16:00 · Lite-whisper-large-v3-acc · VoxPopuli · 8.1% · -21.98 · source ↗ · unverified
- 16:00 · Qwen3-ASR-0.6B · VoxPopuli · 7.1% · -23.02 · source ↗ · unverified
- 16:00 · Granite 4.0 1B Speech · VoxPopuli · 5.8% · -24.25 · source ↗ · unverified
- 16:00 · Wav2vec2-base-960h · GigaSpeech · 30.9% · -11.57 · source ↗ · unverified
- 16:00 · Wav2vec2-base-960h · SPGISpeech · 27.6% · NEW SOTA · +1.35 · source ↗ · unverified
- 16:00 · Hubert-large-ls960-ft · SPGISpeech · 18.9% · -7.35 · source ↗ · unverified
- 16:00 · Owsm_ctc_v3.1_1B · SPGISpeech · 2.9% · -23.34 · source ↗ · unverified
- 16:00 · Parakeet-rnnt-0.6b · SPGISpeech · 3.3% · -22.89 · source ↗ · unverified
- 16:00 · Distil-large-v3.5 · SPGISpeech · 2.9% · -23.34 · source ↗ · unverified
- 16:00 · Canary-1B-Flash · SPGISpeech · 1.9% · -24.26 · source ↗ · unverified
- 16:00 · Parakeet-tdt-0.6b-v3 · SPGISpeech · 4.0% · -22.23 · source ↗ · unverified
- 16:00 · Parakeet-tdt-0.6b-v2 · SPGISpeech · 2.2% · -24.04 · source ↗ · unverified
- 16:00 · Granite Speech 3.3 8B · VoxPopuli · 5.7% · -24.37 · source ↗ · unverified
- 16:00 · Granite Speech 4.1 2B · VoxPopuli · 5.7% · -24.39 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · SPGISpeech · 6.2% · -20.05 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · TED-LIUM · 6.1% · -26.25 · source ↗ · unverified
- 16:00 · Mms-1b-fl102 · SPGISpeech · 26.2% · NEW SOTA · +0.75 · source ↗ · unverified
- 16:00 · Niagara-38m-batch.en · SPGISpeech · 3.1% · -22.36 · source ↗ · unverified
- 16:00 · Wav2vec2-large-robust-ft-libri-960h · Open ASR Leaderboard · 503.81 · -5895.44 · source ↗ · unverified
- 16:00 · Wav2vec2-large-robust-ft-libri-960h · Open ASR Leaderboard · 22.9% · -6376.32 · source ↗ · unverified
- 16:00 · Moonshine Streaming Medium · Open ASR Leaderboard · 448.15 · -5951.10 · source ↗ · unverified
- 16:00 · Moonshine Streaming Medium · Open ASR Leaderboard · 6.7% · -6392.59 · source ↗ · unverified
- 16:00 · Mms-1b-fl102 · TED-LIUM · 32.4% · NEW SOTA · +11.30 · source ↗ · unverified
- 16:00 · Mms-1b-fl102 · Open ASR Leaderboard · 234.42 · -6164.83 · source ↗ · unverified
- 16:00 · Mms-1b-fl102 · Open ASR Leaderboard · 39.8% · -6359.45 · source ↗ · unverified
- 16:00 · Cohere Transcribe (Mar 2026) · Earnings-22 · 10.9% · -41.01 · source ↗ · unverified
- 16:00 · Wav2vec2-base-960h · TED-LIUM · 21.1% · NEW SOTA · +1.56 · source ↗ · unverified
- 16:00 · Mms-1b-fl102 · AMI-IHM · 86.8% · NEW SOTA · +39.51 · source ↗ · unverified
- 16:00 · Cohere Transcribe (Mar 2026) · AMI-IHM · 8.1% · -39.14 · source ↗ · unverified
- 16:00 · Granite Speech 4.1 2B · AMI-IHM · 8.1% · -39.18 · source ↗ · unverified
- 16:00 · Mms-1b-fl102 · VoxPopuli · 28.0% · -2.12 · source ↗ · unverified
- 16:00 · Canary-Qwen-2.5B · VoxPopuli · 5.7% · -24.43 · source ↗ · unverified
- 16:00 · Granite Speech 4.1 2B · TED-LIUM · 3.1% · -16.42 · source ↗ · unverified
- 16:00 · Data2vec-audio-base-960h · Open ASR Leaderboard · 648.14 · -5751.11 · source ↗ · unverified
- 16:00 · Data2vec-audio-base-960h · Open ASR Leaderboard · 28.3% · -6370.95 · source ↗ · unverified
- 16:00 · Data2vec-audio-base-960h · Earnings-22 · 49.6% · -2.31 · source ↗ · unverified
- 16:00 · Data2vec-audio-base-960h · TED-LIUM · 19.5% · NEW SOTA · +0.64 · source ↗ · unverified
- 16:00 · wav2vec 2.0 Large (960h) · TED-LIUM · 18.9% · NEW SOTA · +1.37 · source ↗ · unverified
- 16:00 · Data2vec-audio-base-960h · VoxPopuli · 27.3% · -2.84 · source ↗ · unverified
- 16:00 · wav2vec 2.0 Large (960h) · VoxPopuli · 30.1% · NEW SOTA · +6.23 · source ↗ · unverified
- 16:00 · Wav2vec2-base-960h · AMI-IHM · 45.6% · -1.71 · source ↗ · unverified
- 16:00 · Data2vec-audio-base-960h · SPGISpeech · 25.5% · NEW SOTA · +2.64 · source ↗ · unverified
- 16:00 · wav2vec 2.0 Large (960h) · GigaSpeech · 27.7% · -14.68 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rel-pos-large-960h-ft · Earnings-22 · 38.3% · -13.54 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rope-large-960h-ft · Earnings-22 · 37.5% · -14.35 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rope-large-960h-ft · TED-LIUM · 15.9% · -1.56 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rel-pos-large-960h-ft · VoxPopuli · 22.4% · -1.47 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rope-large-960h-ft · VoxPopuli · 22.6% · -1.25 · source ↗ · unverified
- 16:00 · wav2vec 2.0 Large (960h) · AMI-IHM · 42.7% · -4.61 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rel-pos-large-960h-ft · AMI-IHM · 42.4% · -4.88 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rel-pos-large-960h-ft · SPGISpeech · 18.9% · -3.97 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rel-pos-large-960h-ft · GigaSpeech · 25.0% · -17.46 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rope-large-960h-ft · GigaSpeech · 25.0% · -17.42 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rope-large-960h-ft · Open ASR Leaderboard · 607.87 · -5791.38 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rope-large-960h-ft · Open ASR Leaderboard · 23.3% · -6375.97 · source ↗ · unverified
- 16:00 · Wav2vec2-large-robust-ft-libri-960h · Earnings-22 · 36.2% · -15.65 · source ↗ · unverified
- 16:00 · Wav2vec2-large-robust-ft-libri-960h · VoxPopuli · 23.3% · -0.59 · source ↗ · unverified
- 16:00 · Wav2vec2-conformer-rope-large-960h-ft · AMI-IHM · 42.5% · -4.80 · source ↗ · unverified
- 16:00 · Wav2vec2-large-robust-ft-libri-960h · AMI-IHM · 37.8% · -9.52 · source ↗ · unverified
- 16:00 · Data2vec-audio-large-960h · GigaSpeech · 24.8% · -17.62 · source ↗ · unverified
- 16:00 · Hubert-large-ls960-ft · Open ASR Leaderboard · 495.86 · -5903.39 · source ↗ · unverified
- 16:00 · Hubert-large-ls960-ft · Open ASR Leaderboard · 22.7% · -6376.56 · source ↗ · unverified
- 16:00 · Hubert-xlarge-ls960-ft · Open ASR Leaderboard · 361.32 · -6037.93 · source ↗ · unverified
- 16:00 · Hubert-xlarge-ls960-ft · Open ASR Leaderboard · 22.5% · -6376.70 · source ↗ · unverified
- 16:00 · Hubert-large-ls960-ft · VoxPopuli · 22.7% · -1.16 · source ↗ · unverified
- 16:00 · Mms-1b-all · SPGISpeech · 16.9% · -5.87 · source ↗ · unverified
- 16:00 · Mms-1b-all · VoxPopuli · 17.6% · -6.23 · source ↗ · unverified
- 16:00 · Hubert-large-ls960-ft · GigaSpeech · 25.0% · -17.41 · source ↗ · unverified
- 16:00 · Hubert-xlarge-ls960-ft · GigaSpeech · 24.7% · -17.68 · source ↗ · unverified
- 16:00 · Wav2vec2-large-960h-lv60-self · Open ASR Leaderboard · 509.32 · -5889.93 · source ↗ · unverified
- 16:00 · Wav2vec2-large-960h-lv60-self · Open ASR Leaderboard · 21.3% · -6377.98 · source ↗ · unverified
- 16:00 · Cohere Transcribe (Mar 2026) · Open ASR Leaderboard · 524.88 · -5874.37 · source ↗ · unverified
- 16:00 · Cohere Transcribe (Mar 2026) · Open ASR Leaderboard · 5.4% · -6393.83 · source ↗ · unverified
- 16:00 · Mms-1b-all · Earnings-22 · 31.2% · -20.70 · source ↗ · unverified
- 16:00 · Wav2vec2-large-960h-lv60-self · Earnings-22 · 31.7% · -20.19 · source ↗ · unverified
- 16:00 · Mms-1b-all · GigaSpeech · 26.4% · -15.98 · source ↗ · unverified
- 16:00 · Wav2vec2-large-960h-lv60-self · TED-LIUM · 14.9% · -2.60 · source ↗ · unverified
- 16:00 · Asr-wav2vec2-librispeech · VoxPopuli · 13.7% · -10.14 · source ↗ · unverified
- 16:00 · Mms-1b-all · AMI-IHM · 42.0% · -5.25 · source ↗ · unverified
- 16:00 · Wav2vec2-large-960h-lv60-self · AMI-IHM · 36.8% · -10.50 · source ↗ · unverified
- 16:00 · Asr-wav2vec2-librispeech · TED-LIUM · 7.6% · -9.90 · source ↗ · unverified
- 16:00 · Wav2vec2-large-960h-lv60-self · SPGISpeech · 17.9% · -4.88 · source ↗ · unverified
- 16:00 · Wav2vec2-large-960h-lv60-self · GigaSpeech · 23.9% · -18.48 · source ↗ · unverified
- 16:00 · Whisper-tiny.en · Open ASR Leaderboard · 348.12 · -6051.13 · source ↗ · unverified
- 16:00 · Whisper-tiny.en · Open ASR Leaderboard · 12.8% · -6386.44 · source ↗ · unverified
- 16:00 · Moonshine-tiny · Open ASR Leaderboard · 753.06 · -5646.19 · source ↗ · unverified
- 16:00 · Moonshine-tiny · Open ASR Leaderboard · 12.7% · -6386.60 · source ↗ · unverified
- 16:00 · Moonshine-tiny · Earnings-22 · 20.7% · -31.14 · source ↗ · unverified
- 16:00 · Whisper-tiny.en · TED-LIUM · 6.0% · -11.51 · source ↗ · unverified
- 16:00 · Moonshine-tiny · TED-LIUM · 5.7% · -11.79 · source ↗ · unverified
- 16:00 · Moonshine-tiny · VoxPopuli · 14.1% · -9.75 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · VoxPopuli · 14.0% · -9.84 · source ↗ · unverified
- 16:00 · Whisper-tiny.en · SPGISpeech · 5.9% · -16.89 · source ↗ · unverified
- 16:00 · Whisper-tiny.en · GigaSpeech · 14.1% · -28.34 · source ↗ · unverified
- 16:00 · Moonshine-tiny · GigaSpeech · 14.2% · -28.21 · source ↗ · unverified
- 16:00 · Moonshine-tiny · SPGISpeech · 7.4% · -15.39 · source ↗ · unverified
- 16:00 · Stt_en_conformer_ctc_small · Open ASR Leaderboard · 5686.90 · -712.35 · source ↗ · unverified
- 16:00 · Stt_en_conformer_ctc_small · Open ASR Leaderboard · 11.2% · -6388.09 · source ↗ · unverified
- 16:00 · Stt_en_conformer_ctc_small · AMI-IHM · 20.4% · -26.84 · source ↗ · unverified
- 16:00 · Stt_en_conformer_ctc_small · TED-LIUM · 7.2% · -10.32 · source ↗ · unverified
- 16:00 · Stt_en_conformer_ctc_small · GigaSpeech · 14.5% · -27.96 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · Earnings-22 · 20.2% · -31.68 · source ↗ · unverified
- 16:00 · Stt_en_conformer_ctc_small · SPGISpeech · 7.8% · -15.02 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · AMI-IHM · 19.0% · -28.25 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · Open ASR Leaderboard · 847.20 · -5552.05 · source ↗ · unverified
- 16:00 · Moonshine-streaming-tiny · Open ASR Leaderboard · 12.0% · -6387.25 · source ↗ · unverified
- 16:00 · Whisper-base.en · Open ASR Leaderboard · 320.67 · -6078.58 · source ↗ · unverified
- 16:00 · Whisper-base.en · Open ASR Leaderboard · 10.3% · -6388.93 · source ↗ · unverified
- 16:00 · Moonshine-base · Earnings-22 · 16.9% · -35.02 · source ↗ · unverified
- 16:00 · Whisper-base.en · TED-LIUM · 4.9% · -12.61 · source ↗ · unverified
- 16:00 · Moonshine-base · TED-LIUM · 5.7% · -11.83 · source ↗ · unverified
- 16:00 · Whisper-base.en · VoxPopuli · 9.8% · -14.10 · source ↗ · unverified
- 16:00 · Moonshine-base · SPGISpeech · 5.5% · -17.36 · source ↗ · unverified
- 16:00 · Moonshine-base · GigaSpeech · 12.1% · -30.34 · source ↗ · unverified
- 16:00 · Stt_en_fastconformer_transducer_large · Earnings-22 · 19.4% · -32.46 · source ↗ · unverified
- 16:00 · Stt_en_fastconformer_transducer_large · TED-LIUM · 4.5% · -13.02 · source ↗ · unverified
- 16:00 · Stt_en_fastconformer_transducer_large · VoxPopuli · 6.5% · -17.41 · source ↗ · unverified
- 16:00 · Stt_en_fastconformer_ctc_large · VoxPopuli · 6.3% · -17.52 · source ↗ · unverified
- 16:00 · Moonshine-base · AMI-IHM · 17.5% · -29.78 · source ↗ · unverified
- 16:00 · Stt_en_fastconformer_ctc_large · Open ASR Leaderboard · 6399.25 · NEW SOTA · +1054.11 · source ↗ · unverified
- 16:00 · Stt_en_fastconformer_ctc_large · Open ASR Leaderboard · 9.0% · -5336.18 · source ↗ · unverified
- 16:00 · Distil-medium.en · Open ASR Leaderboard · 279.73 · -5065.41 · source ↗ · unverified
- 16:00 · Distil-medium.en · Open ASR Leaderboard · 8.8% · -5336.38 · source ↗ · unverified
- 16:00 · Niagara-38m-batch.en · TED-LIUM · 5.7% · -11.77 · source ↗ · unverified
- 16:00 · Niagara-38m-batch.en · Earnings-22 · 12.1% · -39.78 · source ↗ · unverified
- 16:00 · Whisper-small.en · TED-LIUM · 4.1% · -13.41 · source ↗ · unverified
- 16:00 · Niagara-38m-batch.en · VoxPopuli · 8.7% · -15.13 · source ↗ · unverified
- 16:00 · Distil-medium.en · AMI-IHM · 16.1% · -31.15 · source ↗ · unverified
- 16:00 · Distil-medium.en · SPGISpeech · 3.8% · -18.99 · source ↗ · unverified
- 16:00 · Whisper-small.en · VoxPopuli · 8.5% · -15.36 · source ↗ · unverified
- 16:00 · Whisper-small.en · GigaSpeech · 11.3% · -31.07 · source ↗ · unverified
- 16:00 · Distil-small.en · SPGISpeech · 3.8% · -19.00 · source ↗ · unverified
- 16:00 · Asr-conformer-loquacious · SPGISpeech · 4.1% · -18.71 · source ↗ · unverified
- 16:00 · Whisper-small.en · Earnings-22 · 13.0% · -38.90 · source ↗ · unverified
- 16:00 · Distil-small.en · Earnings-22 · 13.2% · -38.72 · source ↗ · unverified
- 16:00 · Asr-conformer-loquacious · VoxPopuli · 6.9% · -16.97 · source ↗ · unverified
- 16:00 · Distil-small.en · AMI-IHM · 16.2% · -31.11 · source ↗ · unverified
- 16:00 · Asr-conformer-loquacious · GigaSpeech · 11.4% · -31.06 · source ↗ · unverified
- 16:00 · Asr-conformer-loquacious · Open ASR Leaderboard · 42.2% · -5302.98 · source ↗ · unverified
- 16:00 · Asr-conformer-loquacious · Open ASR Leaderboard · 8.5% · -5336.66 · source ↗ · unverified
- 16:00 · Whisper-medium.en · SPGISpeech · 3.3% · -19.49 · source ↗ · unverified
- 16:00 · Whisper Large · GigaSpeech · 10.8% · -31.66 · source ↗ · unverified
- 16:00 · Whisper Large · SPGISpeech · 3.2% · -19.62 · source ↗ · unverified
- 16:00 · Owsm_ctc_v3.1_1B · Earnings-22 · 13.7% · -38.13 · source ↗ · unverified
Showing the 200 most recent rows. To inspect a single dataset’s history, append ?dataset=ID (e.g. /log?dataset=mmmu). Delta compares each row to the prior-best value on the same dataset at the moment this row was added. Hidden datasets and hidden models are not shown.
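The delta rule and the dataset filter described above can be sketched in a few lines of Python. This is a minimal illustration, not Codesota's implementation. It assumes that a higher value is better, that delta is the row's value minus the prior best, and that the first row on a dataset has no delta and counts as a new best. The function names `annotate` and `log_url` are hypothetical. The three values are taken from the HLE rows in the log, fed oldest first, on the assumption that they were added in that order (the page lists newest first).

```python
from typing import Dict, List, NamedTuple, Optional
from urllib.parse import urlencode


class Annotated(NamedTuple):
    model: str
    dataset: str
    value: float
    delta: Optional[float]  # None when the dataset had no earlier row
    new_sota: bool


def annotate(rows):
    """rows: (model, dataset, value) tuples in insertion order, oldest first."""
    best: Dict[str, float] = {}  # running best per dataset (assumes higher is better)
    out: List[Annotated] = []
    for model, dataset, value in rows:
        prior = best.get(dataset)
        # Delta is measured against the best value seen before this row was added.
        delta = None if prior is None else round(value - prior, 2)
        new_sota = prior is None or value > prior
        if new_sota:
            best[dataset] = value
        out.append(Annotated(model, dataset, value, delta, new_sota))
    return out


def log_url(dataset_id: str) -> str:
    # Build the per-dataset view by appending ?dataset=ID to the log path.
    return "/log?" + urlencode({"dataset": dataset_id})


rows = [
    ("MiMo-V2.5-Pro", "HLE", 48.0),
    ("Kimi K2.6", "HLE", 54.0),
    ("DeepSeek-V4-Pro Max", "HLE", 37.7),
]
result = annotate(rows)
```

Under these assumptions, `result[1].delta` is `6.0` and `result[2].delta` is `-16.3`, matching the +6.00 and -16.30 shown for those rows, and `log_url("mmmu")` returns `/log?dataset=mmmu`.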