Text-to-speech · blind preference test

Which voice sounds better?

Listen to both clips reading the same prompt. Every active model uses a male voice, which keeps comparisons cleaner. Pick the one you would rather ship in a real product, then tell us why. Model names stay hidden until results are published. After each vote, the next pair is chosen by what would most sharpen the ranking, so close races and under-tested voices come up more often.
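One way to picture that pair selection is the sketch below. It is an illustration under stated assumptions, not the rule this page actually runs: the function names, the closeness-times-scarcity score, and the 400-point logistic scale are all hypothetical choices. It favors pairs whose predicted outcome is near 50/50 and voices with few votes so far.

```python
import itertools
import math


def expected_score(r_a, r_b):
    """Standard Elo win probability for A over B (400-point logistic scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def pair_priority(a, b, ratings, votes):
    """Hypothetical priority: higher for close races and under-tested voices."""
    p = expected_score(ratings[a], ratings[b])
    closeness = 4.0 * p * (1.0 - p)  # 1.0 when evenly matched, near 0 when lopsided
    # Assumed scarcity bonus: shrinks as each voice accumulates votes.
    scarcity = 1.0 / math.sqrt(1 + votes[a]) + 1.0 / math.sqrt(1 + votes[b])
    return closeness * scarcity


def next_pair(ratings, votes):
    """Return the highest-priority pair of voices (ties go to the first in sorted order)."""
    candidates = itertools.combinations(sorted(ratings), 2)
    return max(candidates, key=lambda ab: pair_priority(ab[0], ab[1], ratings, votes))
```

A production version would more likely sample pairs in proportion to this score instead of always taking the maximum, so that lower-priority pairs still appear occasionally.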

Prompt
The API uses OAuth, JWT, TLS, and HTTP/2.
Voice A

Sample A

← vote
Voice B

Sample B

vote →
Loudness comparison
Overlaid RMS envelopes on a shared time axis. Differences in length, pace, and emphasis show up here before you hear them.
Sample A · Sample B
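For readers curious how an RMS envelope can be computed, here is a minimal sketch. It assumes each clip is already decoded to a list of float samples in the range -1 to 1; the function name and the 50 ms window are illustrative choices, not the settings used by this page. Each frame is reported with its start time in seconds, so two clips of different lengths can be drawn on the same time axis.

```python
import math


def rms_envelope(samples, sample_rate, window_s=0.05):
    """Return (start_time_seconds, rms) for consecutive non-overlapping frames."""
    win = max(1, int(sample_rate * window_s))  # frame length in samples
    envelope = []
    for start in range(0, len(samples), win):
        frame = samples[start:start + win]  # the last frame may be shorter
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        envelope.append((start / sample_rate, rms))
    return envelope
```

A shorter clip simply yields fewer frames, which is how a difference in length or pace becomes visible when the two envelopes are overlaid.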
Keyboard shortcuts
Prefer left (A)
Prefer right (B)
Space: Play / pause active
T: Toggle active player
R: Restart active
S: Toggle spectrogram
C: Toggle A/B compare
Anonymous comparisons: 0
Prompts: 30
Pairwise Elo · K=24 · same-prompt clips only
male voice pool
Provisional preference results

Blind Elo leaderboard

Names stay hidden during voting. Results below are preference ratings only, not the hard-text benchmark. This run uses the male-voice pool only. The interval is a provisional 95% uncertainty band from comparison volume, so low-vote rows should be treated as unstable.

Comparisons: 0
Start Elo: 1500
Rank | Model | Vendor | Elo | 95% CI | Votes | W-L | Signal

Treat rows with low sample counts as unstable. Elo changes quickly until each model has enough pairwise comparisons.
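To show why early ratings move so quickly, here is a minimal sketch of a pairwise Elo update using the K=24 and start rating of 1500 stated on this page. The 400-point logistic scale is the conventional Elo choice and is assumed here, and the function and variable names are illustrative, not taken from the site's code.

```python
from collections import defaultdict

K = 24              # update step size, as stated on this page
START_ELO = 1500.0  # rating assigned to a voice before its first vote


def expected_score(r_a, r_b):
    """Predicted probability that A is preferred over B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def record_vote(ratings, winner, loser, k=K):
    """Apply one preference vote in place and return the points transferred."""
    surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
    delta = k * surprise
    ratings[winner] += delta
    ratings[loser] -= delta
    return delta


# Unseen voices start at START_ELO the first time they are looked up.
ratings = defaultdict(lambda: START_ELO)
```

With two voices both at 1500, a single vote moves each by 12 points, so a handful of early votes can reorder the table. An upset (a low-rated voice beating a high-rated one) transfers close to the full 24 points.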

Beyond this leaderboard

These votes train a model that scores any TTS system

The research note turns this preference signal into five comparable axes (preference, Whisper intelligibility, UTMOS naturalness, speaker similarity, and prosody) and asks which objective metric best predicts what you pick here.

Read the research →