Text-to-speech · blind preference test

Which voice sounds better?

Listen to both clips, which read the same prompt. Every active model uses a male voice, so differences in voice gender do not skew the comparison. Pick the clip you would rather ship in a real product, then tell us why. Model names stay hidden until results are published. After each vote, the next pair is chosen to sharpen the ranking as much as possible: close races and under-tested voices come up more often.
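
The page does not say how the next pair is chosen, so the following is only a sketch of one way "close races and under-tested voices come up more often" could work. The weighting formula, the 400-point logistic scale, and the names `pair_weight` and `choose_pair` are all assumptions made for illustration, not the site's actual method.

```python
import itertools
import random

def pair_weight(r_a, r_b, n_a, n_b):
    """Hypothetical sampling weight for a pair of voices.

    r_a, r_b: current Elo ratings; n_a, n_b: votes each voice has received.
    """
    # Predicted chance that A beats B (standard Elo logistic curve, assumed).
    p_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    # 1.0 for an even match, falling toward 0 for a lopsided one.
    closeness = 4.0 * p_a * (1.0 - p_a)
    # Larger when either voice has few votes (assumed form).
    novelty = (1 + n_a) ** -0.5 + (1 + n_b) ** -0.5
    return closeness * (1.0 + novelty)

def choose_pair(models, rng=random):
    """Pick two distinct voices; models maps name -> (rating, votes)."""
    pairs = list(itertools.combinations(models, 2))
    weights = [pair_weight(models[a][0], models[b][0], models[a][1], models[b][1])
               for a, b in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]
```

Sampling in proportion to the weight, rather than always taking the top-weighted pair, keeps some variety in what voters hear.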

Prompt

The API uses OAuth, JWT, TLS, and HTTP/2.

Voice A

Sample A

Voice B

Sample B

Loudness comparison

Overlaid RMS envelopes on a shared time axis. Differences in length, pace, and emphasis show up here before you hear them.
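
As a rough illustration of what an RMS envelope is, the sketch below computes one over short sliding windows. The window and hop lengths, and the function name `rms_envelope`, are assumptions chosen for the example; the page does not state the parameters it uses.

```python
import math

def rms_envelope(samples, sample_rate, window_s=0.05, hop_s=0.025):
    """Return (times, rms_values) for a mono signal given as a list of floats.

    window_s and hop_s are illustrative defaults, not the site's settings.
    """
    if not samples:
        return [], []
    window = max(1, round(window_s * sample_rate))
    hop = max(1, round(hop_s * sample_rate))
    times, values = [], []
    # Slide a fixed-length window across the clip; clips shorter than one
    # window are treated as a single frame.
    last_start = max(0, len(samples) - window)
    for start in range(0, last_start + 1, hop):
        frame = samples[start:start + window]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        times.append(start / sample_rate)  # frame start, in seconds
        values.append(rms)
    return times, values
```

Plotting the two clips' envelopes against the same time axis makes a longer clip, or one with longer pauses, visible as a wider or gappier curve.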

Keyboard shortcuts

←: Prefer left (A)

→: Prefer right (B)

Space: Play / pause active

T: Toggle active player

R: Restart active

S: Toggle spectrogram

C: Toggle A/B compare

Anonymous comparisons: 0

Prompts: 30

Pairwise Elo · K=24 · same-prompt clips only

male voice pool

Provisional preference results

Blind Elo leaderboard

Names stay hidden during voting. Results below are preference ratings only, not the hard-text benchmark. This run uses the male-voice pool only. The interval is a provisional 95% uncertainty band from comparison volume, so low-vote rows should be treated as unstable.
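
The page does not give the formula behind the 95% band. One way such a volume-based interval could be approximated, shown purely as an assumption, is a binomial standard error on a model's win rate converted to Elo points through the slope of the standard logistic Elo curve. The function name `elo_half_width` and the clamping bounds are hypothetical.

```python
import math

def elo_half_width(wins, losses, z=1.96):
    """Approximate 95% half-width, in Elo points, from a win/loss record.

    Delta-method sketch (assumed, not the site's method): the standard error
    of the win rate p is sqrt(p(1-p)/n), and the Elo curve's slope gives
    dR/dp = 400 / (ln(10) * p * (1 - p)).
    """
    n = wins + losses
    if n == 0:
        return math.inf  # no votes, no information
    # Clamp p away from 0 and 1 so the approximation stays finite.
    p = min(max(wins / n, 0.05), 0.95)
    return z * 400.0 / (math.log(10) * math.sqrt(n * p * (1.0 - p)))
```

Under these assumptions the band narrows with the square root of the vote count, so quadrupling a model's votes roughly halves its interval.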

Comparisons

Start Elo: 1500

Rank	Model	Vendor	Elo	95% CI	Votes	W-L	Signal

Elo ratings move quickly until each model has accumulated enough pairwise comparisons, so rankings among low-vote rows can still change.
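
To show why early ratings swing so much, here is a minimal sketch of a pairwise Elo update using the page's stated K=24 and start rating of 1500. The 400-point logistic scale is the conventional Elo choice and is assumed here; the function names are illustrative.

```python
K = 24              # update step size, as stated on the page
START_ELO = 1500.0  # rating assigned to a model with no votes

def expected_score(r_a, r_b):
    """Predicted probability that A is preferred over B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def record_vote(ratings, winner, loser, k=K):
    """Update ratings in place after one same-prompt comparison."""
    r_w = ratings.get(winner, START_ELO)
    r_l = ratings.get(loser, START_ELO)
    # The winner gains more when the result was less expected.
    delta = k * (1.0 - expected_score(r_w, r_l))
    ratings[winner] = r_w + delta
    ratings[loser] = r_l - delta
    return ratings

ratings = record_vote({}, "voice_x", "voice_y")  # hypothetical model names
# Two unrated models are an even match, so the winner moves up by K/2 = 12.
```

A single vote between evenly matched models shifts each rating by 12 points, which is why a model with only a handful of comparisons can jump several ranks at once.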

Beyond this leaderboard

These votes train a model that scores any TTS system

The research note turns this preference signal into five comparable axes — preference, Whisper intelligibility, UTMOS naturalness, speaker similarity, and prosody — and asks which objective metric best predicts what you pick here.
