What is the best open-source TTS model in 2026?

Sesame CSM is the highest-ranked open-source TTS model in the CodeSOTA catalog by MOS. Use the table to choose between naturalness, voice cloning, local deployment and latency.

Which open-source TTS model is best for local deployment?

Kokoro v1.0 (82M parameters, MOS 4.5) is the best lightweight starting point in this catalog. Piper (~20M parameters, MOS 3.6) remains the practical CPU-first option when speed and a small footprint matter more than voice naturalness.

Is open-source TTS as good as commercial TTS?

Open-source TTS comes close to commercial quality for many short-form and self-hosted uses, but commercial APIs still tend to lead on managed voice libraries, streaming infrastructure, and production voice-cloning workflows.

Updated May 13, 2026

§ Open models

Best open-source TTS models.

Start with Sesame CSM when naturalness matters, Kokoro v1.0 when you need a small local model, XTTS v2 for voice cloning, and Piper when CPU latency matters more than polish.

§ 01 · Ranking

Open-source TTS shortlist

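Ranked by MOS in the shared CodeSOTA TTS catalog. MOS (mean opinion score, a subjective listening rating conventionally given on a 1-to-5 scale) is a starting signal; the deployment choice also depends on license, footprint, languages, inference speed, and whether voice cloning is central.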

Rank	Model	Year	Evaluation	Vendor	MOS	Best fit	Architecture	Params
1	Sesame CSM	2025	Subjective	Sesame	4.7	dialogue and agents	Conversational Speech Model	1B+
2	Fish Audio S2 Pro	2026	Subjective (arXiv:2603.08823)	Fish Audio	4.6	multilingual apps	Dual-autoregressive transformer + RVQ audio codec	5B
3	Orpheus TTS	2025	Subjective	Canopy Labs	4.6	style control	LLM-based (Llama backbone)	3B
4	Kokoro v1.0	2025	Subjective	Hexgrad	4.5	edge and CPU	Lightweight autoregressive	82M
5	XTTS v2	2024	VCTK / Subjective	Coqui	4.5	voice cloning	GPT-like + VITS decoder	467M
6	Fish Speech 1.5	2025	Subjective	Fish Audio	4.4	multilingual apps	VQGAN + Transformer	500M
7	F5-TTS	2024	Subjective	Shanghai AI Lab	4.4	voice cloning	Flow-matching (non-autoregressive)	335M
8	Dia 1.6B	2025	Subjective	Nari Labs	4.3	dialogue and agents	Transformer + non-verbal tokens	1.6B
9	Spark-TTS	2025	Subjective	SparkAudio	4.3	multilingual apps	Controllable Transformer	500M
10	Supertonic 3	2026	Subjective	Supertone	4.2	local TTS	ONNX Runtime local inference	99M
11	Parler-TTS	2025	Subjective	Hugging Face	4.1	local TTS	Prompt-controlled Transformer	880M
12	Piper	2023	Subjective / Raspberry Pi	Rhasspy	3.6	edge and CPU	VITS (lightweight)	~20M
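
The shortlist can also be filtered by constraint rather than read top to bottom. The following is a minimal sketch, assuming the rows of the table are copied into a list by hand. The `Model` dataclass and the `shortlist` function are illustrative names, not part of any CodeSOTA API, and parameter counts are given in millions with "1B+" approximated as 1000 and "~20M" as 20.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    mos: float
    best_fit: str
    params_m: float  # parameters in millions; "1B+" and "~20M" are approximated


# Rows transcribed by hand from the shortlist table.
CATALOG = [
    Model("Sesame CSM", 4.7, "dialogue and agents", 1000),
    Model("Fish Audio S2 Pro", 4.6, "multilingual apps", 5000),
    Model("Orpheus TTS", 4.6, "style control", 3000),
    Model("Kokoro v1.0", 4.5, "edge and CPU", 82),
    Model("XTTS v2", 4.5, "voice cloning", 467),
    Model("Fish Speech 1.5", 4.4, "multilingual apps", 500),
    Model("F5-TTS", 4.4, "voice cloning", 335),
    Model("Dia 1.6B", 4.3, "dialogue and agents", 1600),
    Model("Spark-TTS", 4.3, "multilingual apps", 500),
    Model("Supertonic 3", 4.2, "local TTS", 99),
    Model("Parler-TTS", 4.1, "local TTS", 880),
    Model("Piper", 3.6, "edge and CPU", 20),
]


def shortlist(best_fit=None, max_params_m=None, min_mos=0.0):
    """Return catalog models matching the constraints, highest MOS first."""
    hits = [
        m for m in CATALOG
        if (best_fit is None or m.best_fit == best_fit)
        and (max_params_m is None or m.params_m <= max_params_m)
        and m.mos >= min_mos
    ]
    # sorted() is stable, so models with equal MOS keep their catalog order.
    return sorted(hits, key=lambda m: m.mos, reverse=True)


if __name__ == "__main__":
    # Models of at most 500M parameters with MOS of 4.4 or higher.
    for m in shortlist(max_params_m=500, min_mos=4.4):
        print(f"{m.name}: MOS {m.mos}, {m.params_m:g}M params")
```

A filter like this only narrows the field. License, language coverage, and measured latency on your own hardware are not in the table and still need checking.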

Choose open-source when

You need local inference, private audio handling, predictable marginal cost, model-level control, or custom fine-tuning.

Choose API when

You need managed voices, streaming infrastructure, uptime, commercial voice-cloning flows, or the fastest path to a production voice agent.
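
The cost side of this choice can be estimated with simple arithmetic: self-hosting trades a fixed monthly cost for a lower per-character cost. The sketch below is a rough model only. The function name `break_even_chars` and every figure in the example are hypothetical, and the model ignores engineering time, idle capacity, and volume discounts.

```python
def break_even_chars(fixed_monthly_cost, api_price_per_million,
                     self_hosted_price_per_million=0.0):
    """Monthly characters above which self-hosting costs less than the API.

    Returns None when the API is never more expensive per character.
    """
    saving_per_million = api_price_per_million - self_hosted_price_per_million
    if saving_per_million <= 0:
        return None
    return fixed_monthly_cost / saving_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical figures: 300 per month for a GPU host,
    # 15 per million characters through an API.
    print(break_even_chars(300, 15))  # 20000000.0
```

Below the break-even volume the API is likely cheaper. Above it, the fixed cost of self-hosting is spread over enough audio to pay off.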

Validate before shipping

Synthesize the names, numbers, dates, URLs, acronyms, and domain terms that appear in your real prompts, then check that each one is spoken correctly. Naturalness does not guarantee information fidelity.
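
One way to automate part of this check is a round trip: synthesize each prompt, transcribe the audio with a speech recognizer, and compare the transcript to the input by word error rate. The following is a minimal sketch, assuming you supply your own `synthesize` and `transcribe` callables. Both are hypothetical placeholders for whatever TTS and ASR you use, and the 0.1 threshold is illustrative.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and split into words."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split()


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,              # deletion
                cur[j - 1] + 1,           # insertion
                prev[j - 1] + (r != h),   # substitution or match
            ))
        prev = cur
    return prev[-1] / max(len(ref), 1)


def failing_prompts(prompts, synthesize, transcribe, max_wer=0.1):
    """Return (prompt, transcript, wer) for prompts that exceed max_wer.

    synthesize(text) -> audio and transcribe(audio) -> text are supplied
    by the caller; they are placeholders, not a real library API.
    """
    failures = []
    for prompt in prompts:
        transcript = transcribe(synthesize(prompt))
        wer = word_error_rate(prompt, transcript)
        if wer > max_wer:
            failures.append((prompt, transcript, wer))
    return failures


if __name__ == "__main__":
    # Stub callables that simulate a misread year, for demonstration only.
    fake_tts = lambda text: text
    fake_asr = lambda audio: audio.replace("2026", "2025")
    for prompt, heard, wer in failing_prompts(
        ["Updated May 13, 2026", "hello world"], fake_tts, fake_asr
    ):
        print(f"{wer:.2f}  {prompt!r} -> {heard!r}")
```

A recognizer may write "4:30 PM" as "four thirty pm", so a real pipeline needs text normalization on both sides before comparing, and flagged prompts are best confirmed by listening.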

Related CodeSOTA pages

Full TTS registry

TTS models guide

TTS intelligibility benchmark

Best TTS for real-time