
🐸TTS: Text-to-Speech deep learning toolkit

PyTorch toolkit for deep learning text-to-speech synthesis.

Rankings (06:52 AM, steady): Overall #84 · AI & ML #40
Stars: 44.2K · Forks: 5.9K · Downloads: 14 · 7-day stars: +47 · 7-day forks: +11

Learn more about TTS

🐸TTS is a PyTorch-based deep learning library for text-to-speech synthesis that implements multiple model architectures, including Tacotron, Glow-TTS, and XTTS. The toolkit pairs acoustic models, which convert text to mel-spectrograms, with vocoder models such as HiFi-GAN and MelGAN, which convert spectrograms to waveforms. It supports multi-speaker synthesis, voice cloning, voice conversion, and speaker encoding. The library is used in both research and production deployments, and covers over 1,100 languages through integration with Fairseq models.


1. Multi-architecture support

Implements various model architectures including Tacotron, Glow-TTS, XTTS, Tortoise, and Bark, allowing users to select approaches suited to their specific requirements. Integration with Fairseq models provides access to additional language coverage.
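As a rough illustration of choosing an architecture, the sketch below selects a model by its identifier and synthesizes to a file through the `TTS.api.TTS` class. The helper names (`MODEL_NAMES`, `fairseq_model_name`, `synthesize`) are hypothetical, and the identifiers are assumptions based on the project's published model catalogue; check them against the models your installed version actually lists.

```python
# Model identifiers follow the pattern "<type>/<language>/<dataset>/<model>".
# ASSUMPTION: these names match the catalogue of your installed 🐸TTS version.
MODEL_NAMES = {
    "tacotron2": "tts_models/en/ljspeech/tacotron2-DDC",
    "glow-tts": "tts_models/en/ljspeech/glow-tts",
    "xtts": "tts_models/multilingual/multi-dataset/xtts_v2",
}


def fairseq_model_name(iso_code: str) -> str:
    """Build the identifier of a Fairseq VITS model from an ISO 639-3 code."""
    return f"tts_models/{iso_code}/fairseq/vits"


def synthesize(text: str, model_name: str, out_path: str = "output.wav", **kwargs) -> str:
    """Load the named model and write the synthesized speech to out_path.

    Extra keyword arguments (for example speaker_wav and language, which
    multilingual models such as XTTS need) are passed to tts_to_file().
    """
    # Imported lazily so the helpers above work without the package installed.
    from TTS.api import TTS

    tts = TTS(model_name)
    tts.tts_to_file(text=text, file_path=out_path, **kwargs)
    return out_path
```

For example, `synthesize("Hello world.", MODEL_NAMES["glow-tts"])` would use Glow-TTS, while `synthesize("Guten Tag.", fairseq_model_name("deu"))` would use a Fairseq model for German. The first call for a given model downloads its weights.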

2. Voice cloning and conversion

Includes speaker encoder components and voice cloning capabilities that enable synthesis with new speaker characteristics. XTTS supports streaming inference with reported latency under 200ms.
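A minimal sketch of voice cloning with XTTS, assuming a short reference recording of the target speaker is available on disk. The function name `clone_voice` and the default file names are hypothetical; the `speaker_wav` and `language` arguments are the ones `tts_to_file()` accepts for XTTS. This covers file-based synthesis only, not the streaming mode mentioned above.

```python
from pathlib import Path

# ASSUMPTION: this identifier resolves to the latest XTTS v2 model.
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def clone_voice(text: str, reference_wav: str, language: str = "en",
                out_path: str = "cloned.wav") -> str:
    """Synthesize text in the voice of the speaker heard in reference_wav."""
    reference = Path(reference_wav)
    # Fail early, before the (large) model is downloaded or loaded.
    if not reference.is_file():
        raise FileNotFoundError(f"reference clip not found: {reference}")

    # Imported lazily so the path check works without the package installed.
    from TTS.api import TTS

    tts = TTS(XTTS_MODEL)
    tts.tts_to_file(
        text=text,
        speaker_wav=str(reference),  # audio the voice is cloned from
        language=language,           # language code of the text
        file_path=out_path,
    )
    return out_path
```

A call such as `clone_voice("Nice to meet you.", "my_voice.wav")` would write `cloned.wav` in the current directory.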

3. Training and fine-tuning tools

Provides utilities for dataset analysis, curation, and model training from scratch or fine-tuning existing models. Example recipes are available for common datasets like LJSpeech.
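To give a sense of what a basic dataset check looks like, here is a small standalone sketch that summarizes transcript lengths in an LJSpeech-style `metadata.csv` (pipe-separated: clip ID, transcription, normalized transcription). It is not the toolkit's own analysis utility; `transcript_stats` is a hypothetical helper that uses only the standard library.

```python
import csv
import statistics


def transcript_stats(metadata_file):
    """Summarize transcript lengths from an LJSpeech-style metadata file.

    metadata_file is any iterable of text lines, such as an open file.
    """
    # QUOTE_NONE keeps quotation marks inside transcripts as literal text.
    reader = csv.reader(metadata_file, delimiter="|", quoting=csv.QUOTE_NONE)
    lengths = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        # Prefer the normalized transcription (last column) when present.
        text = row[-1] or row[1]
        lengths.append(len(text))
    if not lengths:
        raise ValueError("no transcripts found")
    return {
        "clips": len(lengths),
        "min_chars": min(lengths),
        "max_chars": max(lengths),
        "mean_chars": statistics.mean(lengths),
    }
```

When reading from disk, open the file with `open(path, encoding="utf-8", newline="")` and pass the file object in. Unusually short or long transcripts are often worth inspecting before training.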


pip install TTS

v0.22.0

Adds multi-GPU training for XTTS, studio speakers to open-source XTTS, and fixes Chinese speech pause handling; no breaking changes noted.

  • Enable multi-GPU training for XTTS models to scale training workloads across hardware.
  • Use new studio speaker voices now available in open-source XTTS for improved voice quality.
v0.21.3

Adds a Gradio UI for no-code XTTS fine-tuning; no breaking changes or new requirements noted.

  • Use the new Gradio demo to fine-tune XTTS models without code, runnable locally, on Colab, or on a server.
  • Follow the step-by-step video tutorial or XTTS docs to train custom voice models with your own audio data.
v0.21.2

Adds versioned XTTS model loading and optional sentence splitting; fixes punctuation handling in text preprocessing.

  • Load specific XTTS versions by appending version tags to model names (e.g., `xtts_v2.0.2`) or omit for latest.
  • Set `split_sentences=False` in `tts_to_file()` to disable automatic sentence splitting and apply custom text logic.
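The two options above can be combined roughly as follows. This is a sketch under assumptions: the helpers `xtts_model_name`, `split_on_lines`, and `synthesize_lines` are hypothetical, the model path prefix is assumed from the project's naming scheme, and splitting on line breaks stands in for whatever custom text logic you need.

```python
def xtts_model_name(tag: str = "xtts_v2") -> str:
    """Return the model identifier for an XTTS tag.

    "xtts_v2" requests the latest release; a tag such as "xtts_v2.0.2"
    pins a specific version.
    """
    return f"tts_models/multilingual/multi-dataset/{tag}"


def split_on_lines(text: str) -> list[str]:
    """Custom splitting rule: one chunk per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def synthesize_lines(text: str, speaker_wav: str, language: str = "en",
                     tag: str = "xtts_v2", out_prefix: str = "line") -> list[str]:
    """Synthesize each line to its own file, bypassing built-in sentence splitting."""
    # Imported lazily so the helpers above work without the package installed.
    from TTS.api import TTS

    tts = TTS(xtts_model_name(tag))
    paths = []
    for i, chunk in enumerate(split_on_lines(text)):
        path = f"{out_prefix}_{i:03d}.wav"
        tts.tts_to_file(
            text=chunk,
            speaker_wav=speaker_wav,
            language=language,
            file_path=path,
            split_sentences=False,  # the chunk is synthesized as a single unit
        )
        paths.append(path)
    return paths
```

Pinning a tag such as `xtts_v2.0.2` keeps output reproducible across releases, at the cost of missing later fixes.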

