
Fish Speech: Open source text-to-speech synthesis

Transformer-based TTS with voice cloning from reference audio.

Stars: 24.5K · Forks: 2.0K · 7-day stars: +29 · 7-day forks: +10

Learn more about fish-speech

Fish Speech is an open-source text-to-speech synthesis system that generates natural speech audio from text input using transformer-based neural network architectures. The system implements voice cloning capabilities by analyzing reference audio samples to extract speaker characteristics, which are then applied during the synthesis process to reproduce the target voice. It processes text through multiple stages including linguistic analysis, acoustic feature prediction, and neural vocoding to produce waveform output. The architecture separates the text-to-acoustic-feature generation from the vocoding stage, allowing for modular optimization of each component in the speech synthesis pipeline.
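
The separation between feature generation and vocoding described above can be illustrated with a small sketch. Everything in it is a hypothetical stand-in (the `Pipeline` class, the toy stage functions, and the data shapes are assumptions made for illustration, not Fish Speech APIs); it only shows how two independent stages can be composed and swapped.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration only; not taken from Fish Speech.
AcousticFeatures = List[List[float]]  # one feature vector per frame
Waveform = List[float]                # mono audio samples


@dataclass
class Pipeline:
    """Composes two independent stages: text -> features -> waveform."""
    text_to_features: Callable[[str], AcousticFeatures]
    vocoder: Callable[[AcousticFeatures], Waveform]

    def synthesize(self, text: str) -> Waveform:
        features = self.text_to_features(text)
        return self.vocoder(features)


def toy_text_to_features(text: str) -> AcousticFeatures:
    # Toy stand-in for a learned model: one 2-dimensional frame per character.
    n = max(len(text), 1)
    return [[ord(ch) / 255.0, i / n] for i, ch in enumerate(text)]


def toy_vocoder(features: AcousticFeatures, samples_per_frame: int = 4) -> Waveform:
    # Toy stand-in for a neural vocoder: repeat the first feature of each frame.
    samples: Waveform = []
    for frame in features:
        samples.extend([frame[0]] * samples_per_frame)
    return samples


def louder_vocoder(features: AcousticFeatures) -> Waveform:
    # A drop-in replacement vocoder; the text stage is left untouched.
    return [2.0 * s for s in toy_vocoder(features)]


pipeline = Pipeline(toy_text_to_features, toy_vocoder)
wave = pipeline.synthesize("hi")
print(len(wave))

# Swapping only the vocoder stage leaves the feature stage as it was.
louder = Pipeline(toy_text_to_features, louder_vocoder).synthesize("hi")
```

Because each stage is just a callable with a fixed input and output type, either one can be replaced or tuned without changing the other, which is the modularity the paragraph describes.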

1. Transformer-based architecture

Uses transformer models for semantic token prediction combined with VQ-VAE quantization, enabling efficient discrete representation of speech content.

2. Emotional speech control

Supports multiple emotional markers and tone specifications during synthesis, allowing fine-grained control over prosody and expression in generated speech.

3. Voice cloning from samples

Enables speaker adaptation through reference audio input, allowing synthesis in arbitrary speaker voices without requiring extensive speaker-specific training data.
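
The discrete representation mentioned in the first item can be made concrete with a minimal vector-quantization sketch. It shows only the lookup step, where each continuous frame is replaced by the index of its nearest codebook entry. The codebook values, frame values, and function names are assumptions chosen for illustration; a real VQ-VAE learns its encoder and codebook jointly, and the quantizer used in Fish Speech may differ.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each frame to the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)),
            key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]


def dequantize(tokens: Sequence[int],
               codebook: Sequence[Sequence[float]]) -> List[List[float]]:
    """Recover an approximation of the frames from their token indices."""
    return [list(codebook[t]) for t in tokens]


# Hypothetical 4-entry codebook over 2-dimensional features.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frames = [[0.1, 0.2], [0.9, 0.1], [0.2, 0.8], [0.95, 0.9]]

tokens = quantize(frames, codebook)
approx = dequantize(tokens, codebook)
print(tokens)
```

A sequence model such as a transformer can then predict these integer tokens instead of raw continuous features, which is what makes the representation compact.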


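# Usage sketch as shown on this page. The TextToSpeech class and its methods
# are not confirmed here; check the repository documentation for the actual API.
from fish_speech import TextToSpeech

tts = TextToSpeech()  # construct the synthesizer
audio = tts.synthesize("Hello, this is a test of Fish Speech synthesis.")  # text in, audio out
audio.save("output.wav")  # write the result to a WAV file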

v1.5.1

Final stable release before next model version; no breaking changes or migration steps documented.

  • Pin to v1.5.1 if you need stability before the upcoming model architecture changes.
  • Release notes do not specify bug fixes, feature additions, or compatibility requirements.

v1.5.0

Fish Speech 1.5 completes both inference and fine-tuning pipelines; release notes do not specify breaking changes or upgrade requirements.

  • Verify inference and fine-tuning workflows function as expected in your environment after upgrading to v1.5.0.
  • Consult repository documentation for API changes or new dependencies, as release notes omit migration details.

v1.4.3

Final stable release in the 1.4 series before the 1.5 release; pin dependencies now if you need stability.

  • Pin to v1.4.3 in production to avoid any breaking changes the 1.5 release may introduce.
  • Release notes do not specify breaking changes, new requirements, or migration steps for this version.
