# Text-to-Speech Recipe

Text-to-speech (TTS), also referred to as speech synthesis, generates a speech signal from an input text. SpeechBrain supports popular TTS models such as Tacotron 2 and vocoders such as HiFi-GAN.
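As a minimal sketch, a Tacotron 2 acoustic model can be chained with a HiFi-GAN vocoder through SpeechBrain's pretrained interface. The checkpoint names below (`speechbrain/tts-tacotron2-ljspeech`, `speechbrain/tts-hifigan-ljspeech`) are assumed from the SpeechBrain Hugging Face hub, and in recent SpeechBrain versions these classes may live under `speechbrain.inference` instead of `speechbrain.pretrained`:

```python
# Minimal TTS sketch: Tacotron 2 (text -> mel spectrogram) + HiFi-GAN (mel -> waveform).
# Checkpoint names are assumptions based on the public SpeechBrain releases.
import torchaudio
from speechbrain.pretrained import Tacotron2, HIFIGAN

tacotron2 = Tacotron2.from_hparams(
    source="speechbrain/tts-tacotron2-ljspeech", savedir="tmp_tts"
)
hifi_gan = HIFIGAN.from_hparams(
    source="speechbrain/tts-hifigan-ljspeech", savedir="tmp_vocoder"
)

# Text -> mel spectrogram
mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb.")

# Mel spectrogram -> waveform
waveforms = hifi_gan.decode_batch(mel_output)

# The LJSpeech models operate at 22.05 kHz
torchaudio.save("example_tts.wav", waveforms.squeeze(1).cpu(), 22050)
```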

# Speech Separation Recipe

SpeechBrain provides a SepFormer model for speech source separation, pretrained on the WHAMR! dataset, a variant of the WSJ0-Mix dataset with ambient noise and reverberation. The model reaches 13.7 dB SI-SNRi on the WHAMR! test set. We recommend exploring the SpeechBrain documentation to learn more.
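A minimal sketch of running the pretrained SepFormer on a two-speaker mixture follows. The `speechbrain/sepformer-whamr` checkpoint name and the 8 kHz sample rate are assumptions based on the public SpeechBrain release for WHAMR!:

```python
# Separate a two-speaker mixture with the pretrained SepFormer (WHAMR!).
# "mixture.wav" is a placeholder path for your own input file.
import torchaudio
from speechbrain.pretrained import SepformerSeparation as separator

model = separator.from_hparams(
    source="speechbrain/sepformer-whamr", savedir="pretrained_models/sepformer-whamr"
)

# Returns a tensor of shape [batch, time, n_sources]
est_sources = model.separate_file(path="mixture.wav")

# The WHAMR! models work at 8 kHz
torchaudio.save("source1_hat.wav", est_sources[:, :, 0].detach().cpu(), 8000)
torchaudio.save("source2_hat.wav", est_sources[:, :, 1].detach().cpu(), 8000)
```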

# Speech Enhancement Recipe

SpeechBrain currently offers several speech enhancement approaches, including spectral masking, spectral mapping, and time-domain enhancement. Separation architectures such as Conv-TasNet, Dual-Path RNN, and SepFormer can also be used for enhancement.
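As a minimal sketch of the spectral-masking approach, a pretrained enhancement model can be applied to a noisy recording. The MetricGAN+ checkpoint name below is an assumption based on the public SpeechBrain release for VoiceBank-DEMAND:

```python
# Enhance a noisy recording with a spectral-masking model (MetricGAN+).
# "noisy.wav" is a placeholder path for your own input file.
import torch
import torchaudio
from speechbrain.pretrained import SpectralMaskEnhancement

enhancer = SpectralMaskEnhancement.from_hparams(
    source="speechbrain/metricgan-plus-voicebank",
    savedir="pretrained_models/metricgan-plus-voicebank",
)

# load_audio resamples to the rate expected by the model (16 kHz for this checkpoint)
noisy = enhancer.load_audio("noisy.wav").unsqueeze(0)

# Lengths are relative: 1.0 means the full length of each item in the batch
enhanced = enhancer.enhance_batch(noisy, lengths=torch.tensor([1.0]))

torchaudio.save("enhanced.wav", enhanced.cpu(), 16000)
```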

# Speaker Recognition Recipe

Speaker recognition is now used in a wide range of practical applications. SpeechBrain offers several approaches for it, including X-vector and ECAPA-TDNN embeddings, PLDA scoring, and contrastive learning.
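A minimal speaker-verification sketch with a pretrained ECAPA-TDNN model is shown below. The checkpoint name is an assumption based on the public SpeechBrain release for VoxCeleb, and the file paths are placeholders:

```python
# Speaker verification: compare two utterances with a pretrained ECAPA-TDNN model.
from speechbrain.pretrained import SpeakerRecognition

verifier = SpeakerRecognition.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained_models/spkrec-ecapa-voxceleb",
)

# Returns a cosine-similarity score and a boolean same-speaker decision
score, same_speaker = verifier.verify_files("speaker_a.wav", "speaker_b.wav")
print(f"score={score.item():.3f}, same speaker: {bool(same_speaker)}")
```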
