Is this AlignTTS training healthy? #580
-
You also need to post the figures for the alignment and the outputs.
-
You're right. Here's the current status at epoch 2925 (eval). Here's the test audio from the output folder.
-
Here's the final output after training stopped. Here's the final audio sample. Another question: I was reading the AlignTTS paper, and they were able to train on 2x Tesla V100 GPUs with batch size 16, whereas I am unable to train on 4x of the same GPUs with anything more than batch size 8. Any ideas?
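One possible factor in the batch size difference, offered as a guess rather than a diagnosis: at a fixed hop length, 48 kHz audio produces more than twice as many spectrogram frames per second as 22.05 kHz audio, so each sample in a batch costs more memory. The sketch below only does that arithmetic. The 5-second clip, the 22.05 kHz comparison rate, and the hop of 256 samples are illustrative assumptions, not values taken from this thread or from the paper.

```python
import math


def num_frames(duration_s: float, sample_rate: int, hop_length: int) -> int:
    """Approximate number of spectrogram frames for a clip (padding ignored)."""
    return math.ceil(duration_s * sample_rate / hop_length)


# Hypothetical values: the same 5-second clip and hop length at two sample rates.
frames_22k = num_frames(5.0, 22050, 256)
frames_48k = num_frames(5.0, 48000, 256)
ratio = frames_48k / frames_22k

print(frames_22k, frames_48k, round(ratio, 2))  # 431 938 2.18
```

If the hop length in the config is scaled up with the sample rate, the frame counts come out much closer, so this is worth checking against the actual audio settings.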
-
I think your spectrogram parameters are broken. Even the ground truth specs look unusual. Can you post the audio parameters from your config?
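A minimal sketch of the kind of consistency check that can catch broken spectrogram settings before training. The key names (`sample_rate`, `fft_size`, `win_length`, `hop_length`, `mel_fmin`, `mel_fmax`) and the example values are assumptions for illustration, so adjust them to whatever your config actually uses. The function `check_audio_params` is hypothetical, not part of any library.

```python
def check_audio_params(audio: dict, data_sample_rate: int) -> list:
    """Return human-readable problems found in an audio parameter dict."""
    problems = []
    sr = audio["sample_rate"]
    if sr != data_sample_rate:
        problems.append(
            f"sample_rate {sr} does not match the dataset rate {data_sample_rate}"
        )
    if audio["win_length"] > audio["fft_size"]:
        problems.append("win_length is larger than fft_size")
    if audio["hop_length"] > audio["win_length"]:
        problems.append("hop_length is larger than win_length, so frames skip audio")
    fmax = audio.get("mel_fmax")
    if fmax is not None:
        if fmax > sr / 2:
            problems.append(f"mel_fmax {fmax} is above the Nyquist frequency {sr / 2}")
        if audio.get("mel_fmin", 0) >= fmax:
            problems.append("mel_fmin is not below mel_fmax")
    return problems


# Hypothetical config values, checked against a 48 kHz dataset.
example = {
    "sample_rate": 22050,
    "fft_size": 1024,
    "win_length": 1024,
    "hop_length": 256,
    "mel_fmin": 0,
    "mel_fmax": 8000,
}
problems = check_audio_params(example, data_sample_rate=48000)
for problem in problems:
    print(problem)
```

An empty list only means these basic constraints hold. It does not prove the settings are a good fit for the voice or the vocoder.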
-
Also try this for the audio
-
How large is your dataset?
-
I am training AlignTTS on my custom dataset, which follows the same format as the first speaker of the VCTK dataset, p225 (mono, 48 kHz).
I am training with the default learning rate, 1e-4, from aligntts_transformers.json.
I just wanted to check whether the training below, at epoch 2375 out of 10K with batch size 8 on 4x GPUs, is going well.
Is there anything to change?
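For reference, a small sketch of how the "mono, 48 kHz" claim can be verified on the files themselves, using only Python's standard `wave` module. It works for uncompressed PCM WAV files only. The helper names `wav_format` and `is_mono_48k` are hypothetical, as is any path you pass in.

```python
import wave


def wav_format(path: str) -> tuple:
    """Return (channels, sample_rate_hz, bits_per_sample) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getnchannels(), wf.getframerate(), wf.getsampwidth() * 8


def is_mono_48k(path: str) -> bool:
    """True if the file has one channel and a 48000 Hz sample rate."""
    channels, rate, _bits = wav_format(path)
    return channels == 1 and rate == 48000
```

Running `is_mono_48k` over every file in the dataset folder and listing the ones that return `False` is a quick way to spot stray stereo or resampled clips.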
Eval stats
Training set stats