New speaker encoder #447
Unanswered
loganhart02
asked this question in
General Q&A
Replies: 2 comments 9 replies
-
I think it would definitely be valuable for some use cases. The current speaker encoder was trained by @mueller91, so he may be able to comment on it better.
-
For a good speaker encoder, a large number of distinct speakers in the dataset is critical. How many speakers does your dataset contain?
9 replies
-
Hey guys! I've been working on improving the multi-speaker models, and I got access to a Spotify podcast dataset (2 TB covering several podcasts). I haven't really dug into it yet because of the startup I'm founding, so I'm not sure how many speakers there really are, but I know these are full-length podcasts, and from what I've listened to they are of good quality too. Do you think training a new speaker encoder on this data would improve on the current one and give us a better multi-speaker TTS system? I wanted to ask before diving deep into preprocessing this massive dataset.