Replies: 10 comments (from baconator, nmstoker, rdh, dkreutz, othiele, and georroussos)
-
>>> rdh
[July 7, 2020, 5:47pm]
I decided to create my own dataset as well. Starting from my own desk with
a crappy headset microphone, I soon moved on to more professional
methods.
In the end I hired two male voice talents, who will each provide me
with 20-25 hours of Belgian Dutch voice data over the course of the
coming two months. My aim is to create other voices from this data as
well, hopefully with a minimum of data. I asked them to record in mono
WAV format, 44.1 kHz and 16-bit audio.
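Since the recordings will come in from two different talents over several weeks, it can help to automatically check that every delivered file really is mono, 44.1 kHz, 16-bit PCM before it goes into the dataset. A minimal sketch of such a check, assuming the `soundfile` package; the `recordings/` folder name is a placeholder:

```python
# Minimal sketch: flag files that do not match the requested format
# (mono WAV, 44.1 kHz, 16-bit PCM). The recordings/ folder is a placeholder.
from pathlib import Path

import soundfile as sf

EXPECTED_RATE = 44100        # 44.1 kHz
EXPECTED_CHANNELS = 1        # mono
EXPECTED_SUBTYPE = "PCM_16"  # 16-bit audio

def check_wav(path: Path) -> list[str]:
    """Return a list of mismatches between the file and the requested spec."""
    info = sf.info(str(path))
    problems = []
    if info.samplerate != EXPECTED_RATE:
        problems.append(f"sample rate {info.samplerate} Hz")
    if info.channels != EXPECTED_CHANNELS:
        problems.append(f"{info.channels} channels")
    if info.subtype != EXPECTED_SUBTYPE:
        problems.append(f"subtype {info.subtype}")
    return problems

if __name__ == "__main__":
    for wav in sorted(Path("recordings").glob("*.wav")):
        issues = check_wav(wav)
        if issues:
            print(f"{wav.name}: {', '.join(issues)}")
```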
Should I train two separate Tacotron2 models, check which one is most
suitable, and use transfer learning, or is the current state of
multi-speaker training good enough and easier to work with for
generating future voices?
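For the transfer-learning route, the usual pattern with Mozilla TTS has been to resume training from an existing checkpoint via `--restore_path` while pointing the config at the new speaker's data. A hedged sketch; the script path, config file, and checkpoint name below are placeholders and depend on the TTS version in use:

```bash
# Hedged sketch of fine-tuning a single speaker from a pretrained checkpoint.
# Script path, config file, and checkpoint name are placeholders; adjust them
# to the Mozilla TTS version you are actually running.
python TTS/bin/train_tacotron.py \
    --config_path config_speaker1.json \
    --restore_path pretrained/tacotron2_checkpoint.pth.tar
```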
Are there any other tips or suggestions which I should think about?
Any help or input is appreciated.
[This is an archived TTS discussion thread from discourse.mozilla.org/t/multispeaker-versus-transfer-learning]