Replies: 8 comments
>>> [8 replies from georroussos, baconator, and nmstoker — comment bodies not preserved in the archive]
>>> baconator
[August 16, 2020, 10:03pm]
I have a 14,000-sentence dataset (clean audio, single speaker, correctly
transcribed) that I'm training a model on. It aligned by 10k steps and is
now just past 90k. For the most part it's sounding good.
Words with 'ah' or ending with a long 'a' tend to have a weird rolled-r
sound after them (it's like pirate speak, but unwanted). I've listened to
sentences in the dataset and added a few to the test sentences,
including words spoken correctly in the source, and they come out as
'ar' when generated. 'Athena sprang from the head of Zeus' would end up
sounding like 'Arthenar sprang...'
I should also add that I've trained with the same config parameters
(other than the dataset) on LJSpeech and not had this issue.
Should I start over? Adjust the files we're using in the dataset? Add even
more sentences with the correct pronunciations? Everything else seems
good, even extremely long sentences.
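In case it helps anyone debugging something similar: one quick sanity check is to count how often word-final 'a' tokens actually occur in the transcripts, to see whether the problem sound is under-represented. Below is a sketch assuming LJSpeech-style pipe-delimited metadata (last field is the normalized text); `count_final_a_words` is an illustrative name, not part of any TTS toolkit.

```python
import csv
import io
import re
from collections import Counter

def count_final_a_words(metadata_text: str) -> Counter:
    """Count words ending in 'a' in pipe-delimited transcript metadata.

    Assumes each line looks like: file_id|...|transcript text
    (LJSpeech-style); the last field is treated as the transcript.
    """
    counts = Counter()
    for row in csv.reader(io.StringIO(metadata_text), delimiter="|"):
        if len(row) < 2:
            continue
        transcript = row[-1].lower()
        # Tokenize on letters/apostrophes, dropping punctuation.
        for word in re.findall(r"[a-z']+", transcript):
            if word.endswith("a"):
                counts[word] += 1
    return counts

sample = (
    "LJ001|Athena sprang from the head of Zeus.\n"
    "LJ002|The pizza was good.\n"
)
print(count_final_a_words(sample))  # e.g. Counter({'athena': 1, 'pizza': 1})
```

If the counts for the problem words are tiny relative to the dataset, adding more sentences with those pronunciations (as suggested above) seems like the cheaper fix compared to starting over.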
[This is an archived TTS discussion thread from discourse.mozilla.org/t/one-phonemes-pronunciation-not-matching-dataset]