Polish TTS pronunciation #474

Open
xdax1 opened this issue Jan 2, 2025 · 12 comments


xdax1 commented Jan 2, 2025

Hello, is it possible to add Applio TTS to AllTalk TTS?
I care most about the API, which Applio unfortunately does not have, and I do not know how to create one.

Owner

erew123 commented Jan 2, 2025

Technically speaking Applio is not a TTS engine in and of itself, just like AllTalk is not a TTS engine in and of itself. They would be classed as competing products, like saying Windows, Linux and Mac OS. So your question is like asking if I can put Windows in Mac OS.

Is there a specific feature of it you are asking for? Here is what is currently planned: #74. The next release will have RVC training, and also Opensea and OpenVoice. Outside of those, if you can tell me what it is you are asking for, then perhaps I can look at it at some point.

Author

xdax1 commented Jan 2, 2025

@erew123 Oh, I thought Applio was just another model for creating TTS and that it could be implemented in AllTalk TTS just like Piper, VITS, etc. The problem I have with all models is that in Polish text, English words such as names of people and places are spoken as they are written, not as they are pronounced in English.
Applio pronounces everything much better, but it has no API, so I cannot automate the creation of a voiceover with my script.
That's why I thought it might be possible to add it as another model choice and use the API that AllTalk TTS has.

Owner

erew123 commented Jan 5, 2025

Have you tried the Polish-specific TTS voices in Piper or VITS?

[Two screenshots attached]

Do they give you any better results?

Author

xdax1 commented Jan 5, 2025

Yes, I checked every engine and they all have the same problem.
In Piper and VITS I have an additional problem because I don't know how to upload my own voice there (so I tested with the existing voices). In XTTS I trained a model, but it is a different type from the ones Piper and VITS use.
I don't know exactly how to deal with this; the same problem shows up in every TTS engine.
For example, the voiceover pronounces the name “Jonny” as “jonny” (as it is written, not as it should be pronounced in English) rather than “Dzoni”.

Owner

erew123 commented Jan 5, 2025

With Piper and VITS, the language comes from the model/voice you load (such as those shown above), so they should automatically speak Polish (i.e., you cannot specify the language for them; they simply are the language of the voice you load).

With XTTS, you would need to specify pl as the language so it uses the correct tokenizer settings. I'm not saying you haven't done that or tried that, I'm just covering all bases here.
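As a rough illustration of passing `pl` from a script, here is a minimal sketch of building a request for AllTalk's TTS generation API. The endpoint path, port, and form field names (`text_input`, `character_voice_gen`, `language`, `output_file_name`) are assumptions based on a default local install, and the voice file name is hypothetical; check them against the API documentation for your AllTalk version before relying on this.

```python
from urllib import parse, request

# Assumed default address of a local AllTalk server; adjust to your setup.
ALLTALK_URL = "http://127.0.0.1:7851/api/tts-generate"


def build_tts_form(text: str, voice: str, language: str = "pl") -> dict:
    """Build the form fields for one TTS request (field names are assumptions)."""
    return {
        "text_input": text,
        "character_voice_gen": voice,
        "language": language,  # "pl" selects the Polish tokenizer settings in XTTS
        "output_file_name": "voiceover",
    }


def send_tts_request(form: dict, url: str = ALLTALK_URL) -> bytes:
    """POST the form as URL-encoded data and return the raw response body."""
    data = parse.urlencode(form).encode("utf-8")
    with request.urlopen(request.Request(url, data=data), timeout=60) as resp:
        return resp.read()


# "my_voice.wav" is a hypothetical voice sample name.
form = build_tts_form("Jonny pojechał do Nowego Jorku.", "my_voice.wav")
print(form["language"])
```

Nothing is sent until `send_tts_request(form)` is called, so the form can be inspected first to confirm the language field is set as intended.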

Applio uses Edge TTS; you can test the voices here: https://huggingface.co/spaces/innoai/Edge-TTS-Text-to-Speech

Does that TTS engine do what you want?

Owner

erew123 commented Jan 5, 2025

@xdax1 Just to be clear, the reason I am asking if it does what you want is that I may be able to add that engine quite easily (when I get the chance).

Author

xdax1 commented Jan 5, 2025

Yes! I checked Edge TTS and it says most of the English words correctly. Will I be able to add other voices there?

Owner

erew123 commented Jan 5, 2025

So with Edge TTS, I think the actual TTS engine has hard-coded voices.

However, beyond that you can use RVC to alter the pitch to make it sound like someone specific and train a voice into that, though it's not going to change the pronunciation of words; that's down to the underlying TTS engine.

RVC is a voice changer/transformer, so it makes speech or TTS sound like another person.

I have working code to train your own RVC voice, but I haven't had time to finish it just yet, as it's been a large code base update. I will also be adding OpenVoice, which is the same as RVC but requires no pre-training and will just use a wav/audio file, like XTTS does. Again, though, the underlying TTS engine will need to pronounce things the way you want them spoken for your language, hence me asking if Edge is doing what you want with Polish.

Author

xdax1 commented Jan 5, 2025

Yes, Edge works.
So I'm waiting for the update, good luck :)


S-T-K commented Jan 7, 2025

@erew123 Any ETA on the next update? RVC model training and Openvoice are 2 incredibly amazing features, wow! Can't wait to try them. And RVC training would actually come in handy right now, otherwise I would have to do that in Applio for the time being.

Oh, and thank you so much for your incredible work! Alltalk is unbelievably useful and polished!

Owner

erew123 commented Jan 7, 2025

@S-T-K No hard date yet. Hopefully before the end of the month, but I had to leave coding pre-Christmas to deal with a house break-in. I didn't get back until a couple of days ago. I'm trying to catch up with PRs and support issues, and then I have to try to get my head back into where I was in the code pre-Christmas, with one additional problem: I have to travel again on the 10th and that will be at least 7 days away. The code is 80-85% done, but I can't recall exactly where I'm at with it, and that final 10-15% can always be a sod to complete when something goes wrong. So by the end of the month (hopefully).

erew123 changed the title from "applio TTS" to "Polish TTS pronunciation" on Jan 7, 2025

S-T-K commented Jan 7, 2025

Wow, that's soon! Guess I'll shuffle around some tasks to postpone the RVC part of my project till then, good to know.
And sorry about the break-in, damn. Hope you're fine!
