Colab for Synthesis #6

Open
athenasaurav opened this issue Mar 26, 2024 · 10 comments

Comments

@athenasaurav

Hello Everyone,

Here is the Colab for synthesis.

@yiwei0730

@athenasaurav If I want to use an arbitrary speaker's voice, how do I obtain the TextGrid alignment?

@athenasaurav
Author

Hello @yiwei0730

You can use MFA to do this. Please read this blog
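As a rough illustration of the MFA step, here is a minimal sketch that builds and runs an `mfa align` command from Python. It assumes Montreal Forced Aligner is installed and that the pretrained `english_us_arpa` dictionary and acoustic model have already been downloaded with `mfa model download`; whether those models suit your data is an assumption, and the helper names are hypothetical rather than part of this repository.

```python
import shutil
import subprocess
from pathlib import Path


def build_mfa_align_command(corpus_dir, output_dir,
                            dictionary="english_us_arpa",
                            acoustic_model="english_us_arpa"):
    """Return the argv list for `mfa align`.

    Argument order: corpus directory, dictionary, acoustic model, output directory.
    The default model names are an assumption; swap in the ones matching your language.
    """
    return [
        "mfa", "align",
        str(Path(corpus_dir)),
        dictionary,
        acoustic_model,
        str(Path(output_dir)),
    ]


def run_mfa_align(corpus_dir, output_dir, **kwargs):
    """Run the aligner; the TextGrid files are written to output_dir."""
    if shutil.which("mfa") is None:
        raise RuntimeError("mfa not found; install Montreal Forced Aligner first")
    subprocess.run(build_mfa_align_command(corpus_dir, output_dir, **kwargs), check=True)
```

Calling `run_mfa_align("samples", "aligned")` would then align every audio/transcript pair found in `samples`.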

@athenasaurav reopened this on Mar 29, 2024
@yiwei0730

Thanks for your reply. So that means running MFA through datasets_align.sh,
following the same procedure as before with FS2?

@ex3ndr
Owner

ex3ndr commented Apr 1, 2024 via email

@yiwei0730

I'm interested in the model's audio-prompt-free voice generation.
Where does it produce or sample the unique voice features when I don't give it an audio prompt? Can you point me to that part?
I would also like to know whether, if I like a generated voice, I can extract its feature parameters and reuse them to synthesize other sentences in the same voice.

@ex3ndr
Owner

ex3ndr commented Apr 2, 2024 via email

@yiwei0730

yiwei0730 commented Apr 2, 2024

Steve Korshakov wrote (via email): "Yes, you can do this: just zero out the required prompts (do not provide any) and you will get random voices, which you can later use as a prompt."

Good idea, I never thought that a newly generated voice could be used directly as a prompt, haha.
I noticed that each voice is set up with four files: TextGrid, pt, txt, and wav.
The TextGrid is generated from the txt through MFA, and the txt can be produced by speech recognition. How should the pt file be generated?
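For anyone assembling a voice by hand, a small check like the following can show which of the four files are still missing. This is only a sketch: it assumes all four files sit in one folder and share the voice's base name (for example `voice_1.wav`, `voice_1.txt`, `voice_1.TextGrid`, `voice_1.pt`), which is an assumption about the layout, and `missing_voice_files` is a hypothetical helper, not something from the repository.

```python
from pathlib import Path

# The four per-voice file types mentioned in this thread.
VOICE_EXTENSIONS = (".wav", ".txt", ".TextGrid", ".pt")


def missing_voice_files(voices_dir, voice_name):
    """Return the extensions that do not yet exist for `voice_name` in `voices_dir`.

    Assumes every file is named <voice_name><extension> in the same folder.
    """
    folder = Path(voices_dir)
    return [
        ext for ext in VOICE_EXTENSIONS
        if not (folder / f"{voice_name}{ext}").is_file()
    ]
```

An empty list means the wav, txt, TextGrid, and pt files are all present for that voice.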

@ex3ndr
Owner

ex3ndr commented Apr 2, 2024 via email

@yiwei0730

@ex3ndr Do you mean the file is created by this script? https://github.com/ex3ndr/supervoice/blob/master/generate_voices.py

@ex3ndr
Owner

ex3ndr commented Apr 2, 2024

Yes, you need MFA, but you don't need to align the full dataset; you can run it on just the files from your samples.

Yes
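To make the "run only on your samples" idea concrete, here is a minimal sketch that collects a few wav files and their transcripts into a small corpus folder that MFA can align. It relies on MFA's convention of pairing each audio file with a `.lab` (or `.txt`) transcript of the same base name; `prepare_sample_corpus` and the folder names are hypothetical.

```python
import shutil
from pathlib import Path


def prepare_sample_corpus(samples, corpus_dir):
    """Copy sample wavs and write matching .lab transcripts into corpus_dir.

    `samples` maps a wav path to its transcript text.
    Returns the sorted names of the transcript files that were written.
    """
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)
    written = []
    for wav_path, transcript in samples.items():
        wav_path = Path(wav_path)
        # Audio and transcript must share the same base name for MFA to pair them.
        shutil.copy2(wav_path, corpus / wav_path.name)
        lab_path = corpus / f"{wav_path.stem}.lab"
        lab_path.write_text(transcript.strip() + "\n", encoding="utf-8")
        written.append(lab_path.name)
    return sorted(written)
```

The resulting folder can be passed to `mfa align` as the corpus directory, so only those few files are aligned instead of a whole dataset.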
