add support for Kokoro via Replicate #1153

Merged
merged 9 commits into from
Feb 4, 2025

Conversation

Vaibhavs10
Member

No description provided.

@@ -22,6 +22,7 @@ export const REPLICATE_SUPPORTED_MODEL_IDS: ProviderMapping<ReplicateId> = {
},
"text-to-speech": {
"OuteAI/OuteTTS-0.3-500M": "jbilcke/oute-tts:39a59319327b27327fa3095149c5a746e7f2aee18c75055c3368237a6503cd26",
"hexgrad/Kokoro-82M": "jaaari/kokoro-82m:dfdf537ba482b029e0a761699e6f55e9162cfd159270bfe0e44857caa5f275a6"
Member

need to add a test but you'll see, you'll need to change the param name

I was meaning to ask @jbilcke-hf to update the param name in jbilcke/oute-tts to text instead of inputs

@hanouticelina
Contributor

hanouticelina commented Jan 29, 2025

Not sure this will work: the input payload schema differs between text-to-speech models on Replicate.
jbilcke/oute-tts expects the prompt in input["inputs"], whereas jaaari/kokoro-82m requires it in input["text"], and the implementation currently uses input["inputs"] (see here).
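
To make the mismatch concrete, here is a minimal sketch of how the prompt field could be chosen per Replicate model. The function name, the map, and the error message are hypothetical and are not the actual code in packages/inference; only the two model names and their field names come from this thread.

```typescript
// Hypothetical sketch: illustrative names, not the real provider implementation.
type TtsPromptKey = "inputs" | "text";

// Field each Replicate model reads the prompt from, as described in this thread.
const PROMPT_KEY_BY_MODEL = new Map<string, TtsPromptKey>([
	["jbilcke/oute-tts", "inputs"],
	["jaaari/kokoro-82m", "text"],
]);

function buildTtsPayload(replicateId: string, prompt: string): { input: Record<string, string> } {
	// Replicate IDs may carry a ":<version hash>" suffix; keep only "owner/name".
	const colon = replicateId.indexOf(":");
	const modelName = colon === -1 ? replicateId : replicateId.slice(0, colon);

	const key = PROMPT_KEY_BY_MODEL.get(modelName);
	if (key === undefined) {
		throw new Error(`No known prompt field for Replicate model ${modelName}`);
	}
	// The prompt goes under whichever field this particular model expects.
	return { input: { [key]: prompt } };
}
```

With this sketch, buildTtsPayload("jaaari/kokoro-82m", "hello") places the prompt under input.text, while the oute-tts ID places it under input.inputs. A per-model table like this is only one option; the thread instead settles on standardizing on text and dropping the oute-tts mapping for now.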

Member Author

@Vaibhavs10 Vaibhavs10 left a comment

It was too easy to be true. Yeah, will update the cog ID to a different one to match the spec!

@julien-c
Member

yes, imo you can remove the oute-tts one for now (but keep the test as skipped and ask if @jbilcke-hf can update his replicate model in the future) and change the param name to text

@julien-c
Member

https://replicate.com/jbilcke/oute-tts was not Warm anyways so not great UX (a bit slow to start)

@hanouticelina
Contributor

hanouticelina commented Jan 29, 2025

> yes, imo you can remove the oute-tts one for now (but keep the test as skipped and ask if @jbilcke-hf can update his replicate model in the future) and change the param name to text

I'll take care of the equivalent change on the Python side. Also, it seems that more text-to-speech models use the text parameter instead of inputs, e.g. we will be able to add xtts-v2 (replicate link, HF link) as well.
EDIT: xtts-v2 does not support the same API either, since it's a text-to-speech voice cloning model.

Member

@julien-c julien-c left a comment

Approving, modulo my comment

packages/inference/src/providers/replicate.ts (outdated, resolved)
const res = await client.textToSpeech({
model: "hexgrad/Kokoro-82M",
provider: "replicate",
text: "Kokoro is a frontier TTS model for its size of 1 Billion parameters",
Contributor

Does it accept parameters?
Would be nice to add some to make sure the payload has the proper shape

Contributor

yes, but it does not accept any of the parameters listed in TextToSpeechParameters (see the schema of the model here). Since each model on Replicate runs in its own docker container, models of the same task can have different parameters (and, even more problematic, different input/output formats). In huggingface_hub, we added an extra_body argument to allow users to pass any provider- or model-specific parameters.
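
As a rough illustration of the pass-through idea behind extra_body, the sketch below merges caller-supplied, model-specific fields into the Replicate input alongside the prompt. The helper name and type alias are hypothetical and are not part of @huggingface/inference or huggingface_hub; any extra field names a caller passes are assumptions to be checked against the model's own schema.

```typescript
// Hypothetical helper for illustration; not an existing library API.
type ReplicateInput = Record<string, unknown>;

function buildInputWithExtras(text: string, extra: ReplicateInput = {}): { input: ReplicateInput } {
	// Spread the extras first so the prompt always wins if a caller also passes a "text" key.
	return { input: { ...extra, text } };
}
```

Called as buildInputWithExtras("Hello", { speed: 1.2 }), this yields an input object holding both the prompt and the extra field, where speed is only a placeholder name. Forwarding unvalidated fields keeps the client generic, at the cost of pushing schema errors to the provider at request time.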

Co-authored-by: Simon Brandeis <[email protected]>
Member

@julien-c julien-c left a comment

i'll merge this now, to be able to test it on the Hub 🔥

@julien-c julien-c merged commit c6a1c7e into main Feb 4, 2025
3 of 5 checks passed
@julien-c julien-c deleted the vb/replicate-kokoro branch February 4, 2025 14:57
@julien-c
Member

julien-c commented Feb 4, 2025

oops, i did not add the VCR tape => #1171

julien-c added a commit that referenced this pull request Feb 4, 2025
julien-c added a commit that referenced this pull request Feb 4, 2025
5 participants