(WIP) Dynamic inference provider mapping #1173

Merged 25 commits into main from dynamic-inference-provider-mapping on Feb 6, 2025

Conversation

Wauplin (Contributor) commented Feb 4, 2025

"it should work" 😄

Still a draft while https://github.com/huggingface-internal/moon-landing/pull/12398 (internal) is being merged/deployed.

The goal is to use the dynamic mapping and fall back to hardcoded model IDs if necessary (for backward compatibility).

I haven't tested anything yet, and I left some TODOs to address:

  • what to do with status: "live" | "staging"? => raise a warning
  • we need to cache the modelInfo call (only do it once at runtime)
  • how to handle a model that supports both text-generation and conversational
  • how to deal with "hf-inference" if there are no taskHints? (for now, I kept it as before)
  • tests are flaky (requires a server-side update)

EDIT: made an update to preserve the previous behavior with the hardcoded mapping. The dynamic mapping from the Hub takes precedence.
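
As a rough illustration of that intended behavior, here is a minimal sketch. All names below (resolveProviderModelId, fetchHubMapping, the cache, the endpoint shape) are hypothetical assumptions for illustration, not the actual implementation in this PR:

// Hypothetical sketch: resolve the provider-specific model id by preferring the
// dynamic mapping exposed by the Hub and falling back to the hardcoded maps.
type InferenceProvider = "fal-ai" | "replicate" | "sambanova" | "together" | "hf-inference";

interface ProviderMappingEntry {
  providerId: string;
  status: "live" | "staging";
  task: string;
}

// Cache the per-model mapping so the Hub is only queried once at runtime.
const mappingCache = new Map<string, Partial<Record<InferenceProvider, ProviderMappingEntry>>>();

async function fetchHubMapping(modelId: string): Promise<Partial<Record<InferenceProvider, ProviderMappingEntry>>> {
  const cached = mappingCache.get(modelId);
  if (cached) {
    return cached;
  }
  // Illustrative request; the real code would go through the model info API.
  const res = await fetch(`https://huggingface.co/api/models/${modelId}?expand[]=inferenceProviderMapping`);
  const info = await res.json();
  const mapping = info.inferenceProviderMapping ?? {};
  mappingCache.set(modelId, mapping);
  return mapping;
}

async function resolveProviderModelId(
  modelId: string,
  provider: InferenceProvider,
  hardcodedMappings: Partial<Record<InferenceProvider, Record<string, string>>>
): Promise<string> {
  const hubMapping = await fetchHubMapping(modelId);
  const entry = hubMapping[provider];
  if (entry) {
    if (entry.status === "staging") {
      console.warn(
        `Model ${modelId} is in staging mode for provider ${provider}. Meant for test purposes only.`
      );
    }
    // Dynamic mapping from the Hub takes precedence.
    return entry.providerId;
  }
  // Fall back to the hardcoded mapping for backward compatibility.
  const fallback = hardcodedMappings[provider]?.[modelId];
  if (fallback) {
    return fallback;
  }
  throw new Error(`Model ${modelId} is not supported for provider ${provider}.`);
}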


Review thread on the Fal AI provider mapping (diff excerpt):

type FalAiId = string;

export const FAL_AI_SUPPORTED_MODEL_IDS: ProviderMapping<FalAiId> = {
julien-c (Member):

Same comment as in https://github.com/huggingface/huggingface_hub/pull/2836/files#r1942755169: I would for now keep the mappings (but make them empty) and still support them in code.

That way we have slightly fewer breaking changes too, potentially.
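
For illustration, "keep the mappings but make them empty" could look something like the sketch below; the ProviderMapping shape shown is a guess for the sake of the example, not the library's actual type:

// Kept exported so existing imports keep working, but shipped empty:
// the dynamic mapping from the Hub becomes the source of truth.
type ModelId = string;
type FalAiId = string;
type ProviderMapping<ProviderId> = Partial<Record<string, Partial<Record<ModelId, ProviderId>>>>; // illustrative shape

export const FAL_AI_SUPPORTED_MODEL_IDS: ProviderMapping<FalAiId> = {};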

Wauplin (Contributor, Author) replied:

@julien-c I reverted to keep both the hard-coded logic and the new dynamic mapping. Mapping from the Hub takes precedence over the hardcoded one.

SBrandeis (Contributor) commented:

Depends on https://github.com/huggingface-internal/moon-landing/pull/12453 (internal)

Review thread on the staging-status warning (diff excerpt):

`Model ${params.model} is in staging mode for provider ${params.provider}. Meant for test purposes only.`
);
}
// TODO: how is it handled server-side if model has multiple tasks (e.g. `text-generation` + `conversational`)?
julien-c (Member):

Suggested change (remove this line):
// TODO: how is it handled server-side if model has multiple tasks (e.g. `text-generation` + `conversational`)?

I think this is ok @Wauplin

julien-c (Member) left a comment:

IMO this should be ready to review and to merge.

Let's try to merge quickly, because otherwise PRs will be opened to add stuff to the mappings.

SBrandeis self-assigned this on Feb 6, 2025
julien-c (Member) commented Feb 6, 2025

BG @SBrandeis


SBrandeis marked this pull request as ready for review on February 6, 2025 at 11:51
SBrandeis merged commit 38d13dd into main on Feb 6, 2025
5 checks passed
SBrandeis deleted the dynamic-inference-provider-mapping branch on February 6, 2025 at 11:52