Support for jina-embeddings-v3 #2166

Open
arianomidi opened this issue Jan 21, 2025 · 0 comments
Feature request

I would love to see support for jina-embeddings-v3 in optimum.

Right now you can't use this model with ORTModelForFeatureExtraction, since it scrubs the required task_id field from the inputs during the forward pass, resulting in a KeyError here.

Additionally, the model is incompatible with BetterTransformer, throwing the following error:

BetterTransformer is not available for model: <class 'transformers_modules.jinaai.xlm-roberta-flash-implementation.2b6bc3f30750b3a9648fe9b63448c09920efe9be.modeling_lora.XLMRobertaLoRA'> The transformation of the model XLMRobertaLoRA to BetterTransformer failed while it should not. Please fill a bug report or open a PR to support this model at https://github.com/huggingface/optimum/. Continue without bettertransformer modeling code.

Adding support for jina-embeddings-v3 in either of the ways above would be highly valuable for me and many others trying to serve this model efficiently. Your efforts would be greatly appreciated!

Motivation

Currently, jina-embeddings-v3 is one of the best multilingual embedding models under 1B parameters, with over 1M downloads on Hugging Face last month. However, its unique architecture (XLMRobertaLoRA) makes it incompatible with optimum, which creates significant challenges for efficient serving.

Libraries like infinity, which implement async batching for optimized inference, rely on optimum. This limitation makes serving jina-embeddings-v3 both difficult and slow, despite its exceptional performance.

Your contribution

I would be more than willing to submit a PR; I would just want some direction as to where to add the changes. I think passing a task_id param through the forward pass of the ORTModel would be an easy patch, but I defer to y'all!
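To illustrate the failure mode and the proposed fix, here is a minimal self-contained sketch. All names here are hypothetical stand-ins, not optimum's actual code: the "session" plays the role of an ONNX Runtime session that declares task_id as a required input, and the two forward functions contrast the current behavior (hard-coded input list, which drops task_id) with the proposed patch (forward any kwarg the session actually declares).

```python
# Hypothetical illustration of the issue above -- not optimum's real code.
# An ONNX model exported from jina-embeddings-v3 would declare task_id as
# an input, so a wrapper that only forwards a fixed set of inputs breaks it.

EXPECTED_INPUTS = ["input_ids", "attention_mask", "task_id"]


def run_session(onnx_inputs):
    # Stand-in for an ONNX Runtime session: every declared input must be
    # present, otherwise we fail with a KeyError (as in the reported bug).
    missing = [name for name in EXPECTED_INPUTS if name not in onnx_inputs]
    if missing:
        raise KeyError(missing[0])
    return {"last_hidden_state": "<embeddings>"}


def forward_current(**model_inputs):
    # Current behaviour: only a hard-coded set of inputs is passed through,
    # so task_id is scrubbed before the session runs.
    onnx_inputs = {k: model_inputs[k] for k in ("input_ids", "attention_mask")}
    return run_session(onnx_inputs)


def forward_patched(**model_inputs):
    # Proposed patch: forward every kwarg the session actually declares,
    # so model-specific extras like task_id survive the forward pass.
    onnx_inputs = {k: v for k, v in model_inputs.items() if k in EXPECTED_INPUTS}
    return run_session(onnx_inputs)
```

With this sketch, `forward_current(input_ids=[1], attention_mask=[1], task_id=4)` raises `KeyError: 'task_id'`, while `forward_patched(...)` with the same arguments succeeds, which is the behavior change this issue is asking for.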
