Feature request
I would love to see support for jina-embeddings-v3 in optimum.
Right now you can't use this model with ORTModelForFeatureExtraction, since it scrubs the required task_id field from the input during the forward pass, resulting in a KeyError here.
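For reference, here is a minimal reproduction sketch of the ORTModelForFeatureExtraction path. The model ID is the real one, but the sample sentence and the exact loading kwargs needed to pick up the repo's ONNX export are assumptions on my side:

```python
# Rough repro sketch: the repo's ONNX export takes an extra task_id input,
# but ORTModel only forwards the inputs it knows about to the InferenceSession,
# so task_id never reaches the graph and the forward pass fails with a KeyError.
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForFeatureExtraction

model_id = "jinaai/jina-embeddings-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ORTModelForFeatureExtraction.from_pretrained(model_id)  # loading kwargs may need adjusting for this repo

inputs = tokenizer("A multilingual test sentence.", return_tensors="pt")
outputs = model(**inputs)  # fails: the required task_id input is stripped before the session runs
```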
Additionally, the model is incompatible with BetterTransformer, throwing the following error:
BetterTransformer is not available for model: <class 'transformers_modules.jinaai.xlm-roberta-flash-implementation.2b6bc3f30750b3a9648fe9b63448c09920efe9be.modeling_lora.XLMRobertaLoRA'> The transformation of the model XLMRobertaLoRA to BetterTransformer failed while it should not. Please fill a bug report or open a PR to support this model at https://github.com/huggingface/optimum/. Continue without bettertransformer modeling code.
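For completeness, the BetterTransformer failure above can be reproduced with something along these lines (trust_remote_code is my assumption, since the repo ships custom XLMRobertaLoRA modeling code):

```python
# Repro sketch for the BetterTransformer path.
from transformers import AutoModel
from optimum.bettertransformer import BetterTransformer

model = AutoModel.from_pretrained("jinaai/jina-embeddings-v3", trust_remote_code=True)
# Raises the "BetterTransformer is not available for model: ... XLMRobertaLoRA" error quoted above.
model = BetterTransformer.transform(model)
```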
Adding support for jina-embeddings-v3 in either of the ways above would be highly valuable for me and many others trying to serve this model efficiently. Your efforts would be greatly appreciated!
Motivation
Currently, jina-embeddings-v3 is one of the best multilingual embedding models under 1B parameters, with over 1M downloads on Hugging Face last month. However, its unique architecture (XLMRobertaLoRA) makes it incompatible with optimum, which creates significant challenges for efficient serving.
Libraries like infinity, which implement async batching for optimized inference, rely on optimum. This limitation makes serving jina-embeddings-v3 both difficult and slow, despite its exceptional performance.
Your contribution
I would be more than willing to submit a PR; I would just like some direction on where to add the changes. I think passing a task_id param through at the forward pass of the ORTModel would be an easy patch, but I defer to y'all!
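To illustrate what I mean without guessing at optimum internals, here is a sketch against raw onnxruntime showing that forwarding task_id as an ordinary session input is the missing piece; the ONNX file path, the expected input shape, and the integer task value are placeholders, not the repo's actual layout:

```python
# Workaround sketch: run the exported graph directly and pass task_id ourselves.
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v3")
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])  # placeholder path

encoded = tokenizer("A multilingual test sentence.", return_tensors="np")
onnx_inputs = {
    "input_ids": encoded["input_ids"].astype(np.int64),
    "attention_mask": encoded["attention_mask"].astype(np.int64),
    # The pass-through ORTModel currently strips; shape and value here are placeholders.
    "task_id": np.array(1, dtype=np.int64),
}
token_embeddings = session.run(None, onnx_inputs)[0]
```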