Serve UAE embed model in OpenAI compatible server? #1045
Replies: 3 comments
-
OK, so... what are y'all using for embeddings? Do you just pick a random 7B? BERT models top the MTEB chart.
-
I found ggerganov/llama.cpp#2872, which linked me to https://github.com/xyzhang626/embeddings.cpp. Maybe I'll pitch embeddings.cpp bindings in this project, at least until llama.cpp gets official BERT support per that first thread.
-
I was excited to see we can now serve multiple models from a single instance of the OpenAI-compatible endpoint! I'd like to serve a dedicated embedding model so my embeddings stay consistent while I swap out arbitrary completion models.
How can I serve this model? I tried to convert it to GGUF but got an error (I'll share it later). Maybe this is a llama.cpp question...
https://huggingface.co/WhereIsAI/UAE-Large-V1
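If the server does come up with an embedding model loaded, the client side is just the standard OpenAI embeddings API. Here is a minimal sketch of what that would look like; the base URL, port, and model name are assumptions to substitute for whatever the local server actually exposes:

```python
# Sketch: querying an OpenAI-compatible /v1/embeddings endpoint.
# base_url and model are placeholders, not confirmed values for any
# particular server -- adjust to your deployment.
import json
import math
import urllib.request


def embed_request_body(texts, model):
    """Build the JSON body the OpenAI embeddings API expects."""
    return {"model": model, "input": texts}


def fetch_embeddings(texts, model, base_url="http://localhost:8000/v1"):
    """POST to /embeddings on an OpenAI-compatible server (hypothetical URL)."""
    req = urllib.request.Request(
        f"{base_url}/embeddings",
        data=json.dumps(embed_request_body(texts, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # Response format: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in data["data"]]


def cosine(a, b):
    """Cosine similarity -- handy for checking that stored embeddings
    remain comparable after swapping the completion model."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```

The point of keeping a dedicated embedding model is exactly that `cosine` stays meaningful across sessions: embeddings from different models live in different vector spaces and are not comparable.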