feat: Add ONNX & OpenVINO backend support, and torch dtype kwargs in Sentence Transformers Components #8813
base: main
Conversation
Pull Request Test Coverage Report for Build 13208168523 (Coveralls)
Current failing tests are due to #8811. The PR is still WIP, but the tests pass locally.
Okay, after taking a look at how we could possibly do this, I will now undraft this PR and take any suggestions/reviews on what I should fix or modify.
Thanks for the great work!
Just a question: what happens if the user selects a specific backend and the required dependencies are not installed?
Sentence Transformers will raise an exception instructing the user to pip install the right package:
Should we try to catch these errors? I imagine we will throw an exception the same way Sentence Transformers does. I don't think we should fall back to the torch backend, because we could end up loading a model that eats up too much CPU or RAM, which may affect the user's system.
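As a rough illustration of the behaviour discussed above (a sketch only — the exact exception type and message come from Sentence Transformers, and the `backend` parameter is the one proposed in this PR):

```python
from haystack.components.embedders import SentenceTransformersTextEmbedder

# Hypothetical scenario: the openvino backend is selected on a machine
# where optimum-intel[openvino] is not installed.
embedder = SentenceTransformersTextEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    backend="openvino",
)

try:
    # The model is loaded in warm_up(), so this is where the failure would surface.
    embedder.warm_up()
except Exception as err:
    # Sentence Transformers raises an error telling the user which package to
    # pip install; the idea above is to let that error propagate rather than
    # silently falling back to the torch backend.
    print(f"Missing backend dependencies: {err}")
```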
Related Issues
Proposed Changes:
Sentence Transformers-based components now expose a `backend` parameter that allows the user to specify a different backend besides the default of `torch`. Supported backends are `onnx` and `openvino`. Documentation for these backends can be found in the Sentence Transformers documentation.
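A minimal usage sketch based on this description (the component and model names are only examples, and the exact signature may differ in the final PR):

```python
from haystack.components.embedders import SentenceTransformersTextEmbedder

# Select the ONNX backend instead of the default torch backend.
# This assumes the ONNX dependencies (optimum/onnxruntime) are installed.
embedder = SentenceTransformersTextEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    backend="onnx",  # one of "torch" (default), "onnx", "openvino"
)
embedder.warm_up()  # loads the model with the selected backend

result = embedder.run(text="Sentence Transformers components now support multiple backends.")
print(len(result["embedding"]))
```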
How did you test it?
Integration tests were added for each of the supported backends. I am making this PR early to have the CI assist with some of the tests, as I cannot run them locally.
Notes for the reviewer
- The `onnx` and `openvino` backends cannot be tested at the same time due to a limitation with `optimum-intel[openvino]==1.21.0`, which does not support `transformers>=4.47`. The next `optimum-intel` update should add support for `transformers>=4.47`. For now, the `openvino` tests are skipped.
- `onnx-gpu` support
- `SentenceTransformersDiversityRanker`
- `dtype` (see the sketch after this list)
- quantization
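For the torch dtype kwargs mentioned in the PR title, a possible usage sketch (assuming the dtype is forwarded through the component's `model_kwargs`; the exact plumbing may differ in the implementation):

```python
import torch

from haystack.components.embedders import SentenceTransformersDocumentEmbedder

# Sketch: load the underlying Sentence Transformers model in half precision.
# The torch_dtype entry is assumed to be passed through to model loading.
embedder = SentenceTransformersDocumentEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"torch_dtype": torch.float16},
)
embedder.warm_up()
```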
Checklist
- I've used one of the conventional commit types for my PR title: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`, and added `!` in case the PR includes breaking changes.