feat: support sparse text embeddings #273

Open
joshua-mo-143 opened this issue Feb 5, 2025 · 0 comments
joshua-mo-143 commented Feb 5, 2025

  • I have looked for existing issues (including closed) about this

Feature Request

Following on from #268, we should try to support sparse text model embeddings.

This would enable users to carry out hybrid search for RAG, among other use cases.

Motivation

Sparse embeddings perform well for certain NLP tasks and for information retrieval, especially keyword-style lexical matching.

They are also compact and efficient, since only the non-zero dimensions need to be stored, which keeps storage costs low.

Proposal

Unfortunately, Fastembed does not expose sparse text models out of the box, despite apparently having them in the library - you have to create your own.

However, this issue points to the bm42-rs library, which does support sparse embeddings, so we could implement it that way if we are OK with adding another dependency. (We will likely need some exploratory work to figure this part out.)
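
For context, a sparse embedding is typically a set of (vocabulary index, weight) pairs rather than a fixed-length dense vector, and retrieval scores documents with a sparse dot product. Below is a minimal sketch of what such a type could look like on our side; the names are purely illustrative and not the actual Fastembed or bm42-rs API:

```rust
/// Hypothetical sparse embedding type: only the non-zero dimensions are stored.
#[derive(Debug, Clone)]
pub struct SparseEmbedding {
    /// Vocabulary indices of the non-zero dimensions (sorted ascending).
    pub indices: Vec<u32>,
    /// Weight for each corresponding index.
    pub values: Vec<f32>,
}

impl SparseEmbedding {
    /// Sparse dot product, the usual relevance score between a query and a
    /// document embedding. Assumes both `indices` vectors are sorted.
    pub fn dot(&self, other: &SparseEmbedding) -> f32 {
        let (mut i, mut j, mut score) = (0usize, 0usize, 0.0f32);
        while i < self.indices.len() && j < other.indices.len() {
            match self.indices[i].cmp(&other.indices[j]) {
                std::cmp::Ordering::Less => i += 1,
                std::cmp::Ordering::Greater => j += 1,
                std::cmp::Ordering::Equal => {
                    score += self.values[i] * other.values[j];
                    i += 1;
                    j += 1;
                }
            }
        }
        score
    }
}
```

For hybrid search, this sparse score would then be combined with the dense similarity score (for example via reciprocal rank fusion or a weighted sum) before returning results to the RAG pipeline.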

Alternatives

Not sure?

joshua-mo-143 changed the title from "feat: support sparse text model embeddings" to "feat: support sparse text embeddings" on Feb 5, 2025