feat: support sparse text embeddings #273

Open
joshua-mo-143 opened this issue Feb 5, 2025 · 0 comments
joshua-mo-143 commented Feb 5, 2025

  • I have looked for existing issues (including closed) about this

Feature Request

Following on from #268, we should try to support sparse text model embeddings.

This would enable users to carry out hybrid search for RAG, among other use cases.

Motivation

Sparse embeddings perform well for certain NLP tasks and for information retrieval, especially keyword-style lexical matching.

They are also compact and efficient, since only the non-zero dimensions need to be stored, which keeps storage costs low.

Proposal

Unfortunately, Fastembed does not expose sparse text models out of the box, despite apparently having them in the library - you have to create your own.

However, this issue points to the bm42-rs library, which does support sparse embeddings, so we could implement it that way if we are OK with adding another dependency. (We will likely need some exploratory work to figure this part out.)
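
For context, a sparse embedding is typically a set of (vocabulary index, weight) pairs rather than a fixed-length dense vector, and retrieval scores documents with a sparse dot product. Below is a minimal sketch of what such a type could look like on our side; the names are purely illustrative and not the actual Fastembed or bm42-rs API:

```rust
/// Hypothetical sparse embedding type: only the non-zero dimensions are stored.
#[derive(Debug, Clone)]
pub struct SparseEmbedding {
    /// Vocabulary indices of the non-zero dimensions (sorted ascending).
    pub indices: Vec<u32>,
    /// Weight for each corresponding index.
    pub values: Vec<f32>,
}

impl SparseEmbedding {
    /// Sparse dot product, the usual relevance score between a query and a
    /// document embedding. Assumes both `indices` vectors are sorted.
    pub fn dot(&self, other: &SparseEmbedding) -> f32 {
        let (mut i, mut j, mut score) = (0usize, 0usize, 0.0f32);
        while i < self.indices.len() && j < other.indices.len() {
            match self.indices[i].cmp(&other.indices[j]) {
                std::cmp::Ordering::Less => i += 1,
                std::cmp::Ordering::Greater => j += 1,
                std::cmp::Ordering::Equal => {
                    score += self.values[i] * other.values[j];
                    i += 1;
                    j += 1;
                }
            }
        }
        score
    }
}
```

For hybrid search, this sparse score would then be combined with the dense similarity score (for example via reciprocal rank fusion or a weighted sum) before returning results to the RAG pipeline.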

Alternatives

Not sure?

joshua-mo-143 changed the title from "feat: support sparse text model embeddings" to "feat: support sparse text embeddings" on Feb 5, 2025