IVF-PQ: low-precision coarse search #715

Open
wants to merge 2 commits into
base: branch-25.04
from

Conversation

achirkin
Contributor

Enable low-precision (half / int8) element types for the cuBLAS GEMM performed during coarse search (selecting the clusters to probe). This lets cuBLAS use tensor cores and thus speeds up the coarse search.
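To illustrate the idea, here is a minimal CPU-only sketch of a coarse search on int8 inputs. It is not the code from this PR: the helper names `quantize_int8` and `select_clusters_int8`, the symmetric per-tensor scale, and the inner-product metric are all assumptions made for the example. The inner loop only mimics the arithmetic pattern of an int8 GEMM (int8 inputs, int32 accumulation); on the GPU that part would be a single cuBLAS call.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: symmetric per-tensor quantization of floats to int8.
// Assumes scale > 0; quantized values are clamped to [-127, 127].
std::vector<std::int8_t> quantize_int8(const std::vector<float>& x, float scale)
{
  assert(scale > 0.0f);
  std::vector<std::int8_t> q(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const float v = std::round(x[i] / scale);
    q[i]          = static_cast<std::int8_t>(std::clamp(v, -127.0f, 127.0f));
  }
  return q;
}

// Hypothetical coarse search for a single query (inner-product metric assumed):
// score every cluster center with int8 inputs and int32 accumulation, then
// return the indices of the n_probes highest-scoring clusters, best first.
// `centers` is row-major with shape [n_clusters x dim].
std::vector<std::size_t> select_clusters_int8(const std::vector<std::int8_t>& query,
                                              const std::vector<std::int8_t>& centers,
                                              std::size_t dim,
                                              std::size_t n_probes)
{
  assert(dim > 0 && query.size() == dim && centers.size() % dim == 0);
  const std::size_t n_clusters = centers.size() / dim;
  n_probes                     = std::min(n_probes, n_clusters);

  std::vector<std::int32_t> scores(n_clusters, 0);
  for (std::size_t c = 0; c < n_clusters; ++c) {
    for (std::size_t d = 0; d < dim; ++d) {
      scores[c] += static_cast<std::int32_t>(query[d]) *
                   static_cast<std::int32_t>(centers[c * dim + d]);
    }
  }

  // Keep only the n_probes best clusters; the rest need not be fully sorted.
  std::vector<std::size_t> order(n_clusters);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::partial_sort(order.begin(), order.begin() + n_probes, order.end(),
                    [&](std::size_t a, std::size_t b) { return scores[a] > scores[b]; });
  order.resize(n_probes);
  return order;
}
```

Quantization can change the ranking of clusters whose scores are close, so a sketch like this trades some coarse-search accuracy for speed; how much depends on the data and on the chosen scale.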

Also expose the kMaxQueries compile-time constant as a runtime search parameter: this makes it possible to improve GPU utilization for extremely large batch sizes, such as when using IVF-PQ to construct a nearest-neighbor graph for the whole dataset.
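As a rough sketch of what a runtime batch limit means in practice, the snippet below splits a large query set into consecutive chunks. It assumes the parameter caps how many queries are processed per internal batch; the function name `split_query_batches` and the parameter name `max_queries` are hypothetical and do not come from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Split n_queries into consecutive (offset, size) chunks of at most
// max_queries each. `max_queries` is a hypothetical runtime parameter standing
// in for the former compile-time kMaxQueries constant.
std::vector<std::pair<std::size_t, std::size_t>> split_query_batches(std::size_t n_queries,
                                                                     std::size_t max_queries)
{
  assert(max_queries > 0);
  std::vector<std::pair<std::size_t, std::size_t>> chunks;
  for (std::size_t offset = 0; offset < n_queries; offset += max_queries) {
    // The last chunk may be smaller than max_queries.
    chunks.emplace_back(offset, std::min(max_queries, n_queries - offset));
  }
  return chunks;
}
```

With a runtime value, a caller building a graph over the whole dataset could pick a larger limit than a caller serving small online batches, without recompiling.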

@achirkin added the "feature request" and "non-breaking" labels on Feb 21, 2025
@achirkin self-assigned this on Feb 21, 2025
@achirkin requested review from a team as code owners on February 21, 2025 08:15
Labels
CMake, cpp, feature request, non-breaking
Projects
Status: In Progress
1 participant