feat: cache btree sub-index pages #3309

Merged: 2 commits, Dec 27, 2024

westonpace (Contributor) commented Dec 27, 2024

The btree currently only caches the lookup. This means a search always needs at least one IOP per 4Ki rows that match the filter, in order to load the sub-index (flat) data.

This PR adds a cache for the data pages as well.
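
To make the effect concrete, here is a minimal sketch of a page cache sitting in front of a page loader. It is not the code in this PR: `PageCache`, `Page`, and `read_page` are hypothetical names, the cache is an unbounded std `HashMap` rather than a moka cache, and the "IOP" is just a counter.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for one sub-index (flat) page of up to 4Ki rows.
type Page = Arc<Vec<u64>>;

/// Unbounded, single-threaded page cache keyed by page number.
/// Illustration only: no eviction, no size limit, no concurrency.
struct PageCache {
    pages: HashMap<u32, Page>,
    /// Number of times the loader ran, i.e. simulated IOPs.
    iops: usize,
}

impl PageCache {
    fn new() -> Self {
        Self { pages: HashMap::new(), iops: 0 }
    }

    /// Return the cached page, or run `load` once and remember the result.
    fn get_or_load(&mut self, page_number: u32, load: impl FnOnce(u32) -> Vec<u64>) -> Page {
        if let Some(page) = self.pages.get(&page_number) {
            return Arc::clone(page);
        }
        // Cache miss: this is the only path that costs an IOP.
        self.iops += 1;
        let page = Arc::new(load(page_number));
        self.pages.insert(page_number, Arc::clone(&page));
        page
    }
}

/// Simulated read of one 4Ki-row page from storage (hypothetical helper).
fn read_page(page_number: u32) -> Vec<u64> {
    let start = u64::from(page_number) * 4096;
    (start..start + 4096).collect()
}
```

With this shape, calling `get_or_load(3, read_page)` twice runs the loader once, and the second call returns the same `Arc`.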

The cache defaults to 512MiB and is technically configurable via LANCE_BTREE_CACHE_SIZE, but this is just a stop-gap. Ideally we would have a single size limit for all scalar indices (or, even better, a single size limit for ALL indices). That is hard to do today for two reasons. First, I don't think moka caches can share capacity with one another. Second, using a single moka cache everywhere we need one is awkward because moka caches are typed. In some places we use Arc<dyn Any>, which might be a way around the value type, but we would still need to deal with the fact that the different caches are keyed by u32, String, and object_store::Path.
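
As a rough illustration of the stop-gap setting, the sketch below resolves a cache size from `LANCE_BTREE_CACHE_SIZE` with a 512MiB default. The helper names are hypothetical, and two details are assumptions rather than facts from this PR: that the variable holds a plain byte count, and that an unset or unparsable value falls back to the default.

```rust
use std::env;

/// Default capacity from the PR description: 512 MiB.
const DEFAULT_BTREE_CACHE_SIZE: u64 = 512 * 1024 * 1024;

/// Parse an optional raw setting.
/// Assumption: the value is a byte count; unset or invalid input uses the default.
fn parse_cache_size(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_BTREE_CACHE_SIZE)
}

/// Read the setting from the environment (hypothetical helper, not the PR's code).
fn btree_cache_size() -> u64 {
    parse_cache_size(env::var("LANCE_BTREE_CACHE_SIZE").ok().as_deref())
}
```

Keeping the parsing separate from the environment read makes the fallback behaviour easy to test without touching process state.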

We can resolve this technical debt in #3136
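
One possible shape for a single shared cache, sketched under assumptions and not necessarily what #3136 will adopt: wrap the three key kinds in one enum and store values type-erased as `Arc<dyn Any + Send + Sync>`. `CacheKey` and `SharedCache` are hypothetical names, a std `HashMap` stands in for moka, and the path key is a plain `String` instead of `object_store::Path` to keep the example std-only. The sketch does not address size accounting or eviction.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::Arc;

/// One key type covering the three key kinds mentioned above.
/// Assumption: `Path` holds a String here; the real key would be object_store::Path.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum CacheKey {
    Id(u32),
    Name(String),
    Path(String),
}

/// Type-erased cache: every value is stored as `Arc<dyn Any + Send + Sync>`.
#[derive(Default)]
struct SharedCache {
    entries: HashMap<CacheKey, Arc<dyn Any + Send + Sync>>,
}

impl SharedCache {
    /// Store a value of any `'static` thread-safe type under `key`.
    fn insert<T: Any + Send + Sync>(&mut self, key: CacheKey, value: T) {
        self.entries.insert(key, Arc::new(value));
    }

    /// Fetch the entry and downcast it; returns None if the key is absent
    /// or the stored value is not a `T`.
    fn get<T: Any + Send + Sync>(&self, key: &CacheKey) -> Option<Arc<T>> {
        let entry = Arc::clone(self.entries.get(key)?);
        entry.downcast::<T>().ok()
    }
}
```

The cost of this design is that a type mismatch only shows up at runtime as a `None`, so callers need to agree on which value type lives under which key.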

github-actions bot added the enhancement (New feature or request) and python labels Dec 27, 2024
westonpace (Contributor, Author) commented

See https://github.com/lancedb/lance/actions/runs/12517440943 for a test of the newly introduced CI benchmark

codecov-commenter commented Dec 27, 2024

Codecov Report

Attention: Patch coverage is 82.22222% with 8 lines in your changes missing coverage. Please review.

Project coverage is 79.03%. Comparing base (3a47444) to head (be88580).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-index/src/scalar/btree.rs | 82.22% | 4 Missing and 4 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3309      +/-   ##
==========================================
+ Coverage   79.02%   79.03%   +0.01%     
==========================================
  Files         246      246              
  Lines       87589    87628      +39     
  Branches    87589    87628      +39     
==========================================
+ Hits        69217    69259      +42     
+ Misses      15506    15502       -4     
- Partials     2866     2867       +1     

| Flag | Coverage Δ |
|---|---|
| unittests | 79.03% <82.22%> (+0.01%) ⬆️ |


@westonpace westonpace merged commit 38a0a92 into lancedb:main Dec 27, 2024
27 checks passed