feat: cache btree sub-index pages #3309
Merged
The btree currently only caches the lookup. This means every search needs at least one IOP per 4Ki rows that match the filter in order to load the sub-index (flat) data.
This PR adds a cache for the data pages as well.
The cache defaults to 512MiB and is technically configurable via `LANCE_BTREE_CACHE_SIZE`, but this is just a stop-gap. We should ideally have a single size limit for all scalar indices (or, even better, a single size limit for ALL indices). However, this is a bit tricky. I don't think moka caches can share capacity across several caches, and using a single moka cache for all places we need a cache is tricky because moka caches are typed (in some places we use `Arc<dyn Any>`, which might be a possible solution, though we still need to deal with the fact that we key the different caches by `u32`, `String`, and `object_store::Path`). We can resolve this technical debt in #3136.
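
For discussion, here is a rough sketch of what the single-cache idea could look like. It is not the code in this PR. It assumes that `LANCE_BTREE_CACHE_SIZE` holds a plain byte count (the unit is not stated above), and `CacheKey`, `SharedCache`, `cache_size_from`, and `btree_cache_size` are hypothetical names. It uses a plain `HashMap` from the standard library instead of moka, so there is no eviction or capacity enforcement, and `object_store::Path` is stood in for by a `String` so the example has no external dependencies.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Default capacity named in the PR description: 512 MiB.
const DEFAULT_BTREE_CACHE_SIZE: u64 = 512 * 1024 * 1024;

/// Turn an optional raw setting into a capacity, falling back to the default
/// when the value is missing or does not parse.
/// ASSUMPTION: the value is a plain byte count.
fn cache_size_from(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_BTREE_CACHE_SIZE)
}

/// Read the capacity from the environment variable named in the PR.
fn btree_cache_size() -> u64 {
    cache_size_from(std::env::var("LANCE_BTREE_CACHE_SIZE").ok().as_deref())
}

/// Hypothetical unified key covering the three key types mentioned above.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum CacheKey {
    PageNumber(u32),
    Name(String),
    /// Stand-in for `object_store::Path`, kept as a `String` here.
    Path(String),
}

/// One untyped cache shared by several callers. Values are type-erased and
/// recovered with a checked downcast on read.
#[derive(Default)]
struct SharedCache {
    entries: Mutex<HashMap<CacheKey, Arc<dyn Any + Send + Sync>>>,
}

impl SharedCache {
    fn insert<T: Any + Send + Sync>(&self, key: CacheKey, value: T) {
        self.entries.lock().unwrap().insert(key, Arc::new(value));
    }

    /// Returns `None` if the key is absent or the stored value is not a `T`.
    fn get<T: Any + Send + Sync>(&self, key: &CacheKey) -> Option<Arc<T>> {
        let entry = self.entries.lock().unwrap().get(key).cloned()?;
        entry.downcast::<T>().ok()
    }
}
```

The enum key sidesteps the mismatch between `u32`, `String`, and path keys, at the cost of a runtime type check on every read. A real version would still need a weigher and eviction to enforce one shared size limit.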