Simple LRU cache for reading blocks #224

Merged: 1 commit into main, Sep 24, 2023

Conversation

sourcefrog
Owner

Helps a lot with S3 performance


linear bot commented Sep 24, 2023

CON-20 Cache decompressed block content

#179

I think the slow performance is caused by repeatedly reading, decompressing, and hashing the same block files.

A few options, in ascending complexity:

  1. Hold decompressed blocks in RAM, with some LRU expiry.
  2. Keep the uncompressed content on disk, so that we can read just the desired section more cheaply and worry less about using too much RAM. (It might still use more temp disk than is available, so entries might still need to expire.)
  3. A tiered cache of both RAM and disk.
  4. Most complex but arguably optimal: if we read the whole index in advance, applying any filters, then we'd know exactly which blocks to read in what order, and how long they need to be kept in memory. We could even restore in an order chosen to optimize reuse. This does require potentially reading the whole index up front, or having a more complex sliding window through the index.
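
A rough sketch of option 1, to make the idea concrete. This is not the code merged in this PR: it uses only the standard library, and `BlockCache`, `BlockHash`, and `get_or_load` are hypothetical names. The `load` closure stands in for the expensive read, decompress, and hash step.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical block identifier; a real implementation would use its own hash type.
type BlockHash = String;

/// A small LRU cache of decompressed block content, bounded by entry count.
pub struct BlockCache {
    capacity: usize,
    blocks: HashMap<BlockHash, Arc<[u8]>>,
    /// Keys ordered from least recently used (front) to most recently used (back).
    order: VecDeque<BlockHash>,
    pub hits: usize,
    pub misses: usize,
}

impl BlockCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be nonzero");
        BlockCache {
            capacity,
            blocks: HashMap::new(),
            order: VecDeque::new(),
            hits: 0,
            misses: 0,
        }
    }

    /// Move `hash` to the most-recently-used end. O(n), which is fine for a sketch.
    fn touch(&mut self, hash: &str) {
        if let Some(pos) = self.order.iter().position(|h| h.as_str() == hash) {
            self.order.remove(pos);
        }
        self.order.push_back(hash.to_owned());
    }

    /// Return the cached content, or call `load` on a miss and remember the result,
    /// evicting the least recently used entry if the cache is full.
    pub fn get_or_load<F>(&mut self, hash: &str, load: F) -> Arc<[u8]>
    where
        F: FnOnce() -> Vec<u8>,
    {
        if let Some(content) = self.blocks.get(hash).cloned() {
            self.hits += 1;
            self.touch(hash);
            return content;
        }
        self.misses += 1;
        let content: Arc<[u8]> = Arc::from(load());
        if self.blocks.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.blocks.remove(&oldest);
            }
        }
        self.blocks.insert(hash.to_owned(), Arc::clone(&content));
        self.touch(hash);
        content
    }
}
```

Bounding by entry count is the simplest choice; bounding by total bytes would track RAM use more closely if block sizes vary a lot.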

A related issue is to remember which blocks are even present, maybe by listing the block dir before starting backups, or at least remembering which ones we've seen before. We could then avoid repeatedly probing the filesystem. Does it really matter though?
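
For the "remember which blocks are present" idea, something like the following might be enough. It is only a sketch under assumptions: `KnownBlocks` is a hypothetical name, and it assumes a flat directory in which each file name is a block hash, which may not match the real on-disk layout.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::Path;

/// Remembers which block hashes are known to exist, so callers can skip
/// probing the filesystem for blocks that have already been seen.
pub struct KnownBlocks {
    present: HashSet<String>,
}

impl KnownBlocks {
    /// List the block directory once, up front.
    /// Assumption: a flat directory where each file name is a block hash.
    pub fn from_dir(dir: &Path) -> io::Result<Self> {
        let mut present = HashSet::new();
        for entry in fs::read_dir(dir)? {
            let entry = entry?;
            if entry.file_type()?.is_file() {
                present.insert(entry.file_name().to_string_lossy().into_owned());
            }
        }
        Ok(KnownBlocks { present })
    }

    /// True if the block was listed at startup or recorded since.
    pub fn contains(&self, hash: &str) -> bool {
        self.present.contains(hash)
    }

    /// Record a block that was just written or just observed.
    pub fn insert(&mut self, hash: &str) {
        self.present.insert(hash.to_owned());
    }
}
```

On a remote store the single up-front listing replaces one existence check per block, which is where this would pay off if it matters at all.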

We could use https://docs.rs/caches/0.2.4/caches/index.html

This seems to work well with Bytes to share the buffers.
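
To illustrate what sharing the buffers buys us: a cache hit can hand out a view of just the wanted section without copying the decompressed block. The sketch below imitates that with the standard library only; `SharedSlice` is a hypothetical stand-in, not the `Bytes` type itself, which as I understand it offers similar cheap clones and sub-slices.

```rust
use std::ops::Range;
use std::sync::Arc;

/// A view into part of a shared, immutable buffer. Cloning or slicing
/// only bumps a reference count; the bytes themselves are never copied.
#[derive(Clone, Debug)]
pub struct SharedSlice {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedSlice {
    /// Take ownership of a decompressed block and view all of it.
    pub fn new(data: Vec<u8>) -> Self {
        let len = data.len();
        SharedSlice { buf: Arc::from(data), range: 0..len }
    }

    /// Return a sub-view, with `range` relative to this view.
    /// Returns `None` if the range is inverted or out of bounds.
    pub fn slice(&self, range: Range<usize>) -> Option<SharedSlice> {
        if range.start > range.end || range.end > self.range.len() {
            return None;
        }
        let start = self.range.start + range.start;
        let end = self.range.start + range.end;
        Some(SharedSlice { buf: Arc::clone(&self.buf), range: start..end })
    }

    /// Borrow the bytes covered by this view.
    pub fn as_bytes(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }

    /// True if both views point into the same underlying allocation.
    pub fn shares_buffer_with(&self, other: &SharedSlice) -> bool {
        Arc::ptr_eq(&self.buf, &other.buf)
    }
}
```

One trade-off to keep in mind: a small view keeps the whole block alive, so memory is only released once every view of that block has been dropped.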

https://docs.rs/caches/latest/caches/lru/struct.AdaptiveCache.html#method.new

@sourcefrog sourcefrog merged commit fe28315 into main Sep 24, 2023
@sourcefrog sourcefrog deleted the con-20-cache-decompressed-block-content branch September 24, 2023 03:09