Rewrite memory-charging feature's option API (#9926)
Summary:
**Context:**
Previous PRs facebook/rocksdb#9748, facebook/rocksdb#9073, and facebook/rocksdb#8428 added a separate flag for each charged memory area. Such an API design does not scale as we charge more and more memory areas. We also foresee an opportunity to consolidate this feature with other cache-usage-related features, such as `cache_index_and_filter_blocks`, using `CacheEntryRole`.

Therefore we decided to consolidate all these flags into `CacheUsageOptions cache_usage_options`; this PR serves as the first step by consolidating the memory-charging-related flags.
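
For context, a minimal sketch of the consolidated option API, mirroring the usages in the diff below (the cache capacity and the choice of `kFilterConstruction` are illustrative, not prescriptive):

```cpp
#include <memory>

#include "rocksdb/cache.h"
#include "rocksdb/table.h"

using namespace ROCKSDB_NAMESPACE;

int main() {
  BlockBasedTableOptions table_options;
  table_options.block_cache = NewLRUCache(1 << 30 /* capacity */);
  // Replace per-feature booleans with a per-role override: charge (new)
  // bloom/ribbon filter construction memory against the block cache.
  table_options.cache_usage_options.options_overrides.insert(
      {CacheEntryRole::kFilterConstruction,
       {/*.charged = */ CacheEntryRoleOptions::Decision::kEnabled}});
  return 0;
}
```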

**Summary:**
- Replaced old API references with the new ones, including making `kCompressionDictionaryBuildingBuffer` opt-out (see the sketch after this list) and adding a unit test for that
- Added missing db_bench/db_stress coverage for some memory-charging features
- Renamed related test suites to indicate they fall under the common theme of memory charging
- Refactored a commonly used mocked cache component in memory-charging-related tests to reduce code duplication
- Replaced the phrases "memory tracking" / "cache reservation" (other than CacheReservationManager-related ones) with "memory charging" as the standard description of this feature
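
Since `kCompressionDictionaryBuildingBuffer` is now opt-out (charged by default), turning its charge off takes an explicit override; a sketch continuing the snippet above:

```cpp
// kCompressionDictionaryBuildingBuffer is charged by default after this PR,
// so an explicit kDisabled override is required to opt out.
table_options.cache_usage_options.options_overrides.insert(
    {CacheEntryRole::kCompressionDictionaryBuildingBuffer,
     {/*.charged = */ CacheEntryRoleOptions::Decision::kDisabled}});
```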

Pull Request resolved: facebook/rocksdb#9926

Test Plan:
- New unit test for the opt-out `kCompressionDictionaryBuildingBuffer`: `TEST_F(ChargeCompressionDictionaryBuildingBufferTest, Basic)`
- New unit test for option validation/sanitization: `TEST_F(CacheUsageOptionsOverridesTest, SanitizeAndValidateOptions)`
- CI
- db_bench (in case querying the new options introduces a regression) **+0.5% micros/op**: `TEST_TMPDIR=/dev/shm/testdb ./db_bench -benchmarks=fillseq -db=$TEST_TMPDIR  -charge_compression_dictionary_building_buffer=1(remove this for comparison)  -compression_max_dict_bytes=10000 -disable_auto_compactions=1 -write_buffer_size=100000 -num=4000000 | egrep 'fillseq'`

#-run | (pre-PR) avg micros/op | (pre-PR) std micros/op | (post-PR) avg micros/op | (post-PR) std micros/op | change (%)
-- | -- | -- | -- | -- | --
10 | 3.9711 | 0.264408 | 3.9914 | 0.254563 | 0.5111933721
20 | 3.83905 | 0.0664488 | 3.8251 | 0.0695456 | **-0.3633711465**
40 | 3.86625 | 0.136669 | 3.8867 | 0.143765 | **0.5289363078**

- db_stress: `python3 tools/db_crashtest.py blackbox  -charge_compression_dictionary_building_buffer=1 -charge_filter_construction=1 -charge_table_reader=1 -cache_size=1` killed as expected (the blackbox crash test kills the process by design)

Reviewed By: ajkr

Differential Revision: D36054712

Pulled By: hx235

fbshipit-source-id: d406e90f5e0c5ea4dbcb585a484ad9302d4302af
hx235 authored and facebook-github-bot committed May 17, 2022
1 parent f6339de commit 3573558
Showing 26 changed files with 687 additions and 397 deletions.
1 change: 1 addition & 0 deletions HISTORY.md
@@ -18,6 +18,7 @@
* Renamed CompactionFilter::Decision::kRemoveWithSingleDelete to kPurge since the latter sounds more general and hides the implementation details of how compaction iterator handles keys.
* Added ability to specify functions for Prepare and Validate to OptionsTypeInfo. Added methods to OptionTypeInfo to set the functions via an API. These methods are intended for RocksDB plugin developers for configuration management.
* Added a new immutable db option, enforce_single_del_contracts. If set to false (default is true), compaction will NOT fail due to a single delete followed by a delete for the same key. The purpose of this temporary option is to help existing use cases migrate.
* Introduce `BlockBasedTableOptions::cache_usage_options` and use that to replace `BlockBasedTableOptions::reserve_table_builder_memory` and `BlockBasedTableOptions::reserve_table_reader_memory`.

### Bug Fixes
* RocksDB calls FileSystem::Poll API during FilePrefetchBuffer destruction which impacts performance as it waits for read requests' completion, which is not needed anymore. Calling FileSystem::AbortIO to abort those requests instead fixes that performance issue.
212 changes: 79 additions & 133 deletions db/db_bloom_filter_test.cc

Large diffs are not rendered by default.

15 changes: 12 additions & 3 deletions db/db_sst_test.cc
@@ -1448,7 +1448,9 @@ TEST_F(DBSSTTest, OpenDBWithInfiniteMaxOpenFiles) {
}

TEST_F(DBSSTTest, OpenDBWithInfiniteMaxOpenFilesSubjectToMemoryLimit) {
for (bool reserve_table_builder_memory : {true, false}) {
for (CacheEntryRoleOptions::Decision charge_table_reader :
{CacheEntryRoleOptions::Decision::kEnabled,
CacheEntryRoleOptions::Decision::kDisabled}) {
// Open DB with infinite max open files
// - First iteration use 1 thread to open files
// - Second iteration use 5 threads to open files
@@ -1488,7 +1490,9 @@ TEST_F(DBSSTTest, OpenDBWithInfiniteMaxOpenFilesSubjectToMemoryLimit) {
}
Close();

table_options.reserve_table_reader_memory = reserve_table_builder_memory;
table_options.cache_usage_options.options_overrides.insert(
{CacheEntryRole::kBlockBasedTableReader,
{/*.charged = */ charge_table_reader}});
table_options.block_cache =
NewLRUCache(1024 /* capacity */, 0 /* num_shard_bits */,
true /* strict_capacity_limit */);
@@ -1497,8 +1501,13 @@
// Reopening the DB will try to load all existing files, conditionally
// subject to memory limit
Status s = TryReopen(options);
if (table_options.reserve_table_reader_memory) {

if (charge_table_reader == CacheEntryRoleOptions::Decision::kEnabled) {
EXPECT_TRUE(s.IsMemoryLimit());
EXPECT_TRUE(s.ToString().find(
kCacheEntryRoleToCamelString[static_cast<std::uint32_t>(
CacheEntryRole::kBlockBasedTableReader)]) !=
std::string::npos);
EXPECT_TRUE(s.ToString().find("memory limit based on cache capacity") !=
std::string::npos);

58 changes: 58 additions & 0 deletions db/db_test_util.cc
@@ -9,6 +9,7 @@

#include "db/db_test_util.h"

#include "cache/cache_reservation_manager.h"
#include "db/forward_iterator.h"
#include "env/mock_env.h"
#include "port/lang.h"
@@ -1683,4 +1684,61 @@ void VerifySstUniqueIds(const TablePropertiesCollection& props) {
}
}

template <CacheEntryRole R>
TargetCacheChargeTrackingCache<R>::TargetCacheChargeTrackingCache(
std::shared_ptr<Cache> target)
: CacheWrapper(std::move(target)),
cur_cache_charge_(0),
cache_charge_peak_(0),
cache_charge_increment_(0),
last_peak_tracked_(false),
cache_charge_increments_sum_(0) {}

template <CacheEntryRole R>
Status TargetCacheChargeTrackingCache<R>::Insert(
const Slice& key, void* value, size_t charge,
void (*deleter)(const Slice& key, void* value), Handle** handle,
Priority priority) {
Status s = target_->Insert(key, value, charge, deleter, handle, priority);
if (deleter == kNoopDeleter) {
if (last_peak_tracked_) {
cache_charge_peak_ = 0;
cache_charge_increment_ = 0;
last_peak_tracked_ = false;
}
if (s.ok()) {
cur_cache_charge_ += charge;
}
cache_charge_peak_ = std::max(cache_charge_peak_, cur_cache_charge_);
cache_charge_increment_ += charge;
}

return s;
}

template <CacheEntryRole R>
bool TargetCacheChargeTrackingCache<R>::Release(Handle* handle,
bool erase_if_last_ref) {
auto deleter = GetDeleter(handle);
if (deleter == kNoopDeleter) {
if (!last_peak_tracked_) {
cache_charge_peaks_.push_back(cache_charge_peak_);
cache_charge_increments_sum_ += cache_charge_increment_;
last_peak_tracked_ = true;
}
cur_cache_charge_ -= GetCharge(handle);
}
bool is_successful = target_->Release(handle, erase_if_last_ref);
return is_successful;
}

template <CacheEntryRole R>
const Cache::DeleterFn TargetCacheChargeTrackingCache<R>::kNoopDeleter =
CacheReservationManagerImpl<R>::TEST_GetNoopDeleterForRole();

template class TargetCacheChargeTrackingCache<
CacheEntryRole::kFilterConstruction>;
template class TargetCacheChargeTrackingCache<
CacheEntryRole::kBlockBasedTableReader>;

} // namespace ROCKSDB_NAMESPACE
45 changes: 45 additions & 0 deletions db/db_test_util.h
@@ -952,6 +952,51 @@ class CacheWrapper : public Cache {
std::shared_ptr<Cache> target_;
};

/*
 * A cache wrapper that tracks a certain CacheEntryRole's cache charge, its
 * peaks and increments
 *
 *        p0
 *       /  \   p1
 *      /    \  /\
 *     /      \/  \
 *  a /        b   \
 *
 * peaks = {p0, p1}
 * increments = {p0 - a, p1 - b}
 */
template <CacheEntryRole R>
class TargetCacheChargeTrackingCache : public CacheWrapper {
public:
explicit TargetCacheChargeTrackingCache(std::shared_ptr<Cache> target);

using Cache::Insert;
Status Insert(const Slice& key, void* value, size_t charge,
void (*deleter)(const Slice& key, void* value),
Handle** handle = nullptr,
Priority priority = Priority::LOW) override;

using Cache::Release;
bool Release(Handle* handle, bool erase_if_last_ref = false) override;

std::size_t GetCacheCharge() { return cur_cache_charge_; }

std::deque<std::size_t> GetChargedCachePeaks() { return cache_charge_peaks_; }

std::size_t GetChargedCacheIncrementSum() {
return cache_charge_increments_sum_;
}

private:
static const Cache::DeleterFn kNoopDeleter;

std::size_t cur_cache_charge_;
std::size_t cache_charge_peak_;
std::size_t cache_charge_increment_;
bool last_peak_tracked_;
std::deque<std::size_t> cache_charge_peaks_;
std::size_t cache_charge_increments_sum_;
};

class DBTestBase : public testing::Test {
public:
// Sequence of option configurations to try
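For orientation, a hedged sketch of how a test might exercise this wrapper (the DB setup and assertion values are illustrative assumptions, not part of this diff):

```cpp
// Sketch only: wrap a real cache and observe filter-construction charges.
std::shared_ptr<Cache> base = NewLRUCache(1 << 20 /* capacity */);
auto tracking_cache = std::make_shared<
    TargetCacheChargeTrackingCache<CacheEntryRole::kFilterConstruction>>(base);

BlockBasedTableOptions table_options;
table_options.block_cache = tracking_cache;
table_options.cache_usage_options.options_overrides.insert(
    {CacheEntryRole::kFilterConstruction,
     {/*.charged = */ CacheEntryRoleOptions::Decision::kEnabled}});
// ... open a DB with table_options, insert keys, and flush ...

// The recorded peaks and increment sum correspond to the charge cycles
// diagrammed in the class comment above.
ASSERT_GT(tracking_cache->GetChargedCacheIncrementSum(), 0u);
ASSERT_FALSE(tracking_cache->GetChargedCachePeaks().empty());
```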
4 changes: 3 additions & 1 deletion db_stress_tool/db_stress_common.h
@@ -135,7 +135,9 @@ DECLARE_int32(set_in_place_one_in);
DECLARE_int64(cache_size);
DECLARE_int32(cache_numshardbits);
DECLARE_bool(cache_index_and_filter_blocks);
DECLARE_bool(reserve_table_reader_memory);
DECLARE_bool(charge_compression_dictionary_building_buffer);
DECLARE_bool(charge_filter_construction);
DECLARE_bool(charge_table_reader);
DECLARE_int32(top_level_index_pinning);
DECLARE_int32(partition_pinning);
DECLARE_int32(unpartitioned_pinning);
18 changes: 14 additions & 4 deletions db_stress_tool/db_stress_gflags.cc
@@ -304,10 +304,20 @@ DEFINE_int32(cache_numshardbits, 6,
DEFINE_bool(cache_index_and_filter_blocks, false,
"True if indexes/filters should be cached in block cache.");

DEFINE_bool(reserve_table_reader_memory, false,
"A dynamically updating charge to block cache, loosely based on "
"the actual memory usage of table reader, will occur to account "
"the memory, if block cache available.");
DEFINE_bool(charge_compression_dictionary_building_buffer, false,
"Setting for "
"CacheEntryRoleOptions::charged of"
"CacheEntryRole::kCompressionDictionaryBuildingBuffer");

DEFINE_bool(charge_filter_construction, false,
"Setting for "
"CacheEntryRoleOptions::charged of"
"CacheEntryRole::kFilterConstruction");

DEFINE_bool(charge_table_reader, false,
"Setting for "
"CacheEntryRoleOptions::charged of"
"CacheEntryRole::kBlockBasedTableReader");

DEFINE_int32(
top_level_index_pinning,
17 changes: 15 additions & 2 deletions db_stress_tool/db_stress_test_base.cc
@@ -2339,8 +2339,21 @@ void StressTest::Open(SharedState* shared) {
block_based_options.block_cache_compressed = compressed_cache_;
block_based_options.checksum = checksum_type_e;
block_based_options.block_size = FLAGS_block_size;
block_based_options.reserve_table_reader_memory =
FLAGS_reserve_table_reader_memory;
block_based_options.cache_usage_options.options_overrides.insert(
{CacheEntryRole::kCompressionDictionaryBuildingBuffer,
{/*.charged = */ FLAGS_charge_compression_dictionary_building_buffer
? CacheEntryRoleOptions::Decision::kEnabled
: CacheEntryRoleOptions::Decision::kDisabled}});
block_based_options.cache_usage_options.options_overrides.insert(
{CacheEntryRole::kFilterConstruction,
{/*.charged = */ FLAGS_charge_filter_construction
? CacheEntryRoleOptions::Decision::kEnabled
: CacheEntryRoleOptions::Decision::kDisabled}});
block_based_options.cache_usage_options.options_overrides.insert(
{CacheEntryRole::kBlockBasedTableReader,
{/*.charged = */ FLAGS_charge_table_reader
? CacheEntryRoleOptions::Decision::kEnabled
: CacheEntryRoleOptions::Decision::kDisabled}});
block_based_options.format_version =
static_cast<uint32_t>(FLAGS_format_version);
block_based_options.index_block_restart_interval =
10 changes: 5 additions & 5 deletions include/rocksdb/cache.h
@@ -559,15 +559,15 @@ enum class CacheEntryRole {
kIndexBlock,
// Other kinds of block-based table block
kOtherBlock,
// WriteBufferManager reservations to account for memtable usage
// WriteBufferManager's charge to account for its memtable usage
kWriteBuffer,
// BlockBasedTableBuilder reservations to account for
// compression dictionary building buffer's memory usage
// Compression dictionary building buffer's charge to account for
// its memory usage
kCompressionDictionaryBuildingBuffer,
// Filter reservations to account for
// Filter's charge to account for
// (new) bloom and ribbon filter construction's memory usage
kFilterConstruction,
// BlockBasedTableReader reservations to account for
// BlockBasedTableReader's charge to account for
// its memory usage
kBlockBasedTableReader,
// Default bucket, for miscellaneous cache entries. Do not use for
2 changes: 1 addition & 1 deletion include/rocksdb/options.h
@@ -552,7 +552,7 @@ struct DBOptions {
// compaction. For universal-style compaction, you can usually set it to -1.
//
// A high value or -1 for this option can cause high memory usage.
// See BlockBasedTableOptions::reserve_table_reader_memory to constrain
// See BlockBasedTableOptions::cache_usage_options to constrain
// memory usage in case of block based table format.
//
// Default: -1
