Track the total number of compaction input iterators #13320
@@ -2963,6 +2966,10 @@ class DBImpl : public DB {
   // stores the number of compactions are currently running
   int num_running_compactions_;

+  // stores the number of input iterators required for currently running
+  // compactions
uint64_t or size_t makes more sense to me since num_running_compaction_input_iterators_ should not be negative. However, I wanted to follow the convention here since everything else is using int.
@@ -2963,6 +2966,10 @@ class DBImpl : public DB {
   // stores the number of compactions are currently running
   int num_running_compactions_;

+  // stores the number of input iterators required for currently running
+  // compactions
+  int num_running_compaction_input_iterators_;
I see we have InstrumentedMutexLock l(&mutex_); protecting num_running_compactions_ as well as num_running_flushes_. I guess that means we do not need std::atomic<int> for num_running_compaction_input_iterators_.
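A minimal sketch of the pattern discussed in this comment: a plain int counter needs no std::atomic as long as every read and write happens under the same mutex. The class and method names below are hypothetical, and std::mutex with std::lock_guard stands in for RocksDB's InstrumentedMutex and InstrumentedMutexLock; this is not the PR's actual code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the relevant part of DBImpl.
class CompactionCountersSketch {
 public:
  // Record that a compaction with the given number of input iterators started.
  void OnCompactionStart(int num_input_iterators) {
    std::lock_guard<std::mutex> l(mutex_);
    ++num_running_compactions_;
    num_running_compaction_input_iterators_ += num_input_iterators;
  }

  // Record that the same compaction finished; the additions and
  // subtractions are expected to cancel out.
  void OnCompactionFinish(int num_input_iterators) {
    std::lock_guard<std::mutex> l(mutex_);
    assert(num_running_compactions_ > 0);
    assert(num_running_compaction_input_iterators_ >= num_input_iterators);
    --num_running_compactions_;
    num_running_compaction_input_iterators_ -= num_input_iterators;
  }

  // Reads the counter under the same mutex that guards the writes.
  int NumRunningCompactionInputIterators() {
    std::lock_guard<std::mutex> l(mutex_);
    return num_running_compaction_input_iterators_;
  }

 private:
  std::mutex mutex_;
  // Plain ints are sufficient because every access holds mutex_.
  int num_running_compactions_ = 0;
  int num_running_compaction_input_iterators_ = 0;
};
```

With this shape, the counter returns to 0 once every started compaction has finished, which is the invariant the Test Plan checks.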
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@archang19 has updated the pull request. You must reimport the pull request before landing.
@@ -957,6 +957,13 @@ class DBImpl : public DB {
     return num_running_compactions_;
   }

+  // Returns the number of input iterators for currently running compactions.
+  // REQUIREMENT: mutex_ must be held when calling this function.
+  int num_running_compaction_input_iterators() {
I noticed num_running_compactions did not follow the example of num_running_flushes, where db->num_running_flushes() gets called inside HandleNumRunningFlushes. HandleNumRunningCompactions was just accessing num_running_compactions_ directly.
@cbi42 I added you since this touches compaction code. Perhaps you know a better way of doing this or even another way to obtain the same information |
@@ -61,6 +61,18 @@ bool DBImpl::EnoughRoomForCompaction(
   return enough_room;
 }

+size_t DBImpl::GetNumberCompactionInputIterators(Compaction* c) {
We should check for and skip trivial moves and deletion compactions since they won't have any input iterators
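A rough sketch of what that check could look like. The structs below are hypothetical stand-ins for RocksDB's Compaction, and the counting rule (one iterator per level-0 file, one per non-empty input level above 0) is an assumption made for illustration, since the body of GetNumberCompactionInputIterators is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of the input files taken from one level.
struct LevelInputSketch {
  int level;
  size_t num_files;
};

// Hypothetical stand-in for a compaction; not RocksDB's Compaction class.
struct CompactionSketch {
  bool is_trivial_move = false;
  bool is_deletion_compaction = false;
  std::vector<LevelInputSketch> inputs;
};

// Trivial moves and deletion compactions never read their inputs, so they
// contribute zero iterators. Otherwise (assumed rule) each level-0 file gets
// its own iterator and each non-empty higher level gets a single iterator.
size_t CountInputIteratorsSketch(const CompactionSketch& c) {
  if (c.is_trivial_move || c.is_deletion_compaction) {
    return 0;
  }
  size_t num_iterators = 0;
  for (const auto& input : c.inputs) {
    if (input.num_files == 0) {
      continue;
    }
    num_iterators += (input.level == 0) ? input.num_files : 1;
  }
  return num_iterators;
}
```

Returning 0 early keeps the running total unchanged for compactions that open no iterators, so the matching decrement on completion is also 0.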
@@ -1261,7 +1269,13 @@ bool InternalStats::HandleCompactionPending(uint64_t* value, DBImpl* /*db*/,

 bool InternalStats::HandleNumRunningCompactions(uint64_t* value, DBImpl* db,
                                                 Version* /*version*/) {
-  *value = db->num_running_compactions_;
+  *value = db->num_running_compactions();
Doesn't num_running_compactions() assert that the DB mutex is held?
Yes, it does. I think (?) that is what we want. num_running_flushes() also checks mutex_.AssertHeld(); and is called from HandleNumRunningFlushes.
Ok I saw your other comment. I will update this PR to just read off the values.

+bool InternalStats::HandleNumRunningCompactionInputIterators(
+    uint64_t* value, DBImpl* db, Version* /*version*/) {
+  *value = db->num_running_compaction_input_iterators();
We're not holding the mutex here. Since these are stats, it should be ok to directly read the counter even if it's not 100% accurate.
Should I also get rid of num_running_flushes(), which asserts the mutex is held?
-  Status BackgroundCompaction(bool* madeProgress, JobContext* job_context,
-                              LogBuffer* log_buffer,
+  Status BackgroundCompaction(bool* madeProgress,
+                              int& num_compaction_iterators_added,
C++ style guide suggests that output parameters go at the end
Summary
This PR adds a new statistic to track the total number of input iterators for running compactions.
Context: I am currently working on a separate project, where I am trying to tune the read request sizes made by FilePrefetchBuffer to the storage backend. In this particular case, FilePrefetchBuffer will issue larger reads and have to buffer larger read responses. This means we expect to see higher memory utilization. At least for the initial rollout, we only want to enable this optimization for compaction reads.
I want some way to get a sense of what the memory usage impact will be if the prefetch read request size is increased from (for instance) 8MB to 64MB.
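As a back-of-the-envelope illustration of the estimate this statistic is meant to support, the sketch below multiplies the number of input iterators by the readahead size. The function names are hypothetical, and the model assumes each input iterator buffers at most one read of the configured readahead size, which is a simplification and not a statement about how FilePrefetchBuffer actually allocates memory.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kMiB = 1024ull * 1024ull;

// Assumed model: each input iterator buffers at most one read of
// readahead_bytes, so the total is a simple product.
constexpr uint64_t EstimatePrefetchBytes(uint64_t num_input_iterators,
                                         uint64_t readahead_bytes) {
  return num_input_iterators * readahead_bytes;
}

// Extra memory expected when raising the readahead size.
// Assumes new_readahead_bytes >= old_readahead_bytes.
constexpr uint64_t EstimateIncreaseBytes(uint64_t num_input_iterators,
                                         uint64_t old_readahead_bytes,
                                         uint64_t new_readahead_bytes) {
  return EstimatePrefetchBytes(num_input_iterators, new_readahead_bytes) -
         EstimatePrefetchBytes(num_input_iterators, old_readahead_bytes);
}
```

Under these assumptions, 10 running input iterators and a change from 8 MiB to 64 MiB would add 10 x 56 MiB = 560 MiB of buffered data.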
If I know the number of files that compactions are actively reading from (i.e. the number of input iterators), I can determine how much the memory usage will increase if I bump up the readahead size inside FilePrefetchBuffer.
Alternatives considered:
Test Plan
I updated one unit test to confirm that num_running_compaction_input_iterators starts and ends at 0 (all the additions and subtractions cancel out). I added plenty of asserts to make sure that my new statistic was in the expected state. When I added fprintf manually, I confirmed that my statistics updating code was being exercised numerous times inside db_compaction_test.
We will also monitor the generated statistics after this PR is merged.
We also have the crash tests which will be able to detect if any of my asserts fail.