-
Notifications
You must be signed in to change notification settings - Fork 708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add maxmemory-reserved parameter for evicting key earlier to avoid OOM #831
base: unstable
Are you sure you want to change the base?
Conversation
e31dd8b
to
b8a98df
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #831 +/- ##
============================================
+ Coverage 70.87% 71.03% +0.16%
============================================
Files 121 121
Lines 65203 65282 +79
============================================
+ Hits 46212 46375 +163
+ Misses 18991 18907 -84
|
b8a98df
to
ba13a0c
Compare
ba13a0c
to
e27e9de
Compare
this seem like a interesting feature, did not review, just drop a comment that approve the concept. |
I also like the idea. I just haven't really spent enough time thinking about it. Memory management is a big area in Valkey we need to improve. |
e27e9de
to
0a68013
Compare
0a68013
to
9596c78
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a quick look.
+1 on introducing "back pressure" earlier. I feel that this could be used with `maxmemory_eviction_tenacity" to give an even smoother eviction experience. |
35437eb
to
db966fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hwware! LGTM overall. Can you add some tests too?
Sure, ready to work, Thanks |
f7cc8c8
to
7a5584b
Compare
src/evict.c
Outdated
@@ -398,11 +398,12 @@ int getMaxmemoryState(size_t *total, size_t *logical, size_t *tofree, float *lev | |||
if (total) *total = mem_reported; | |||
|
|||
/* We may return ASAP if there is no need to compute the level. */ | |||
if (!server.maxmemory) { | |||
if (!server.maxmemory_available) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this more, I feel that it is more preferable to model this new setting as a "soft" maxmemory
, which can trigger key eviction earlier but won't cause OOM to be returned before hitting the actual maxmemory
. Otherwise, we effectively create an alias of maxmemory
. More specifically, I think performEviction
should only return EVICT_FAIL
when the memory usage goes beyond the real maxmemory
.
Additionally, since getMaxmemoryState
is also used outside of the OOM prevention path such as in VM_GetUsedMemoryRatio
, we should consider parameterizing maxmemory
and having it passed in by the caller instead, so that we can maintain the same semantics in these externally facing scenarios.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like what you suggested, if we can think when "maxmemory_available" is available, it is a soft maxmemory. Then you maybe think we should not return OOM, in the case, how we can return message to client, any idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right - how about just processing the command normally?
- if
used_memory
is below soft max, no change to the existing logic - if
used_memory
is above soft max but below hard max, trigger key eviction and continue with the normal processing - if
used_memory
is above hard max, no change to the existing logic, i.e., trigger key eviction and fail the command if theused_memory
is still above hard max after the eviction)
200b203
to
4da32a2
Compare
4da32a2
to
510265f
Compare
07b8d9c
to
188f215
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The INFO field mem_not_counted_for_evict
should include the reserved memory, right? In our docs, we say that eviction happens when
used_memory - mem_not_counted_for_evict > maxmemory
so we should make sure this formula is still correct. Docs here: https://valkey.io/topics/lru-cache/
valkey.conf
Outdated
@@ -1279,6 +1279,12 @@ acllog-max-len 128 | |||
# | |||
# maxmemory-policy noeviction | |||
|
|||
# `maxmemory-reserved` defines a fix amount of reserved maxmemory away from maxmemory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backticks? This is not markdown. I think we can skip the backticks, like how the other configs are described.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice in valkey.conf, if a command is mentioned in the comment, Backticks is used. Although I have no idea how it origins, I just follow this tradition.
# When the difference between memory usage and maxmemory is less than it, Valkey begins proactive key eviction. However, exceeding this | ||
# threshold does not immediately reject new write commands; only the hard `maxmemory` limit will do so. | ||
# | ||
# maxmemory-reserved <bytes> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For most of the configs, there is a real example here, usually with the default value. I think it's better. We can mention in the text above that it's in bytes and that the default is 0 to be extra clear.
# maxmemory-reserved <bytes> | |
# maxmemory-reserved 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is aligned with maxmemory
void calculateKeyEvictionMemory(void) { | ||
server.key_eviction_memory = server.maxmemory - server.maxmemory_reserved; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
server.key_eviction_memory
is redundant information. Maybe we don't need to store it in a global? We can calculate server.maxmemory - server.maxmemory_reserved
every time when we need it, or use a small function for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think so.
In function getMaxmemoryState(), we need the server.key_eviction_memory value multiply times. And performEvictions() always call getMaxmemoryState() to check if key eviction process is needed.
Here, the server.maxmemory and server.maxmemory_reserved are fixed values if nobody update them by config set command.
Thus, I think calculating server.key_eviction_memory once and work as a global variable is better.
I implemented this in my first commit 3b9fb9f, it is a percetage of the maxmemory, well match with |
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: Shivshankar-Reddy <[email protected]> Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
Signed-off-by: hwware <[email protected]>
No, i do not think so. The mem_not_counted_for_evict (aka called by freeMemoryGetNotCountedMemory()) However, the reserved memory is the size memory stepped away from the maxmemory. It should not include the AOF and replica output buffer part. |
5638c62
to
4be3b3c
Compare
So.... instead we should update the docs to explain the formula like this?
|
Reference: #742 and https://learn.microsoft.com/en-us/azure/azure-cache-for-redis/cache-best-practices-memory-management (Azure)
Generally, when clients set maxmemory-policy as allkeys-lru or other memory eviction policies, and maxmemory as well, If server runs as write-heavy workloads, the data stored in memory could reach the maxmemory limit very quickly, then OOM message will be reported.
If we have maxmemory-reserved parameter, we can set amount of memory away from maxmemory, then key eviction process could begin earlier to avoid used memory to catch up the maxmemory before OOM happens. Thus, we can see the benefit is we can delay OOM time.