Allow multi-slot MGET in Cluster Mode #707
base: unstable
Conversation
We fail the test:
Since the intent is to allow cross-slot operations, I need to investigate whether pubsub has a special restriction we need to continue to hold |
For reference, this was discussed in #507. To recap my perspective: I was okay with allowing commands that aren't read-modify-write commands. Either pure writes (DEL, MSET) or pure reads (MGET) I was okay with operating across slots. |
We have use cases that we simply can't run without this, so it's pretty important for very high-scale jobs. Will dive into the pubsub issue, but I can't think of a reason it needs to restrict this. |
One question we did want to understand was the actual performance benefit of this. In the other thread we discussed whether or not you could just do a deep pipeline of GET operations. |
What was the rationale to not permit read-modify-write? |
It has been a year since we did this, but it's an order of magnitude difference in performance. I can rerun some benchmarks with memtier to get modern data. |
It felt like it was logically breaking the abstraction, since we practically have serializability within a slot. Allowing an atomic read, modify, and then write to a different slot felt like something you would have to carefully architect on the client side to make sure the slots were on the same node. Possible, but not something we wanted to encourage end users to do. |
Ah, you are looking at it from the perspective of higher-order atomics. This is not designed to allow that, and it doesn't: since at any time you can get a CROSSSLOT error, the client must be prepared to break the operation down into lower-order operations. It does require more intelligence from the client, but the performance wins are worth it. Re performance, here are some examples with valkey-benchmark for MSET vs SET:
You can see that MSET and SET do the same QPS, but MSET is doing 10x the number of keys per second. We use MGET more than MSET, but MGET isn't in valkey's benchmark util for some reason. Regardless, we've proven this out in production over the last year and it's super efficient. In fact, the biggest issue is that when you try to add shards the MGET batch sizes shrink, so you lose efficiency, which offsets the scale-out. However, this can be worked around by adding more replicas instead of full shards. |
With a pipeline of 10 it closes up a bit, but it's still a win:
1,136,363 keys/sec for SET vs 4,405,280 for MSET, so still about 4x faster. EDIT: For good measure, here is pipeline 100:
so 1,886,792 keys/sec for SET vs 5,025,120 keys/sec for MSET. So 2.66x faster at the extremes. |
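For anyone who wants to reproduce this, an invocation along these lines should give a comparable SET-vs-MSET comparison (flags are the standard valkey-benchmark options; the exact commands and key counts behind the numbers above aren't shown here, so treat this as a sketch):
```
# plain SET, then MSET (the built-in mset test sends 10 keys per command)
valkey-benchmark -t set  -n 1000000 -r 1000000 -P 10
valkey-benchmark -t mset -n 1000000 -r 1000000 -P 10
```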
Another thing that was discussed was that we could use instruction prefetching or alter the way MGET works to parallelize the fetching of the data out of the dictionary, since a big chunk of the time is spent on TLB and L3 cache misses. We only get that with MGET. |
My experiments with prefetch were mostly failures, but it's always something that rolls around in the back of your head. I'm going to clean this one up since it's really a huge win; then if you do anything to make non-cluster MGET better, the cluster version will improve as well. Also, do take a look at the features field of INFO, as that's a fairly decent change for Redis, but I think it will be more necessary now that there are so many variants out there. |
Thanks a lot for sharing the performance data, @JohnSully. This answers one of my earlier questions on "why not pipelining". I wonder if part of the performance improvement comes from the lower RESP protocol parsing overhead?
I am looking forward to the new update. The way it is coded now, however, is too specific/narrow to be accepted into the mainline, and it changes the contract (whether it is for the better or not is a different question). We discussed a similar situation where externally visible behavior would be changed by a proposal, and you can find the core team's thoughts here. As a result of that discussion, there is now a new |
@JohnSully Did you verify that you got the correct result from a cross-slot MGET?
In cluster mode, we store the keys in one hashtable per slot. We use server.current_client->slot to avoid computing the slot again when looking up each key in the right hashtable. See getKeySlot().
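To make the hazard concrete, here is a self-contained sketch (simplified and hypothetical, not the actual Valkey code, which hashes with CRC16 modulo 16384 and keeps one hashtable per slot) of what goes wrong when the slot cached for the first key is reused for the rest of the command:
```c
#include <stdio.h>

#define NUM_SLOTS 16384

/* stand-in for the per-command slot cache (server.current_client->slot) */
static int cached_slot = -1;

/* placeholder hash; the real server uses CRC16(key) % 16384 */
static int hash_slot(const char *key) {
    unsigned h = 0;
    while (*key) h = h * 31u + (unsigned char)*key++;
    return (int)(h % NUM_SLOTS);
}

/* mimics a lookup helper that trusts the cached slot once it is set */
static int get_key_slot(const char *key) {
    if (cached_slot != -1) return cached_slot; /* keys 2..N reuse key 1's slot */
    cached_slot = hash_slot(key);
    return cached_slot;
}

int main(void) {
    const char *keys[] = {"foo", "bar"};
    for (int i = 0; i < 2; i++)
        printf("%s -> slot %d\n", keys[i], get_key_slot(keys[i]));
    /* both keys report the first key's slot, so the second key would be
     * searched in the wrong per-slot hashtable and appear to be missing */
    return 0;
}
```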
I think it should be client opt-in, a CLIENT CAPA as Ping suggested.
If a client receives -CROSSSLOT because one of the slots is just being migrated, then if the client splits up the command into multiple ones, the atomicity guarantee of the command is lost. Therefore, a client should make it clear to the user of the client that this is a non-atomic command or feature.
With the client opt-in, I don't think the INFO field is necessary. (If we do add it, then I think it should be under '# Cluster' rather than under '# Server'.)
@PingXie re changing the contract: it is only additive, so it should not be a backward-compatibility problem. I.e., cases that used to be errors are now accepted, but no case that did not error will send an error now. This is also a major place where Redis is deficient against databases like Aerospike, so it would be really limiting to gate it behind a config. |
@zuiderkwast The code is correct in KeyDB, but I will check if anything has changed enough to break it. However, we check hash slots for both the first and all later keys, ensuring none are migrating. As mentioned earlier, this is purely for performance; it is not intended to increase atomicity guarantees, and the client must be prepared for slots to migrate away. The reason for the feature flag is that there is a forward-compatibility issue. You cannot naively try the batch on a non-supporting server because of the performance hit of doubling traffic (first the failure, then the GET retry). Customers aren't going to want to update their client and see twice the traffic. So if you don't provide a flag, clients will have a hard time implementing support. In our environment we had exactly this problem, as not all clusters could be upgraded right away, so we ran with a mix for a period. |
@zuiderkwast I will modify the change to only permit cross-slot on reads. This should eliminate the issue of needing current_slot, since you only need to update the counts on write. |
I pushed the following changes (which should also unblock the tests)
|
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
```
@@            Coverage Diff             @@
##           unstable     #707      +/-   ##
============================================
- Coverage     70.99%   70.86%    -0.13%
============================================
  Files           121      121
  Lines         65175    65184        +9
============================================
- Hits          46269    46192       -77
- Misses        18906    18992       +86
```
|
I'm pretty sure KeyDB doesn't have the recent refactoring that stores the keys in one dict per slot. It is not even released yet. :) This refactoring avoids the need to keep a separate key-to-slot mapping, which takes extra memory per key. If you write a simple test case where you read from two different slots, you will notice that you get NULL for the second slot, because it looks for the key in the wrong dict. (A simple fix for that would be to set
Right, we need some way for clients to find out. The thing is, calling INFO on connect is not something clients are encouraged to do, even though your client does that. Clients can see the version in the HELLO response, or we can add more fields to the HELLO response; alternatively, another command to opt in that clients can call after connecting, just as we have, for example, CLIENT TRACKING to opt in to client-side caching.
I don't think we need to limit MSET actually. The problem is not about counts. |
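A quick way to observe the wrong-dict lookup described above (hypothetical transcript; the key names just need to hash to different slots, and running it against a single primary that owns all slots keeps routing out of the picture):
```
127.0.0.1:7000> SET foo 1
OK
127.0.0.1:7000> SET bar 2
OK
127.0.0.1:7000> MGET foo bar
1) "1"
2) (nil)    <- wrong result if the second lookup reuses the first key's cached slot
```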
Ah ok, as mentioned, tests are still pending, so this should not be merged before those. But I wanted to start the conversation earlier. I will write the tests and look for issues integrating with your new feature. |
In the meantime we should sort out the following open questions:
|
@JohnSully were your numbers collected on KeyDB or Valkey unstable? Since the key argument here is performance I am also interested in a rerun after we merge the async IO work. |
A discussion on these questions has already started in issue #507, so we can already see some people's ideas there.
I vote NO to a config, because enabling this per client is better than enabling it globally. The client needs to handle this special logic, so I prefer client opt-in rather than enabled by default. If it's enabled for everyone always, then clients and applications start assuming things and suddenly it's not possible to scale and migrate slots. I'm not sure we want to go down that path.
In the issue #507 we discussed three categories of commands:
@madolson argued that it should be allowed for 1-2 but not for 3. IIUC it's because those commands aren't simply a batch of jobs and can't be used for fan-in fan-out. I tend to agree, but OTOH I'm not sure we need to restrict that. It's cleaner, simpler and more predictable to have no exception to this rule. When cross-slot commands are allowed, then they're all allowed. So I vote for no exceptions. (There may be use cases like finding a lot of sets using SCAN and then aggregating them using SUNIONSTORE, or other yet unknown use cases.) Worth noting: There's already a hack that allows cross-slot commands. It's called Lua scripts without declared keys.
Synopsis: The client can announce capabilities it supports. To avoid an error when the client sends an unknown capability that exists in a future valkey version (or a fork of Valkey or something) the server just ignores unknown capabilities and returns OK. Thus, this command doesn't allow the client to check if the server supports a capability. (We can probably change the return value if we want, since it's not released yet.) Another option is a new command, such as
(@PingXie asked) I want to know this too. If they were done on Valkey with the existing code, then maybe most of the keys were looked up in the wrong dict and got NULL instead of a correct reply. That most likely affects performance. |
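To illustrate the CLIENT CAPA point above: because unknown capabilities are ignored, the reply alone tells the client nothing (the capability name below is made up for the example):
```
127.0.0.1:6379> CLIENT CAPA cross-slot-batch
OK
```
An older server that has never heard of the capability also replies OK, which is why version info in HELLO, an INFO field, or a dedicated query command keeps coming up as the discovery mechanism.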
I don't understand why the server cares whether the client "supports" this or not. Since non-supporting clients already know they can't send cross-slot commands, nothing changes for them. It's the client that needs to know if the server supports it, not the other way around. What is the bad-case scenario for legacy clients here? Now, it may still make sense to use the CAPA handshake for the client to query support for this, and that is fine. But I don't see why the server would change its behavior if a CAPA isn't sent. |
Not always. Some clients are dumb and do minimal checks. Some clients just look up the first key in the command and route the command based on that.
Unaware application programmers start using Valkey cluster without knowing about slots and the cross slot rule. It seems to work because their keys happen to be in the same node. |
I can see the argument to make things more explicit upstream; however, this will only work 33% of the time or less, since the minimum cluster has 3 primaries. They would really have to do no validation to not notice. Also, clients have to do slot->server mapping anyway, so they are probably already going to throw exceptions before it even hits the server. I also don't think we can assume all clients will implement this as a transparent cross-server MGET, since the implicit assumption of atomicity for MGET is broken when it's sent to different servers. Clients may instead choose to expose this in a different way than a transparent MGET that under the hood cracks the operation into smaller ones. At Snap we did choose the weak-consistency option, since we have deep control of our use cases and could make that decision. In the context of the server itself this is all irrelevant, since MGETs are atomic at that level, so we are not breaking any ABI. It is purely additive at that level, and no existing use case will be broken. |
OK, now let's wait for others to catch up and share their opinions. :) Can you tell us more about Aerospike? It seems interesting. I found https://aerospike.com/docs/server/architecture/overview. Does it support cross-node transactions? If it does, there's no unpredictable cross-slot error to worry about, so that's a bit different situation. |
It has a strict and a non-strict mode that you can set. For us, non-strict is the most common. I think strict mode would be super valuable to some people, but most cache users in my experience can trade strictness for performance, so there's lots of value in faster non-strict. |
@JohnSully, to address your concerns effectively, let's revisit the original problem: you're focused on enhancing GET performance, and KeyDB's cross-slot MGET offers a compelling solution for your use case. As I mentioned earlier, since we've been actively developing comprehensive performance optimizations, it's important to assess the performance of our 8.0 release candidate for this use case before proceeding further. IMO #507 covers a broader scope than just MGET, making a per-operation decision on cross-slot functionality less than ideal. A unified strategy for cross-slot operations is necessary, and an RFC could be an excellent starting point to define your desired cross-slot semantics for Valkey. As for Aerospike, their approach, as I understand it, primarily revolves around a tiered caching mechanism to lower costs rather than a specific focus on cross-slot MGET functionality, but I'm curious to hear your insights. |
I can make an RFC, but Snap cannot use Valkey in production without this, since it would be a major cost regression and we use batches heavily. We were hoping this could be in the September release so that we could put Valkey on our roadmap. |
I believe we still have time. Let's redo the performance test on the same hardware spec once we have a release candidate out, which should contain all the major performance improvements we'd like to include. I assume it is acceptable IF Valkey 8.0 (RC) delivers GET performance comparable to that of your current best setup (with cross-slot MGET). In other words, it's not the performance delta between GET and MGET on Valkey 8.0 that matters most, but how the GET performance on Valkey 8.0 stacks up against the (M)GET performance on KeyDB on the same hardware. We can continue brainstorming cross-slot operations too. |
I don't see how you can get over the 4-10x single thread efficiency of batch MGET not to mention the bandwidth improvements, and if you could why wouldn't you want to add it to the performance improvements you are planning? But it is a long weekend in Canada so we can revisit this more next week. |
Let's re-establish this on Valkey 8.0 RC first. I fully support performance improvements, but it's important to have a holistic understanding of the change and its impact on the rest of the system. As it stands, the proposed PR does not align with the proposal in #507, which also needs thorough deliberation. Given the scope of Valkey use cases, we cannot afford to implement pointed fixes without comprehensive consideration. Here is the process that I am thinking of:
Additionally, I would like to understand other ways to help onboard existing KeyDB users with Valkey. Please feel free to file issues for any further support.
SGTM. BTW, there is also the correctness issue that @zuiderkwast mentioned in #707 (comment) (copied below).
|
I'm strongly against this feature. From the PR description the use case is to improve performance. If the feature has a different purpose and I'm missing something please update the PR description with the motivation. |
The issue is that there's a 1/16,000 chance you'll have two keys going to the same slot, so for practical purposes you can't really use MGET in cluster mode. Whereas if you are a 3-shard cluster, that should be a 1/3 probability. We typically have batch sizes of 200 to 500 keys, so we can generally get decent batches out to the servers if they would accept them. In this case the client is doing the job of making sure that the batches are all served by the server we send the batch to; it's just loosening an arbitrary restriction of making it the same slot.
On Thu, Jul 4, 2024 at 8:55 AM, asafpamzn wrote:
> I'm strongly against this feature. From the PR description the use case is to improve performance.
> IMHO, this should be solved at the client side. If the user wants an atomic multi-GET, they should use MGET and verify that all the keys are in the same slot. If the user wants enhanced performance, they should not use a transaction or MGET, but rather a batch API to send a single batch. The client driver will split the batch between the nodes and have small batches.
> If the feature has a different purpose and I'm missing something, please update the PR description with the motivation.
> The batching can be implemented on the client side; we would be happy to get a contribution or an issue for the https://github.com/aws/glide-for-redis repo, and we can support it.
|
It helps if you explain why you're against it.
That depends how you use it. If you're accessing a bunch of keys related to the same user, the same product, etc., then they should be designed with hash tags to put them in the same slot. Then MGET works for these keys. Now, what you're doing is different. You're doing batch processing for performance reasons, which is a special optimization that not every simple web app does. If you fix this PR so it actually works correctly, then we can check the performance and see how much faster than a pipeline it actually is. |
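For the "keys related to the same user" case, hash tags already make a same-slot MGET work today, since only the text inside {} is hashed (key names here are just an example):
```
127.0.0.1:6379> MSET {user:42}:name Alice {user:42}:email alice@example.com
OK
127.0.0.1:6379> MGET {user:42}:name {user:42}:email
1) "Alice"
2) "alice@example.com"
```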
I think that I explained. :) Probably it was not clear. It won't guarantee atomicity, but it will have better performance than your suggestion from the server side. Why? The new client batching API will just stream the commands, and the server will handle them as they go. It will allow multi-slot operations. |
@asafpamzn yes, many clients allow pipelining or async api, where the client can already send one MGET per slot in a single round trip, but @JohnSully claims that a single cross-slot MGET is several times faster than that, so that solution is not satisfying IIUC. I still haven't seen compelling evidence for this though. |
Thanks for the clarification. We did some benchmarks back then with redis-py; see https://aws.amazon.com/blogs/database/optimize-redis-client-performance-for-amazon-elasticache/. I don't know how to link a specific section, but please search for "The following table summarizes the performance of pipelines with a single connection (in RPS)". It shows that in redis-py the […] Thus, I don't think that we should change the server, but rather optimize the clients. In addition, due to clients' implementations of MGET there is confusion about whether MGET is atomic or not in cluster mode, and this change is going to make it more confusing. I saw many users who use MGET for batching. I think that it should be solved by updating the existing client APIs to be clearer about what is atomic and what is not. I would like to see the documentation of MGET following this change and how we are going to explain it to users who are less advanced and don't pin slots to specific nodes. I think that the docs are not going to be easy for the common user. |
…n with MGET. Currently we check if all keys are in the same slot, but we could return if all slots are on this node instead. Signed-off-by: John Sully <[email protected]>
Signed-off-by: John Sully <[email protected]>
Force-pushed from 69bc27e to 9ec45be.
Just a quick update: I fixed the code to work with the slot cache on the client. We just disable this optimization for cross-slot MGET but leave it on if the MGET is served by a single slot. Still needs tests, but we're at full functionality now. I rebased it to unstable since this PR is a few months old. |
The current restriction for MGET in cluster mode is that all keys must reside in the same slot. This results in very high inefficiency for high-batch use cases, as the batch has to be cracked into individual GETs. Because there are 16k slots, it's very unlikely we will find a pair of keys we can send together, making MGET nearly useless in cluster mode.
Instead, we can relax the condition a bit and permit MGET if all of its slots reside on this node and none are migrating. This allows us to serve the request in the common case. In the case where a slot was migrated or changed, we still send CROSSSLOT and the client will know to break the batch down into individual GETs.
Because we expect there to be cases where we still send CROSSSLOT, simply probing with an MGET on the client is not sufficient to communicate this new support. To help with this, we introduce a new INFO field called "features". This is intended to work similarly to the features flag in lscpu, where new features get added over time. Now the client can check this field and determine whether it still needs to break up batches or whether they can be sent in one go.
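A rough sketch of the intended client check (the INFO section placement and the exact token are illustrative, not final in this PR):
```
127.0.0.1:6379> INFO server
...
features:multi-slot-mget
...
```
If the field or token is absent, the client keeps splitting batches per slot as it does today.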
Note: tests will come in a little bit but we can start the conversation of the feature now.
Old behavior:
New behavior:
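Illustratively, for two keys that hash to different slots but are owned by the same primary (key names and exact error text are assumed, not taken from the original examples):
```
# Old behavior: rejected purely because the keys are in different slots
127.0.0.1:7000> MGET foo bar
(error) CROSSSLOT Keys in request don't hash to the same slot

# New behavior: served, since both slots live on this node and none are migrating
127.0.0.1:7000> MGET foo bar
1) "1"
2) "2"
```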