[NEW] Multi-database support in cluster mode - Implementation Plan #1681

Closed
xbasel opened this issue Feb 6, 2025 · 1 comment

xbasel commented Feb 6, 2025

Description of the feature

Introduce multi-database support in Valkey cluster mode without breaking existing behavior. Commands like SELECT, SWAPDB, MOVE, and COPY should be valid in clustered deployments, removing the single-database constraint and matching standalone workflows.

Motivation / use-case

  • Standalone-to-Cluster migration: Eliminates the need to rewrite multi-DB logic when switching from standalone to cluster mode.
  • Consistent Workflows: Aligns the standalone and cluster usage models, so users can maintain identical commands and patterns across deployment types.

Key Features

  1. Database-Agnostic Hashing:

    • Keys map to the same slot across all databases (identical to the existing behavior).
      • For example, key x will map to the same slot, regardless of the database it is stored in.
    • Slot space unchanged. Slot distribution remains consistent, ensuring compatibility with existing setups.
  2. Backward Compatibility:

    • No API changes: existing cluster commands and client integrations work unchanged. Single-database setups are unaffected.
  3. Cluster Management:

    • Most cluster management commands (e.g., CLUSTER SLOTS, CLUSTER NODES) remain global. They do not run in a selected database context.
    • GETKEYSINSLOT, COUNTKEYSINSLOT and MIGRATE operate on the selected database context.
  4. Slot Migration:

    • No workflow changes for clusters running on a single database (DB0).
    • For clusters that use multiple databases, migrating a slot requires iterating over the databases so that the keys in every database are moved.
    • valkey-cli resharding should be updated to handle multi-DB clusters.
  5. Memory Optimization:

    • The current Valkey implementation pre-allocates database structures. In cluster mode, 16K slots are allocated for each database. This introduces memory overhead and is a regression.
    • Lazy database initialization (via #1609) minimizes memory overhead for unused databases.

Implementation Details

Data structures:
The existing array of databases (server.db) will not change. For every DB entry (database), there is an internal array of slot-based hashtables, as shown in the diagram below. Each hashtable represents one of the 16K cluster slots, so every DB ultimately contains 16K hashtables.

[Image: diagram of server.db, where each DB entry holds an array of per-slot hashtables]
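
The layout described above can be modeled roughly as follows. This is a minimal Python sketch, not the Valkey C implementation: the class names, the default of 16 databases, and the lazy creation of databases and slot hashtables are illustrative assumptions (the lazy part only loosely mirrors the idea behind #1609).

```python
from typing import Dict, List, Optional

NUM_SLOTS = 16384      # the 16K cluster slots
NUM_DATABASES = 16     # assumption: default number of databases

Hashtable = Dict[bytes, bytes]


class Database:
    """One server.db entry: an array with one hashtable per cluster slot."""

    def __init__(self) -> None:
        # Hypothetical laziness: a slot's hashtable is created on first write.
        self.slots: List[Optional[Hashtable]] = [None] * NUM_SLOTS


class Server:
    def __init__(self) -> None:
        # Hypothetical laziness: a database is created on first write.
        self.db: List[Optional[Database]] = [None] * NUM_DATABASES

    def set(self, dbid: int, slot: int, key: bytes, value: bytes) -> None:
        if self.db[dbid] is None:
            self.db[dbid] = Database()
        database = self.db[dbid]
        if database.slots[slot] is None:
            database.slots[slot] = {}
        database.slots[slot][key] = value

    def get(self, dbid: int, slot: int, key: bytes) -> Optional[bytes]:
        database = self.db[dbid]
        if database is None or database.slots[slot] is None:
            return None
        return database.slots[slot].get(key)


server = Server()
server.set(3, 12182, b"foo", b"bar")
print(server.get(3, 12182, b"foo"))  # value stored in DB 3
print(server.get(0, 12182, b"foo"))  # DB 0 was never written to
```

In this model, a deployment that only touches DB0 never allocates anything for the other databases, which is the memory property the plan is after.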

Hashing - database agnostic:
The slot calculation remains unchanged: a key’s hash slot is always determined by the same hashing procedure used in single-DB cluster mode. Now, instead of only referencing server.db[0].slots[...], the logic accesses server.db[N].slots[...] for the currently selected database N.
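
To illustrate why the hashing is database agnostic, here is a small Python sketch of the standard cluster slot calculation (CRC16 of the key, or of its hash tag, modulo 16384). The function name key_hash_slot and the locate helper are illustrative names, not Valkey identifiers; the point is that the database index never enters the hash.

```python
import binascii

NUM_SLOTS = 16384


def key_hash_slot(key: bytes) -> int:
    """Cluster hash slot of a key; honors {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with initial value 0 is CRC16-CCITT (XMODEM).
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


def locate(dbid: int, key: bytes) -> tuple:
    """Hypothetical helper: (database index, slot) pair used for lookup."""
    return dbid, key_hash_slot(key)


# The slot is the same for every database; only the DB index differs.
print(locate(0, b"foo"))
print(locate(5, b"foo"))
```

Because the slot depends only on the key, existing slot ownership and client-side routing tables stay valid when more than one database is in use.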

Command-Level Changes

  1. Cluster management commands remain global and operate in a global context rather than a specific database context. The exceptions are COUNTKEYSINSLOT and GETKEYSINSLOT, which retrieve or count keys from the slot belonging to the currently selected database instead of DB0.
  2. The MIGRATE command now runs in the context of the selected database. The destination-db parameter, currently used in standalone setups and always set to 0 in cluster mode, will now indicate which database the keys are transferred to on the target. This also enables cross-database transfers in cluster mode.
  3. SELECT / SWAPDB / MOVE / COPY will be modified to support cluster mode.
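
The effect of scoping COUNTKEYSINSLOT and GETKEYSINSLOT to the selected database can be sketched with a toy in-memory node. This is an illustrative Python model under stated assumptions, not Valkey code: ToyNode and its method names are hypothetical and only mimic the command semantics described in this plan.

```python
import binascii
from typing import Dict, List

NUM_SLOTS = 16384


def key_hash_slot(key: bytes) -> int:
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


class ToyNode:
    """Hypothetical single-connection model of a cluster node."""

    def __init__(self) -> None:
        # data[dbid][slot] -> {key: value}
        self.data: Dict[int, Dict[int, Dict[bytes, bytes]]] = {}
        self.selected_db = 0

    def select(self, dbid: int) -> None:
        self.selected_db = dbid

    def set(self, key: bytes, value: bytes) -> None:
        slots = self.data.setdefault(self.selected_db, {})
        slots.setdefault(key_hash_slot(key), {})[key] = value

    def _slot_table(self, slot: int) -> Dict[bytes, bytes]:
        # Looks only at the currently selected database, not DB0.
        return self.data.get(self.selected_db, {}).get(slot, {})

    def countkeysinslot(self, slot: int) -> int:
        return len(self._slot_table(slot))

    def getkeysinslot(self, slot: int, count: int) -> List[bytes]:
        return list(self._slot_table(slot))[:count]


node = ToyNode()
node.set(b"foo", b"in-db0")
node.select(1)
node.set(b"foo", b"in-db1")
node.set(b"{foo}bar", b"also-db1")
slot = key_hash_slot(b"foo")
print(node.countkeysinslot(slot))  # counts DB 1 only
node.select(0)
print(node.countkeysinslot(slot))  # counts DB 0 only
```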

Replication:
Nothing special here: all databases are replicated the same way they are replicated in standalone setups.

Usage
There are no changes for customers using only DB0. Migrating keys/slots from one node to another works the same way it does today:

Source: CLUSTER SETSLOT <slot> MIGRATING <TARGET>
Target: CLUSTER SETSLOT <slot> IMPORTING <SOURCE>

Source: MIGRATE host port "" 0 <timeout> KEYS <keys returned by GETKEYSINSLOT>

Source: CLUSTER SETSLOT <slot> NODE <TARGET>
Target: CLUSTER SETSLOT <slot> NODE <TARGET>

If multiple databases are used, keys/slots are migrated as follows, repeating the SELECT and MIGRATE steps for each database:

Source: CLUSTER SETSLOT <slot> MIGRATING <TARGET>
Target: CLUSTER SETSLOT <slot> IMPORTING <SOURCE>

Source: SELECT 0
Source: MIGRATE host port "" 0 <timeout> KEYS <keys returned by GETKEYSINSLOT>

Source: SELECT 1
Source: MIGRATE host port "" 1 <timeout> KEYS <keys returned by GETKEYSINSLOT>
.
.
.
Source: SELECT 15
Source: MIGRATE host port "" 15 <timeout> KEYS <keys returned by GETKEYSINSLOT>

Source: CLUSTER SETSLOT <slot> NODE <TARGET>
Target: CLUSTER SETSLOT <slot> NODE <TARGET>
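
The per-database loop above could be generated by tooling along these lines. This is a hedged Python sketch, not the valkey-cli resharding logic: migration_plan and its parameters are hypothetical, the keys per database are passed in rather than fetched with GETKEYSINSLOT, and the MIGRATE lines use the full command form with an explicit timeout and the KEYS option.

```python
from typing import Dict, List


def migration_plan(
    slot: int,
    source_id: str,
    target_id: str,
    host: str,
    port: int,
    keys_by_db: Dict[int, List[str]],
    timeout_ms: int = 5000,
) -> List[str]:
    """Build the command sequence for moving one slot across all databases."""
    plan = [
        f"Source: CLUSTER SETSLOT {slot} MIGRATING {target_id}",
        f"Target: CLUSTER SETSLOT {slot} IMPORTING {source_id}",
    ]
    for dbid in sorted(keys_by_db):
        keys = keys_by_db[dbid]
        if not keys:
            continue  # nothing stored for this slot in this database
        key_list = " ".join(keys)
        plan.append(f"Source: SELECT {dbid}")
        # destination-db equals the source database index.
        plan.append(
            f'Source: MIGRATE {host} {port} "" {dbid} {timeout_ms} KEYS {key_list}'
        )
    plan.append(f"Source: CLUSTER SETSLOT {slot} NODE {target_id}")
    plan.append(f"Target: CLUSTER SETSLOT {slot} NODE {target_id}")
    return plan


for line in migration_plan(
    42, "src-id", "dst-id", "10.0.0.2", 6379, {0: ["a"], 1: [], 2: ["b", "c"]}
):
    print(line)
```

A real tool would fetch keys in batches and repeat the MIGRATE step per database until the slot is empty there; the sketch emits a single batch per database to keep the sequence readable.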

Alternatives considered

We considered database-aware hashing, for example expanding the slot space so that each database gets its own slot range:
Expanded Slot Space (16K per DB)
Hash: Each DB has its own dedicated slot range (e.g., DB0 → 0–16383, DB1 → 16384–32767, etc.).

In theory, this could provide better data isolation and perhaps easier migration and management (i.e., isolating specific databases in specific shards). However, it is not backward compatible, so it is a no-go.
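
For comparison, the rejected expanded-slot-space scheme can be written as a one-line mapping. This Python sketch is only an illustration of the ranges quoted above; db_aware_slot is a hypothetical name and nothing like it exists in Valkey.

```python
NUM_SLOTS = 16384  # slots per database in the rejected scheme


def db_aware_slot(dbid: int, base_slot: int) -> int:
    """Rejected alternative: each database owns its own 16K slot range."""
    if not 0 <= base_slot < NUM_SLOTS:
        raise ValueError("base_slot must be in 0..16383")
    return dbid * NUM_SLOTS + base_slot


print(db_aware_slot(0, 12182))  # DB0 keeps the slot numbers used today
print(db_aware_slot(1, 0))      # first slot of DB1's dedicated range
print(db_aware_slot(1, 16383))  # last slot of DB1's dedicated range
```

Any database other than DB0 would produce slot numbers above 16383, which existing clients and cluster tooling do not expect. That is the backward-compatibility problem that rules this option out.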

PR: #1671
Original issue: #1319

@xbasel xbasel self-assigned this Feb 6, 2025
@xbasel xbasel changed the title [NEW] Multi-database support in cluster mode [NEW] Multi-database support in cluster mode - Implementation Plan Feb 6, 2025
@xbasel xbasel closed this as completed Feb 6, 2025

PingXie commented Feb 8, 2025

Great write up, @xbasel!

GETKEYSINSLOT, COUNTKEYSINSLOT and MIGRATE operate on the selected database context.

I think this would be a breaking change to the existing slot migration protocol. I also think this breaks the atomicity property of the atomic slot migration feature (#23). Have you considered hiding away the database concept in the context of slot migration? For instance, how about having "COUNTKEYSINSLOT" enumerate over all databases instead? Similarly, "CLUSTER MIGRATE " should take care of all databases at once.

@murphyjacob4 @enjoy-binbin FYI
