Why do we divide into PrimarySubkeyColumnFamily and MetadataColumnFamily? #2729

123alf · 2025-01-23T02:18:16Z

Method 1: Currently, for hash data, we store TTL and other metadata in the MetadataColumnFamily, while the fields are stored in the PrimarySubkeyColumnFamily. The hget operation queries both ColumnFamilies separately.

Method 2: We still store the metadata and the fields under different keys, but within a single ColumnFamily. Because the entries are stored adjacently, the hget operation would need fewer I/O operations.

Why did we choose Method 1? I couldn't find any related documentation, and I'm looking forward to your response.
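To make the two layouts concrete, here is a minimal sketch in Python. It is a toy model, not Kvrocks' actual encoding: sorted lists stand in for column families, and the key tuples, the `META`/`FIELD` tags, the metadata dict, and the `hget_method1`/`hget_method2` helpers are all hypothetical names chosen for illustration.

```python
from bisect import bisect_left, insort

META, FIELD = 0, 1  # hypothetical tags: metadata sorts before the fields


def put(store, key, value):
    """Insert into a sorted list of (key, value); assumes each key is written once."""
    insort(store, (key, value))


def get(store, key):
    """Binary-search the sorted list; return (position, value) or (None, None)."""
    i = bisect_left(store, (key,))
    if i < len(store) and store[i][0] == key:
        return i, store[i][1]
    return None, None


metadata_cf, subkey_cf = [], []  # Method 1: two separate key spaces
single_cf = []                   # Method 2: one shared key space

hashes = [("a", {"f1": "x"}), ("h", {"f1": "v1", "f2": "v2"}), ("z", {"f1": "y"})]
for user_key, fields in hashes:
    meta = {"type": "hash", "ttl": 60, "size": len(fields)}  # illustrative values
    put(metadata_cf, user_key, meta)
    put(single_cf, (user_key, META, ""), meta)
    for field, value in fields.items():
        put(subkey_cf, (user_key, field), value)
        put(single_cf, (user_key, FIELD, field), value)


def hget_method1(user_key, field):
    """One lookup in the metadata store, then one in the subkey store."""
    _, meta = get(metadata_cf, user_key)
    if meta is None:
        return None
    _, value = get(subkey_cf, (user_key, field))
    return value


def hget_method2(user_key, field):
    """Two lookups in the same store; also report how far apart the entries sit."""
    meta_pos, meta = get(single_cf, (user_key, META, ""))
    if meta is None:
        return None, None
    field_pos, value = get(single_cf, (user_key, FIELD, field))
    distance = None if field_pos is None else field_pos - meta_pos
    return value, distance
```

Both variants perform two point lookups. The difference in Method 2 is that the metadata entry and the field entry are close together in one sorted key space, which is the locality the question is asking about.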

Answered by PragmaTwice

PragmaTwice (Collaborator) · 2025-01-23T03:02:54Z

Hi, thank you for your question.

If we stored metadata and subkeys in a single column family, SCAN would be inefficient, and so would SCAN-related operations such as key counting.

That said, better ideas are welcome, and so are proposals for a different encoding.
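The SCAN concern can be illustrated with a small sketch. This is a toy model under stated assumptions, not Kvrocks code: sorted Python lists stand in for column families, and `build`, `scan_separate`, `scan_single`, and the `META`/`FIELD` tags are hypothetical names. It counts how many entries an iterator steps over while listing user keys in each layout.

```python
META, FIELD = 0, 1  # hypothetical tags: metadata sorts before the fields


def build(num_keys, fields_per_key):
    """Return (metadata_cf, single_cf) as sorted lists of keys only."""
    metadata_cf, single_cf = [], []
    for k in range(num_keys):
        user_key = f"key{k:04d}"
        metadata_cf.append(user_key)
        single_cf.append((user_key, META, ""))
        for f in range(fields_per_key):
            single_cf.append((user_key, FIELD, f"field{f:04d}"))
    metadata_cf.sort()
    single_cf.sort()
    return metadata_cf, single_cf


def scan_separate(metadata_cf):
    """List user keys when metadata has its own key space."""
    keys, visited = [], 0
    for user_key in metadata_cf:
        visited += 1
        keys.append(user_key)
    return keys, visited


def scan_single(single_cf):
    """List user keys when metadata and fields share one key space."""
    keys, visited = [], 0
    for user_key, tag, _field in single_cf:
        visited += 1  # field entries are stepped over as well
        if tag == META:
            keys.append(user_key)
    return keys, visited


metadata_cf, single_cf = build(num_keys=3, fields_per_key=1000)
keys_a, visited_a = scan_separate(metadata_cf)
keys_b, visited_b = scan_single(single_cf)
```

In this model both scans return the same three keys, but the shared layout steps over 3003 entries instead of 3. A real iterator could seek past each key's fields rather than stepping through them, at the cost of one extra seek per key, so this sketch shows the shape of the trade-off rather than measured behaviour.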
