-
Notifications
You must be signed in to change notification settings - Fork 696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRC removal during diskless full sync with TLS enabled. #1479
base: unstable
Are you sure you want to change the base?
Conversation
@talxsha before I look into this, lets put some details in the top comment. linking the issue is not what we susually do. |
src/replication.c
Outdated
@@ -1244,11 +1244,12 @@ void syncCommand(client *c) { | |||
* the primary can accurately lists replicas and their listening ports in the | |||
* INFO output. | |||
* | |||
* - capa <eof|psync2|dual-channel> | |||
* - capa <eof|psync2|dual-channel|disable_sync_crc> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not super fond of the disable_sync_crc capability name. maybe a better name woulod be bypass_crc?
src/config.c
Outdated
@@ -3156,6 +3156,7 @@ static int applyClientMaxMemoryUsage(const char **err) { | |||
standardConfig static_configs[] = { | |||
/* Bool configs */ | |||
createBoolConfig("rdbchecksum", NULL, IMMUTABLE_CONFIG, server.rdb_checksum, 1, NULL, NULL), | |||
createBoolConfig("disable-sync-crc", NULL, MODIFIABLE_CONFIG, server.disable_sync_crc, 0, NULL, NULL), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we normally do not like to introduce new configurations. In this case the feature is controlled via capability so no issues with compatibility. Is there a way this would still be required in some cases>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please don't introduce a config for this.
src/server.c
Outdated
@@ -2218,6 +2218,7 @@ void initServerConfig(void) { | |||
server.fsynced_reploff_pending = 0; | |||
server.rdb_client_id = -1; | |||
server.loading_process_events_interval_ms = LOADING_PROCESS_EVENTS_INTERVAL_DEFAULT; | |||
server.repl_meet_disable_crc_cond = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not follow on why we need to keep this flag on the server? aren't all the checks in readSyncBulkPayload valid at the point of decision to flag the rdb?
src/server.h
Outdated
@@ -1838,6 +1842,7 @@ struct valkeyServer { | |||
double stat_fork_rate; /* Fork rate in GB/sec. */ | |||
long long stat_total_forks; /* Total count of fork. */ | |||
long long stat_rejected_conn; /* Clients rejected because of maxclients */ | |||
size_t stat_total_crc_disabled_syncs_stated; /* Total number of full syncs stated with CRC checksum disabled */ // AMZN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the AMZN, we are not in Kansas anymore. Also the stat name is weird, maybe we can use another stat eg, sync_bypass_crc? .
Also note that In general I would not find any reason to have this statistic unless it is used for writing tests right? Maybe such stats are better be placed under the debug section of the info, but I guess we already have so many stats so i would let it pass.
src/rdb.c
Outdated
@@ -3601,6 +3612,12 @@ int rdbSaveToReplicasSockets(int req, rdbSaveInfo *rsi) { | |||
} | |||
serverSetCpuAffinity(server.bgsave_cpulist); | |||
|
|||
if (disable_sync_crc_capa == 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (disable_sync_crc_capa == 1) { | |
if (disable_sync_crc_capa) { |
src/rdb.c
Outdated
@@ -3354,7 +3355,7 @@ int rdbLoadRioWithLoadingCtx(rio *rdb, int rdbflags, rdbSaveInfo *rsi, rdbLoadin | |||
if (rioRead(rdb, &cksum, 8) == 0) goto eoferr; | |||
if (server.rdb_checksum && !server.skip_checksum_validation) { | |||
memrev64ifbe(&cksum); | |||
if (cksum == 0) { | |||
if (cksum == 0 || (rdb->flags & RIO_FLAG_DISABLE_CRC) != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (cksum == 0 || (rdb->flags & RIO_FLAG_DISABLE_CRC) != 0) { | |
if (cksum == 0 || (rdb->flags & RIO_FLAG_DISABLE_CRC)) { |
Also add justification for why we should do this only when TLS is enabled. Given that the network has built in checksumming, I'm still not convinced about the tradeoff we are making given that the steady state replication is not checksummed. |
@madolson should I tag it as a major-decision ? I think it worth discussion. |
src/replication.c
Outdated
// Set a flag to determin later whether or not the replica will skip CRC calculations for this sync - | ||
// Disable CRC on replica if: (1) TLS is enabled; (2) replica disable_sync_crc is enabled; (3) diskelss sync enabled on both replica and primary. | ||
// Otherwise, CRC should be enabled/disabled as per server.rdb_checksum | ||
if (connIsTLS(conn) && server.disable_sync_crc && use_diskless_load && usemark) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a. I think we should encapsulate this whole condition in some function to make the code more readable.
b. The connIsTLS part of the decision is somewhat too intrusive IMO. I think maybe we can add an API in the connection abstraction like connIntegrityChecked or something like this. maybe there will be non-TLS connections (eg QUIC) which will provide some integrity mechanism which will not be defined as "TLS"
For now it's not. It's just an internal one. I would probably just ping PingXie directly and core team if anyone else is interested. |
Signed-off-by: Tal Shachar <[email protected]>
…o bypass_crc, encapsulated condition checks for skipping CRC, and chenged connIsTLS condition to connIntegrityChecked in ConnectionType. Some changes in the test as well Signed-off-by: Tal Shachar <[email protected]>
Signed-off-by: talxsha <[email protected]>
src/connection.h
Outdated
@@ -121,6 +121,9 @@ typedef struct ConnectionType { | |||
|
|||
/* TLS specified methods */ | |||
sds (*get_peer_cert)(struct connection *conn); | |||
|
|||
/* Miselenious */ | |||
int (*connIntegrityChecked)(void); // return 1 iff connection type has built-in integrity checks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int (*connIntegrityChecked)(void); // return 1 iff connection type has built-in integrity checks | |
int (*connIntegrityChecked)(void); // return 1 if connection type has built-in integrity checks |
src/server.h
Outdated
@@ -1988,6 +1993,7 @@ struct valkeyServer { | |||
char *rdb_filename; /* Name of RDB file */ | |||
int rdb_compression; /* Use compression in RDB? */ | |||
int rdb_checksum; /* Use RDB checksum? */ | |||
int bypass_crc; /* Skip RDB checksum? Applicable only for TLS enabled diskless full sync */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why do we need to keep this server flag?
src/replication.c
Outdated
@@ -1972,6 +1974,11 @@ static int useDisklessLoad(void) { | |||
return enabled; | |||
} | |||
|
|||
/* Returns 1 if the replica can skip CRC calculations during full sync */ | |||
int replicationBypassCRC(connection *conn, int is_replica_diskless_load, int is_primary_diskless_sync) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we change it to be more explicit name? eg replicationSupportBypassCRC
src/replication.c
Outdated
@@ -2251,6 +2259,7 @@ void readSyncBulkPayload(connection *conn) { | |||
|
|||
serverLog(LL_NOTICE, "PRIMARY <-> REPLICA sync: Loading DB in memory"); | |||
startLoading(server.repl_transfer_size, RDBFLAGS_REPLICATION, asyncLoading); | |||
if (replicationBypassCRC(conn, use_diskless_load, usemark)) rdb.flags |= RIO_FLAG_BYPASS_CRC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets document some explaining why we think it is O.K to take the decision based on the usemark and connection support.
Signed-off-by: ranshid <[email protected]>
Signed-off-by: ranshid <[email protected]>
Signed-off-by: ranshid <[email protected]>
Signed-off-by: ranshid <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@madolson the code seems fine now. would you like to take a quick look as well?
Signed-off-by: Ran Shidlansik <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks okay, mostly some nits to improve clarity.
@@ -1186,6 +1190,9 @@ static ConnectionType CT_TLS = { | |||
|
|||
/* TLS specified methods */ | |||
.get_peer_cert = connTLSGetPeerCert, | |||
|
|||
/* Miselenious */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* Miselenious */ | |
/* Miscellaneous */ |
@@ -207,6 +207,9 @@ static ConnectionType CT_Unix = { | |||
.process_pending_data = NULL, | |||
.postpone_update_state = NULL, | |||
.update_state = NULL, | |||
|
|||
/* Miselenious */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* Miselenious */ | |
/* Miscellaneous */ |
@@ -5879,6 +5880,7 @@ sds genValkeyInfoString(dict *section_dict, int all_sections, int everything) { | |||
"instantaneous_input_repl_kbps:%.2f\r\n", (float)getInstantaneousMetric(STATS_METRIC_NET_INPUT_REPLICATION) / 1024, | |||
"instantaneous_output_repl_kbps:%.2f\r\n", (float)getInstantaneousMetric(STATS_METRIC_NET_OUTPUT_REPLICATION) / 1024, | |||
"rejected_connections:%lld\r\n", server.stat_rejected_conn, | |||
"total_sync_bypass_crc:%ld\r\n", server.stat_total_sync_bypass_crc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"total_sync_bypass_crc:%ld\r\n", server.stat_total_sync_bypass_crc, | |
"total_sync_bypass_crc:%ld\r\n", server.stat_total_sync_bypass_crc, |
What is the use of this metric? What are end users supposed to do with this information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This metric is merely for testing purposes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally prefer using log lines for testing purposes, since these info fields are part of our public facing contract and we can't change them.
@@ -447,6 +447,9 @@ static ConnectionType CT_Socket = { | |||
.process_pending_data = NULL, | |||
.postpone_update_state = NULL, | |||
.update_state = NULL, | |||
|
|||
/* Miselenious */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* Miselenious */ | |
/* Miscellaneous */ |
@@ -3511,11 +3527,19 @@ void syncWithPrimary(connection *conn) { | |||
* | |||
* EOF: supports EOF-style RDB transfer for diskless replication. | |||
* PSYNC2: supports PSYNC v2, so understands +CONTINUE <new repl ID>. | |||
* BYPASS-CRC: supports skipping CRC calculations during full sync. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BYPASS isn't the right word, since it implies we are doing something else instead. I think SKIP-CRC
is what you want.
set replica_bypassing_crc_count [string match {*total_sync_bypass_crc:1*} [$replica info stats]] | ||
set stats [regexp -inline {total_sync_bypass_crc:(\d+)} [$primary info stats]] | ||
set primary_bypass_crc_count [lindex $stats 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set replica_bypassing_crc_count [string match {*total_sync_bypass_crc:1*} [$replica info stats]] | |
set stats [regexp -inline {total_sync_bypass_crc:(\d+)} [$primary info stats]] | |
set primary_bypass_crc_count [lindex $stats 1] | |
set replica_bypassing_crc_count [s 0 total_sync_bypass_crc] | |
set primary_bypass_crc_count [s 1 total_sync_bypass_crc] |
@@ -0,0 +1,37 @@ | |||
start_server {tags {"repl tls"} overrides {save {}}} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only need to run these tests with TLS enabled, otherwise this condition is always false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Shouldn't we have tests to check that without TLS we do not skip CRC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just have one test for that though? do we really need the 6 that we have?
/* We can bypass CRC checks when data is transmitted through a verified stream. | ||
* The usemark flag indicates that the primary is streaming the data directly without | ||
* writing it to storage. | ||
* Similarly, the use_diskless_load flag indicates that the | ||
* replica will load the payload directly into memory without first writing it to disk. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is documenting the internal behavior of the function, which means if the function changes this comment is unlikely to get updated. I would move the comment to replicationSupportBypassCRC
.
@@ -3614,6 +3624,12 @@ int rdbSaveToReplicasSockets(int req, rdbSaveInfo *rsi) { | |||
} | |||
serverSetCpuAffinity(server.bgsave_cpulist); | |||
|
|||
if (bypass_crc) { | |||
serverLog(LL_NOTICE, "CRC checksum is disabled for this RDB transfer"); | |||
// mark rdb object to skip CRC checksum calculations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// mark rdb object to skip CRC checksum calculations |
This comment is just describing the next line, which seems very self explanatory to me.
@@ -2494,6 +2509,7 @@ char *sendCommand(connection *conn, ...) { | |||
while (1) { | |||
arg = va_arg(ap, char *); | |||
if (arg == NULL) break; | |||
if (strcmp(arg, "") == 0) continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things things:
- Empty strings can be valid arguments, this might break some obscure functionality like if someone set a password like an empty string. (I don't think that is allowed today, but would like to be more resistant to weird edge cases. You can introduce an arbitrary static string that points to nothing to have the same behavior.
- If you continue with this path, you should use strncmp() where possible. You can actually just do
if (arg[0] == '\0')
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO using some const indicator is not the correct way. Can we just use SendCommandArgv?
Implemented a mechanism to eliminate CRC64 checksumming during full sync when not writing to disk (using a connection that has data integrity checks such as TLS), as it adds overhead with minimal benefit.
Nodes can skip CRC calculations when these conditions are met:
Closes #1129