You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Following #1502 which introduced hashtable instead of dict in hash objects we started embedding the field sds in the hashTableEntry.
The problem is that the field sds might arrive from different sources:
command arguments - in the usual case the sds will be provided from the parsed command arguments.. In such cases the sds is from a stringObject which means it will always have a minimal header size of 3 bytes (sds8).
listpack conversion - when we convert from listpack to hashtable the listpack is scanned and the field sds is being created from the listpack string via sdsnewsize which will use the minimal header size (ie small strings will use sds5 which is 1 byte).
Modules - for example VM_HashSet will create a RAW string object which will have the sds allocated with a minimal size header (ie sds5 for small strings which is 1 byte long).
When we create the hashTable entry in hashTypeCreateEntry we will embed the field sds according to the provided sds representation, so in case the field originated at a parsed command argument it will use extra 2 bytes for the header.
While there would probably NOT be any degradation in overall memory utilization (since the new hashtable is more memory efficient) it might cause strange results following listpack conversions.
for example:
say hash1 is created when the hash_max_listpack_entries config is 0 and added with 10 small fields
and hash2 is created when the hash_max_listpack_entries config is 0 9 and added with 10 small fields
after all 10 elements were added both tables are expected to show the same memory consumption, but hash1 would show as using extra 18 bytes of memory.
NOTE - I do think the issue is minor and would probably be addressed during the work on #1551 and/or #640 So I mainly opened it in order to have a better tracking of the issue.
The text was updated successfully, but these errors were encountered:
I started looking at this. My first idea was to convert from sds8 to sds5 in sdscopytobuffer (used when we embed it in another structure), but it may be better to allow sds5 already in EMBSTR-encoded serverObject. Then sdscopytobuffer can just copy the representation as-is.
The only drawback is that sds5 doesn't know its own allocation size. With embedded sds5, I saw some weird output from debug sdslen command that we need to fix. Even for EMBSTR encoded serverObject, we track the usable size of the allocation using the sds header of the embedded sds8 string. With sds5, we'll need to rely on zmalloc_usable_size for the sds5 case, which is probably fine too.
Following #1502 which introduced hashtable instead of dict in hash objects we started embedding the field sds in the hashTableEntry.
The problem is that the field sds might arrive from different sources:
When we create the hashTable entry in hashTypeCreateEntry we will embed the field sds according to the provided sds representation, so in case the field originated at a parsed command argument it will use extra 2 bytes for the header.
While there would probably NOT be any degradation in overall memory utilization (since the new hashtable is more memory efficient) it might cause strange results following listpack conversions.
for example:
say hash1 is created when the hash_max_listpack_entries config is 0 and added with 10 small fields
and hash2 is created when the hash_max_listpack_entries config is 0 9 and added with 10 small fields
after all 10 elements were added both tables are expected to show the same memory consumption, but hash1 would show as using extra 18 bytes of memory.
NOTE - I do think the issue is minor and would probably be addressed during the work on #1551 and/or #640 So I mainly opened it in order to have a better tracking of the issue.
The text was updated successfully, but these errors were encountered: