Allow checkpointing and restoration of hash state #94

nickbabcock · 2025-01-09T18:48:12Z

When hashing a stream of data over a long period of time, it is useful to be able to checkpoint the hash state, so that the state can be recovered later without rehashing the data that was already processed.

It also allows great flexibility in how one hashes data.

I'm not tied to this exact API. Hard coding the number of bytes is potentially brittle, but it does remove any chance of a fallible write or read during serialization / deserialization. I don't see the format changing for some time.
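To illustrate the shape of the idea, here is a minimal sketch using a toy FNV-1a hasher rather than the hasher in this PR. The names `ToyHasher`, `checkpoint`, and `from_checkpoint` are hypothetical and chosen only for this example. The point is that a fixed-size array can be returned and accepted by value, so neither direction needs a `Result`.

```rust
/// Toy FNV-1a hasher, used only for illustration.
/// It is NOT the hasher from this PR and its API is hypothetical.
pub struct ToyHasher {
    state: u64,
}

impl ToyHasher {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;

    pub fn new() -> Self {
        ToyHasher { state: Self::OFFSET_BASIS }
    }

    /// Feed more bytes into the running hash.
    pub fn append(&mut self, data: &[u8]) {
        for &byte in data {
            self.state ^= u64::from(byte);
            self.state = self.state.wrapping_mul(Self::PRIME);
        }
    }

    pub fn finalize(&self) -> u64 {
        self.state
    }

    /// Serialize the state into a fixed-size array.
    /// No writer is involved, so there is no error path.
    pub fn checkpoint(&self) -> [u8; 8] {
        self.state.to_le_bytes()
    }

    /// Restore a hasher from a checkpoint.
    /// The length is enforced by the type, so this cannot fail on size.
    pub fn from_checkpoint(bytes: [u8; 8]) -> Self {
        ToyHasher { state: u64::from_le_bytes(bytes) }
    }
}
```

Hashing a prefix, taking a checkpoint, restoring from it, and appending the remainder should give the same result as hashing the whole input in one go.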

Closes #88


sticnarf · 2025-01-13T09:50:15Z

Sorry I didn't share my thoughts on the API design earlier; I wasn't confident about what it should look like either.

I really appreciate you implementing the checkpoint feature.

Hard coding the number of bytes doesn't seem like a problem to me, considering how stable the format is.

In addition to a fixed-length array, I think Vec<u8> is useful as well. When the checkpoint is serialized or sent over the network, we usually already have the length of the checkpoint bytes (e.g. in a protobuf bytes field). In these cases, we can calculate the real length of the buffer from it, making the length stored in the checkpoint bytes redundant. This means the Vec<u8> only needs 128..=160 bytes.

Both representations may be useful. Maybe the checkpoint can be an opaque struct whose implementation is still a 168-byte array, with functions to convert it to and from [u8; 168] or Vec<u8>.
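A rough sketch of what I mean, under assumptions: the type name `Checkpoint`, its method names, and the byte layout (128 bytes of state, then a 32-byte buffer, then an 8-byte little-endian buffer length) are all hypothetical and only chosen to match the sizes mentioned above. The real layout may differ.

```rust
// Assumed layout, for illustration only:
// [0..128) hash state, [128..160) buffer, [160..168) buffer length (u64, LE).
const STATE_LEN: usize = 128;
const BUFFER_LEN: usize = 32;
const CHECKPOINT_LEN: usize = 168;

/// Opaque checkpoint that is still a 168-byte array internally.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Checkpoint([u8; CHECKPOINT_LEN]);

impl Checkpoint {
    pub fn from_array(bytes: [u8; CHECKPOINT_LEN]) -> Self {
        Checkpoint(bytes)
    }

    pub fn to_array(&self) -> [u8; CHECKPOINT_LEN] {
        self.0
    }

    /// Number of buffered bytes recorded in the trailing length field,
    /// clamped so a corrupt value cannot index past the buffer.
    fn buffered(&self) -> usize {
        let mut len = [0u8; 8];
        len.copy_from_slice(&self.0[STATE_LEN + BUFFER_LEN..]);
        (u64::from_le_bytes(len) as usize).min(BUFFER_LEN)
    }

    /// Compact form: state plus only the used part of the buffer,
    /// so the result is 128..=160 bytes long.
    pub fn to_vec(&self) -> Vec<u8> {
        self.0[..STATE_LEN + self.buffered()].to_vec()
    }

    /// Rebuild from the compact form; the buffer length is derived
    /// from the slice length instead of being stored.
    pub fn from_slice(bytes: &[u8]) -> Option<Self> {
        if bytes.len() < STATE_LEN || bytes.len() > STATE_LEN + BUFFER_LEN {
            return None;
        }
        let mut out = [0u8; CHECKPOINT_LEN];
        out[..bytes.len()].copy_from_slice(bytes);
        let buffered = (bytes.len() - STATE_LEN) as u64;
        out[STATE_LEN + BUFFER_LEN..].copy_from_slice(&buffered.to_le_bytes());
        Some(Checkpoint(out))
    }
}
```

The array conversions stay infallible, while the Vec<u8> path returns an Option because an arbitrary slice can have the wrong length. Note that unused buffer bytes are zeroed on the way back in, so the round trip is only byte-identical when they were zero to begin with.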

nickbabcock force-pushed the checkpoint branch 6 times, most recently from 165a679 to a2912e4 · January 10, 2025 13:02

nickbabcock force-pushed the checkpoint branch from a2912e4 to 50fe210 · January 10, 2025 13:07

nickbabcock merged commit b7e0505 into master Jan 10, 2025
15 checks passed
