Load-time block format validation #59
Ready for review now. This includes 2 commits from #57, the …

The HAMT now has (almost) zero chill about malformed blocks. If a block smells bad, it's rejected, no pretending. One check that is not included (for now), because it's complicated and costly, is verifying that entries sit in the right positions in the graph given their keys. If that's important, it could be added later and called on demand. For now, validation is block-local only.
ab2f022 to 165d09c
if !((isLink && !isBucket) || (!isLink && isBucket)) {
	return nil, ErrMalformedHamt
}
if isLink && ch.Link.Type() != cid.DagCBOR { // not dag-cbor
Possibly the most controversial item in here that I want to highlight. We're validating that all links to child HAMT nodes (not links it might contain as values, only HAMT nodes) are of codec DAG-CBOR. Just another check to get some assurance we're not being fed bogus data in the structure, it doesn't go very far but we're asserting that this HAMT has homogeneity in its block generation, for now. This could be changed in future if we allowed some variability in codec or wanted to have a migration path.
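As a rough illustration of the two checks in the diff above (a pointer must be either a link or a bucket, never both or neither, and a link must be DAG-CBOR), here is a minimal stdlib-only sketch. The `pointer` type, its fields, and `validatePointer` are hypothetical stand-ins for illustration, not the repository's actual types; the real code inspects the link's codec via `ch.Link.Type()`. The constant `0x71` is the multicodec code for DAG-CBOR.

```go
package main

import "errors"

// ErrMalformedHamt mirrors the error name used in the diff.
var ErrMalformedHamt = errors.New("malformed HAMT block")

// dagCBOR is the multicodec code for DAG-CBOR.
const dagCBOR uint64 = 0x71

// pointer is a hypothetical, simplified stand-in for a HAMT child entry.
// A well-formed pointer is either a link to a child node or a bucket of keys.
type pointer struct {
	HasLink   bool     // assumption: true when the entry links to a child node
	LinkCodec uint64   // assumption: codec of that link, meaningful only if HasLink
	Keys      [][]byte // bucket contents; nil when the entry is a link
}

// validatePointer rejects entries that are both link and bucket, or neither,
// and rejects links whose codec is not DAG-CBOR.
func validatePointer(p pointer) error {
	isLink := p.HasLink
	isBucket := p.Keys != nil
	// Equivalent to !((isLink && !isBucket) || (!isLink && isBucket)).
	if isLink == isBucket {
		return ErrMalformedHamt
	}
	if isLink && p.LinkCodec != dagCBOR {
		return ErrMalformedHamt
	}
	return nil
}
```

Making the expected codec a parameter, as suggested in the replies, would amount to replacing the `dagCBOR` constant with an argument.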
We could... make this another parameter? I'd agree, anyway, that having the interior tree all be one codec is probably pretty reasonable.
The repo name doesn't have "cbor" in it; that's my only reason for pause here in making it hardcoded to that value.
(Otoh, there's... several other details and import statements that seem to imply this system is already cbor-only, and the UnmarshalCBOR methods, and... hrm. Maybe the repo name should just also indicate cbor-specificness, if it's already quietly the case.)
This seems like a reasonable check for now, but I agree we should, ideally, make it more generic.
I mean, ideally we'd save the codec/hash-function, etc. when loading a node, then restore it when encoding. Then, we'd just check "do all linked children match their parents?". But at the moment, you're right. Everything is very DagCBOR specific.
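To make the "do all linked children match their parents?" idea concrete, here is a small sketch. It assumes the loader records a prefix (codec plus hash function and digest length) for each node's link; `linkPrefix` and `checkChildrenMatchParent` are hypothetical names for illustration and are not part of this PR.

```go
package main

import "fmt"

// linkPrefix is a hypothetical record of how a block was addressed:
// which codec encoded it and which hash function produced its digest.
type linkPrefix struct {
	Codec    uint64 // multicodec code, e.g. 0x71 for DAG-CBOR
	MhType   uint64 // multihash code, e.g. 0x12 for sha2-256
	MhLength int    // digest length in bytes
}

// checkChildrenMatchParent returns an error if any child link was produced
// with a different codec or hash function than the parent node.
func checkChildrenMatchParent(parent linkPrefix, children []linkPrefix) error {
	for i, c := range children {
		if c != parent {
			return fmt.Errorf("child %d has prefix %+v, parent has %+v", i, c, parent)
		}
	}
	return nil
}
```

With something like this, the hardcoded DAG-CBOR comparison would turn into a comparison against whatever prefix the parent was loaded with.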
8b2afd8 to 04503fd
Fixed a CI bug with Go 1.11 and squashed this into just the two commits at the HEAD of this branch. Review should focus on those two, 04503fd in particular (the other just adds a …
04503fd to 570d059
	return nil, ErrMalformedHamt
}
for i := 1; i < len(ch.KVs); i++ {
	if bytes.Compare(ch.KVs[i-1].Key, ch.KVs[i].Key) >= 0 {
Followup: we should probably be checking the hash prefix.
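A sketch of what the ordering check does, plus one possible shape for the hash-prefix followup. The ordering loop follows the diff (keys must be strictly ascending, so duplicates are rejected too). The hash-prefix part is an assumption about how such a check might look: it presumes the caller knows the bucket's depth and slot index and that indexes are taken from the hash most-significant bit first, none of which is confirmed by this thread. `hashIndex`, `checkBucket`, and the `hashFn` parameter are hypothetical.

```go
package main

import (
	"bytes"
	"errors"
)

// ErrMalformedHamt mirrors the error name used in the diff.
var ErrMalformedHamt = errors.New("malformed HAMT block")

// keysStrictlySorted reports whether keys are in strictly ascending byte
// order, which also rules out duplicate keys.
func keysStrictlySorted(keys [][]byte) bool {
	for i := 1; i < len(keys); i++ {
		if bytes.Compare(keys[i-1], keys[i]) >= 0 {
			return false
		}
	}
	return true
}

// hashIndex extracts the bitWidth-bit index used at the given depth,
// reading the hash most-significant bit first (an assumed convention).
// It returns false if the hash has too few bits for that depth.
func hashIndex(hash []byte, depth, bitWidth int) (int, bool) {
	start := depth * bitWidth
	if depth < 0 || bitWidth <= 0 || start+bitWidth > len(hash)*8 {
		return 0, false
	}
	idx := 0
	for i := start; i < start+bitWidth; i++ {
		bit := (hash[i/8] >> uint(7-i%8)) & 1
		idx = idx<<1 | int(bit)
	}
	return idx, true
}

// checkBucket verifies ordering and that every key hashes to the slot
// (wantIndex) the bucket occupies at this depth.
func checkBucket(keys [][]byte, hashFn func([]byte) []byte, depth, bitWidth, wantIndex int) error {
	if !keysStrictlySorted(keys) {
		return ErrMalformedHamt
	}
	for _, k := range keys {
		idx, ok := hashIndex(hashFn(k), depth, bitWidth)
		if !ok || idx != wantIndex {
			return ErrMalformedHamt
		}
	}
	return nil
}
```

Note that the position check needs context from the parent (depth and slot), so it goes beyond a purely block-local validation.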
Includes #57, the first two commits. The changes so far are only in LoadNode and in the tests.

Adds some simple helper functions to the tests to help manually build CBOR blobs that will load and trigger various cases. So far this only goes as far as ensuring that the bitmap and the number of elements aren't out of alignment. That can get stricter with #54 if we store precisely the size of bitmap that we need for the array of elements.
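The bitmap/element alignment check can be sketched as a population count compared against the number of entries. This assumes the bitmap is held as a `*big.Int` and that one set bit corresponds to one element; `popCount`, `checkBitmapMatchesElements`, and `bitfieldFromUint64` are hypothetical helpers for illustration, not the functions added in this PR.

```go
package main

import (
	"errors"
	"math/big"
	"math/bits"
)

// ErrMalformedHamt mirrors the error name used in the diff.
var ErrMalformedHamt = errors.New("malformed HAMT block")

// popCount returns the number of set bits in the bitfield's magnitude.
func popCount(bitfield *big.Int) int {
	n := 0
	for _, w := range bitfield.Bits() {
		n += bits.OnesCount(uint(w))
	}
	return n
}

// checkBitmapMatchesElements rejects a node whose bitmap does not have
// exactly one set bit per stored element.
func checkBitmapMatchesElements(bitfield *big.Int, numElements int) error {
	if bitfield.Sign() < 0 || popCount(bitfield) != numElements {
		return ErrMalformedHamt
	}
	return nil
}

// bitfieldFromUint64 is a small convenience for building test bitmaps.
func bitfieldFromUint64(v uint64) *big.Int {
	return new(big.Int).SetUint64(v)
}
```

For example, a bitmap of `0x0B` (bits 0, 1, and 3 set) would be accepted only alongside exactly three elements.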
Other things I'm working on in here are listed at the bottom of hamt_test.go. The aim is that this thing should refuse to load from blocks that smell funny (i.e. aren't exactly the right form). There should only be one way of representing this data and there should be no avenues for variation. That's an ideal and not strictly possible in the extreme sense for various reasons but we can get close.