Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(crypto): crc32 algorithm fix #2819

Merged
merged 4 commits into from
Dec 7, 2023
Merged

Conversation

dariaterekhova-actionengine
Copy link
Collaborator

During zip functionality development I found out that our crc32 algorithm returns wrong values. So I made a fix and checked on a several files that hashes now are exactly the same as in bash zip utility.

@ibgreen
Copy link
Collaborator

ibgreen commented Dec 4, 2023

Any references that you can link to in code? More test cases?

@@ -34,13 +41,17 @@ const TEST_CASES = [
title: 'binary data (repeated)',
data: repeatedData,
digests: {
sha256: 'SnGMX2AgkPh21d2sxow8phQa8lh8rjf2Vc7GFCIwj2g='
// 'bSCTuOJei5XsmAnqtmm2Aw/2EvUHldNdAxYb3mjSK9s=',
sha256: 'bSCTuOJei5XsmAnqtmm2Aw/2EvUHldNdAxYb3mjSK9s='
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we changing sha256 test results in a PR that makes changes to crc32?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked it with shasum bash utility and the one that had been commented is the correct one for this data. I found out that some of our crypto classes gives the correct result only once after the initialisation. But this test had been used the same class multiple times, so it has given the wrong result on second use and after. So probably somebody wanted just to make this test work...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a huge bug that we must fix first. We shouldn't work around it!

#2824

}
}
];

const HASHES = [new CRC32Hash(), new CRC32CHash(), new MD5Hash(), new SHA256Hash({modules})];
const HASHES = {
crc32: () => new CRC32Hash(),

This comment was marked as outdated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basic Idea of this change is to create a new object from class every time we want to use the algorithm, not to reuse the old one, because I found out that some classes calculate the right result only once after initialisation. And the reason why I decided not to use the name inside the class is because in this case I need to initialise all of them just in order to get the one I need. But if you think that it's the right way - no problem, I can turn it back

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, fixing in #2824

@@ -24,6 +25,7 @@ export default class CRC32 {
}

finalize() {
// performing final division and making crc unsigned.
this.crc = (this.crc ^ -1) >>> 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I don't like with this is that use of the >>> operator is very rare, I am only vaguely familiar with what it does and I don't understand why this is a fix. because of this, it will be hard to maintain.

My interpretation (possibly incorrect) is that shifting 0 bits is essentially a no-op and just ensures that the input is a positive integer. But the old code did calculate an absolute value with Math.abs(), so that was not enough, so what else does this operator do?

Could we find out and be more explicit and say Math.abs(Math.???(...))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference there that 1 in bite representation is 00000000000000000000000000000001 and -1 is 11111111111111111111111111111111. So Math.abs() completely changes the bits there, but what we need is just make JS count first 1 bit as a part of a number and not the sign marker. We don't need to change anything else in this bit representation. And that's exactly what >>> does.

We can also implement it like this:

const toUnsignedInt32 = (n: number): number => {
  if (n >= 0) {
    return n;
  }
  return 0xFFFFFFFF - (n * -1) + 1;
}

We perform this operation once per hash generation, so it shouldn't affect performance

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, that makes sense, negative numbers are in two's complement. If you could add this comment above the >>>.

The difference there that 1 in bit representation is 00000000000000000000000000000001 and -1 is 11111111111111111111111111111111. So Math.abs() completely changes the bits there, but what we need is just make JS count first 1 bit as a part of a number and not the sign marker. We don't need to change anything else in this bit representation. And that's exactly what >>> does.

Copy link
Collaborator

@ibgreen ibgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dariaterekhova-actionengine I landed the hash reuse fixes. If you rebase on master and remove any code that was working around that issue, this should be good to land.

Copy link
Collaborator

@ibgreen ibgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for good teamwork!

@dariaterekhova-actionengine dariaterekhova-actionengine merged commit 08eddf1 into master Dec 7, 2023
3 checks passed
@dariaterekhova-actionengine dariaterekhova-actionengine deleted the dt/crc32_fix branch December 7, 2023 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants