blake2-rfc is slightly faster than the portable implementation #7
Comments
I thought it might be because |
When I try it on ARM I get the opposite result. Should look at 32-bit ARM at some point.
@oconnor663 I got the same performance (vs
# code copied from https://github.com/shadowsocks/crypto2/tree/dev/src/hash/blake2b
git clone https://github.com/LuoZijun/test_blake2b/
cargo bench
@oconnor663 I tried on an ARM Neoverse N1 and compiled it with: RUSTFLAGS="-C target-cpu=native -C codegen-units=1" cargo build --release
https://github.com/cesarb/blake2-rfc
I measure it to be about 2% faster than portable.rs. Not yet sure why, though it might be using some SIMD under the covers, or maybe getting optimized to SSE2 by the compiler.
However, the relationship is reversed if I set RUSTFLAGS="-C target-cpu=native -C target-feature=-avx2". No idea why. Again, still a small difference. Notably, both implementations tank their performance if I allow them to use AVX2.
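Since the results above depend on which SIMD features the compiler is allowed to assume, it may help to confirm what a given RUSTFLAGS setting actually enabled before comparing numbers. The following is a minimal sketch, not part of either crate: the function name compile_time_features is hypothetical, and it only reports features assumed at build time (via cfg!), not what the CPU supports at runtime.

```rust
/// Hypothetical helper: lists a few SIMD target features and whether the
/// compiler was allowed to assume them for this build (controlled by
/// `-C target-cpu=...` and `-C target-feature=...` in RUSTFLAGS).
fn compile_time_features() -> Vec<(&'static str, bool)> {
    vec![
        ("sse2", cfg!(target_feature = "sse2")),
        ("sse4.1", cfg!(target_feature = "sse4.1")),
        ("avx2", cfg!(target_feature = "avx2")),
        ("neon", cfg!(target_feature = "neon")),
    ]
}

fn main() {
    // Print one line per feature so two builds can be compared side by side.
    for (name, enabled) in compile_time_features() {
        let state = if enabled { "enabled" } else { "disabled" };
        println!("{}: {}", name, state);
    }
}
```

Building this once with RUSTFLAGS="-C target-cpu=native" and once with RUSTFLAGS="-C target-cpu=native -C target-feature=-avx2" should show whether the avx2 line changes between the two configurations being benchmarked.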