feat: non simd sha256 for incompatible systems #427

Open
wants to merge 43 commits into
base: master

matthewkeil
Member

@matthewkeil matthewkeil commented Dec 18, 2024

Motivation

An issue was reported about systems that do not support the SIMD feature set. This PR makes the library compatible with those systems.

See ChainSafe/lodestar#7177 for more information.

Description of Changes

  • as-sha256
    • make common.ts that holds non-simd functionality
    • make simd and non-simd entrance files
    • call common from both entrance files and implement simd only in simd version
    • modify codegen and package scripts to build both versions of the code
    • modify TS wrapper code to load correct version of bindings depending on simd presence
    • modify TS wrapper code to call correct methods depending on presence of simd
    • fix tests to exercise both sets of bindings fully

@twoeths
Contributor

twoeths commented Dec 20, 2024

I don't see that we're using simd sha256 in lodestar; we haven't enabled batch hash yet.

Update: never mind, we haven't used it, but the wasm code was loaded anyway.

@matthewkeil matthewkeil force-pushed the mkeil/non-simd-sha256 branch from f55d080 to bdf755c Compare December 20, 2024 17:45
@matthewkeil matthewkeil force-pushed the mkeil/non-simd-sha256 branch from ddd5c17 to 51b1bdd Compare January 5, 2025 18:32
@matthewkeil matthewkeil marked this pull request as ready for review January 5, 2025 18:55
@matthewkeil matthewkeil requested a review from a team as a code owner January 5, 2025 18:55
github-actions bot commented Jan 6, 2025

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 6746e24 Previous: 5868113 Ratio
digestTwoHashObjects 50023 times 48.362 ms/op 48.655 ms/op 0.99
digest2Bytes32 50023 times 54.931 ms/op
digest 50023 times 54.658 ms/op 60.332 ms/op 0.91
input length 32 1.1910 us/op 1.2220 us/op 0.97
input length 64 1.3630 us/op 1.3650 us/op 1.00
input length 128 2.2870 us/op 2.2710 us/op 1.01
input length 256 3.3900 us/op 3.3950 us/op 1.00
input length 512 5.6540 us/op 5.6890 us/op 0.99
input length 1024 10.907 us/op 11.242 us/op 0.97
digest 1000000 times 876.56 ms/op 952.51 ms/op 0.92
hashObjectToByteArray 50023 times 1.2297 ms/op 1.2640 ms/op 0.97
byteArrayToHashObject 50023 times 1.8087 ms/op 1.8137 ms/op 1.00
digest64 200092 times 218.20 ms/op 226.60 ms/op 0.96
hash 200092 times using batchHash4UintArray64s 243.89 ms/op 255.76 ms/op 0.95
digest64HashObjects 200092 times 196.03 ms/op 213.38 ms/op 0.92
hash 200092 times using batchHash4HashObjectInputs 219.26 ms/op 220.93 ms/op 0.99
getGindicesAtDepth 4.1860 us/op 4.3370 us/op 0.97
iterateAtDepth 7.6060 us/op 7.7200 us/op 0.99
getGindexBits 460.00 ns/op 483.00 ns/op 0.95
gindexIterator 1.0230 us/op 1.0490 us/op 0.98
HashComputationLevel.push then loop 27.291 ms/op 30.395 ms/op 0.90
HashComputation[] push then loop 37.853 ms/op 52.092 ms/op 0.73
hash 2 Uint8Array 500000 times - hashtree 218.76 ms/op 229.62 ms/op 0.95
hashTwoObjects 500000 times - hashtree 215.72 ms/op 218.33 ms/op 0.99
executeHashComputations - hashtree 9.5184 ms/op 12.307 ms/op 0.77
hash 2 Uint8Array 500000 times - as-sha256 561.95 ms/op 553.38 ms/op 1.02
hashTwoObjects 500000 times - as-sha256 508.83 ms/op 505.90 ms/op 1.01
executeHashComputations - as-sha256 46.883 ms/op 51.528 ms/op 0.91
hash 2 Uint8Array 500000 times - noble 1.2198 s/op 1.2771 s/op 0.96
hashTwoObjects 500000 times - noble 1.6530 s/op 1.7334 s/op 0.95
executeHashComputations - noble 36.089 ms/op 40.184 ms/op 0.90
getHashComputations 2.9696 ms/op 3.4524 ms/op 0.86
executeHashComputations 11.457 ms/op 12.990 ms/op 0.88
get root 15.681 ms/op 16.069 ms/op 0.98
getNodeH() x7812.5 avg hindex 12.021 us/op 12.049 us/op 1.00
getNodeH() x7812.5 index 0 7.4620 us/op 7.4610 us/op 1.00
getNodeH() x7812.5 index 7 7.4760 us/op 7.4660 us/op 1.00
getNodeH() x7812.5 index 7 with key array 6.3030 us/op 6.2900 us/op 1.00
new LeafNode() x7812.5 293.45 us/op 303.09 us/op 0.97
getHashComputations 250000 nodes 17.146 ms/op 19.675 ms/op 0.87
batchHash 250000 nodes 99.605 ms/op 88.436 ms/op 1.13
get root 250000 nodes 127.50 ms/op 119.31 ms/op 1.07
getHashComputations 500000 nodes 35.802 ms/op 36.148 ms/op 0.99
batchHash 500000 nodes 173.18 ms/op 174.35 ms/op 0.99
get root 500000 nodes 233.03 ms/op 237.91 ms/op 0.98
getHashComputations 1000000 nodes 56.201 ms/op 62.621 ms/op 0.90
batchHash 1000000 nodes 327.44 ms/op 364.75 ms/op 0.90
get root 1000000 nodes 486.53 ms/op 477.87 ms/op 1.02
multiproof - depth 15, 1 requested leaves 8.1490 us/op 8.3740 us/op 0.97
tree offset multiproof - depth 15, 1 requested leaves 17.108 us/op 19.018 us/op 0.90
compact multiproof - depth 15, 1 requested leaves 2.9940 us/op 2.9700 us/op 1.01
multiproof - depth 15, 2 requested leaves 11.617 us/op 11.321 us/op 1.03
tree offset multiproof - depth 15, 2 requested leaves 20.693 us/op 21.368 us/op 0.97
compact multiproof - depth 15, 2 requested leaves 2.9670 us/op 2.9220 us/op 1.02
multiproof - depth 15, 3 requested leaves 16.405 us/op 16.138 us/op 1.02
tree offset multiproof - depth 15, 3 requested leaves 26.610 us/op 26.698 us/op 1.00
compact multiproof - depth 15, 3 requested leaves 3.5320 us/op 3.4630 us/op 1.02
multiproof - depth 15, 4 requested leaves 21.350 us/op 21.364 us/op 1.00
tree offset multiproof - depth 15, 4 requested leaves 32.962 us/op 32.750 us/op 1.01
compact multiproof - depth 15, 4 requested leaves 4.2010 us/op 4.2310 us/op 0.99
packedRootsBytesToLeafNodes bytes 4000 offset 0 5.5090 us/op 5.3920 us/op 1.02
packedRootsBytesToLeafNodes bytes 4000 offset 1 5.4610 us/op 5.4340 us/op 1.00
packedRootsBytesToLeafNodes bytes 4000 offset 2 5.4340 us/op 5.3840 us/op 1.01
packedRootsBytesToLeafNodes bytes 4000 offset 3 5.4280 us/op 5.3750 us/op 1.01
subtreeFillToContents depth 40 count 250000 48.008 ms/op 43.636 ms/op 1.10
setRoot - gindexBitstring 21.596 ms/op 20.541 ms/op 1.05
setRoot - gindex 22.555 ms/op 21.757 ms/op 1.04
getRoot - gindexBitstring 2.6941 ms/op 2.5557 ms/op 1.05
getRoot - gindex 3.3073 ms/op 3.3031 ms/op 1.00
getHashObject then setHashObject 22.960 ms/op 21.845 ms/op 1.05
setNodeWithFn 20.226 ms/op 19.645 ms/op 1.03
getNodeAtDepth depth 0 x100000 280.14 us/op 280.11 us/op 1.00
setNodeAtDepth depth 0 x100000 2.6964 ms/op 2.5693 ms/op 1.05
getNodesAtDepth depth 0 x100000 312.14 us/op 318.24 us/op 0.98
setNodesAtDepth depth 0 x100000 875.76 us/op 760.89 us/op 1.15
getNodeAtDepth depth 1 x100000 346.14 us/op 342.23 us/op 1.01
setNodeAtDepth depth 1 x100000 8.7880 ms/op 8.2221 ms/op 1.07
getNodesAtDepth depth 1 x100000 435.87 us/op 436.16 us/op 1.00
setNodesAtDepth depth 1 x100000 7.6050 ms/op 6.8413 ms/op 1.11
getNodeAtDepth depth 2 x100000 743.13 us/op 729.22 us/op 1.02
setNodeAtDepth depth 2 x100000 16.052 ms/op 16.011 ms/op 1.00
getNodesAtDepth depth 2 x100000 19.518 ms/op 17.912 ms/op 1.09
setNodesAtDepth depth 2 x100000 24.275 ms/op 23.970 ms/op 1.01
tree.getNodesAtDepth - gindexes 9.7385 ms/op 8.6710 ms/op 1.12
tree.getNodesAtDepth - push all nodes 2.4824 ms/op 2.3702 ms/op 1.05
tree.getNodesAtDepth - navigation 311.94 us/op 311.40 us/op 1.00
tree.setNodesAtDepth - indexes 794.05 us/op 696.43 us/op 1.14
set at depth 8 689.00 ns/op 773.00 ns/op 0.89
set at depth 16 1.3450 us/op 1.1290 us/op 1.19
set at depth 32 2.2600 us/op 1.9270 us/op 1.17
iterateNodesAtDepth 8 256 14.868 us/op 13.636 us/op 1.09
getNodesAtDepth 8 256 3.8910 us/op 3.7440 us/op 1.04
iterateNodesAtDepth 16 65536 5.1094 ms/op 4.4599 ms/op 1.15
getNodesAtDepth 16 65536 1.5796 ms/op 2.0359 ms/op 0.78
iterateNodesAtDepth 32 250000 15.885 ms/op 16.449 ms/op 0.97
getNodesAtDepth 32 250000 4.4126 ms/op 4.8085 ms/op 0.92
iterateNodesAtDepth 40 250000 15.756 ms/op 16.330 ms/op 0.96
getNodesAtDepth 40 250000 4.4853 ms/op 4.8508 ms/op 0.92
250000 validators root getter 118.41 ms/op 120.65 ms/op 0.98
250000 validators batchHash() 89.288 ms/op 101.43 ms/op 0.88
250000 validators hashComputations 13.638 ms/op 19.283 ms/op 0.71
bitlist bytes to struct (120,90) 972.00 ns/op 823.00 ns/op 1.18
bitlist bytes to tree (120,90) 3.6140 us/op 2.7050 us/op 1.34
bitlist bytes to struct (2048,2048) 1.0570 us/op 1.0920 us/op 0.97
bitlist bytes to tree (2048,2048) 4.4120 us/op 4.1810 us/op 1.06
ByteListType - deserialize 7.9557 ms/op 7.9517 ms/op 1.00
BasicListType - deserialize 17.900 ms/op 15.349 ms/op 1.17
ByteListType - serialize 7.7343 ms/op 7.8756 ms/op 0.98
BasicListType - serialize 10.882 ms/op 10.046 ms/op 1.08
BasicListType - tree_convertToStruct 30.239 ms/op 24.899 ms/op 1.21
List[uint8, 68719476736] len 300000 ViewDU.getAll() + iterate 5.1745 ms/op 4.8482 ms/op 1.07
List[uint8, 68719476736] len 300000 ViewDU.get(i) 4.2309 ms/op 4.2476 ms/op 1.00
Array.push len 300000 empty Array - number 7.6399 ms/op 6.7251 ms/op 1.14
Array.set len 300000 from new Array - number 2.2631 ms/op 1.9298 ms/op 1.17
Array.set len 300000 - number 6.9970 ms/op 6.0340 ms/op 1.16
Uint8Array.set len 300000 486.49 us/op 499.83 us/op 0.97
Uint32Array.set len 300000 557.57 us/op 587.33 us/op 0.95
Container({a: uint8, b: uint8}) getViewDU x300000 26.117 ms/op 27.840 ms/op 0.94
ContainerNodeStruct({a: uint8, b: uint8}) getViewDU x300000 10.530 ms/op 10.777 ms/op 0.98
List(Container) len 300000 ViewDU.getAllReadonly() + iterate 219.28 ms/op 210.42 ms/op 1.04
List(Container) len 300000 ViewDU.getAllReadonlyValues() + iterate 238.84 ms/op 261.95 ms/op 0.91
List(Container) len 300000 ViewDU.get(i) 6.7044 ms/op 6.7138 ms/op 1.00
List(Container) len 300000 ViewDU.getReadonly(i) 6.5720 ms/op 6.4119 ms/op 1.02
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonly() + iterate 36.186 ms/op 36.339 ms/op 1.00
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonlyValues() + iterate 6.4025 ms/op 6.2870 ms/op 1.02
List(ContainerNodeStruct) len 300000 ViewDU.get(i) 6.0891 ms/op 6.1555 ms/op 0.99
List(ContainerNodeStruct) len 300000 ViewDU.getReadonly(i) 6.0225 ms/op 5.9209 ms/op 1.02
Array.push len 300000 empty Array - object 7.4250 ms/op 7.2436 ms/op 1.03
Array.set len 300000 from new Array - object 2.0104 ms/op 2.4450 ms/op 0.82
Array.set len 300000 - object 6.2688 ms/op 7.4349 ms/op 0.84
cachePermanentRootStruct no cache 18.995 us/op 19.946 us/op 0.95
cachePermanentRootStruct with cache 221.00 ns/op 201.00 ns/op 1.10
epochParticipation len 250000 rws 7813 2.3723 ms/op 2.2082 ms/op 1.07
BeaconState ViewDU hashTreeRoot() vc=200000 517.91 ms/op 532.91 ms/op 0.97
BeaconState ViewDU recursive hash - commit step vc=200000 5.5108 ms/op 5.6823 ms/op 0.97
BeaconState ViewDU validator tree creation vc=10000 40.516 ms/op 40.415 ms/op 1.00
BeaconState ViewDU batchHashTreeRoot vc=200000 399.66 ms/op 426.62 ms/op 0.94
BeaconState ViewDU hashTreeRoot - commit step vc=200000 342.84 ms/op 342.80 ms/op 1.00
BeaconState ViewDU hashTreeRoot - hash step vc=200000 73.198 ms/op 70.748 ms/op 1.03
deserialize Attestation - tree 3.8440 us/op 3.6410 us/op 1.06
deserialize Attestation - struct 1.9720 us/op 1.9140 us/op 1.03
deserialize SignedAggregateAndProof - tree 4.9650 us/op 5.0230 us/op 0.99
deserialize SignedAggregateAndProof - struct 3.0990 us/op 3.0330 us/op 1.02
deserialize SyncCommitteeMessage - tree 1.4320 us/op 1.3690 us/op 1.05
deserialize SyncCommitteeMessage - struct 1.1460 us/op 1.1250 us/op 1.02
deserialize SignedContributionAndProof - tree 2.9670 us/op 2.9060 us/op 1.02
deserialize SignedContributionAndProof - struct 2.4180 us/op 2.3510 us/op 1.03
deserialize SignedBeaconBlock - tree 279.16 us/op 290.31 us/op 0.96
deserialize SignedBeaconBlock - struct 123.65 us/op 124.27 us/op 1.00
BeaconState vc 300000 - deserialize tree 659.16 ms/op 635.71 ms/op 1.04
BeaconState vc 300000 - serialize tree 159.61 ms/op 177.07 ms/op 0.90
BeaconState.historicalRoots vc 300000 - deserialize tree 894.00 ns/op 863.00 ns/op 1.04
BeaconState.historicalRoots vc 300000 - serialize tree 656.00 ns/op 663.00 ns/op 0.99
BeaconState.validators vc 300000 - deserialize tree 611.69 ms/op 613.30 ms/op 1.00
BeaconState.validators vc 300000 - serialize tree 117.35 ms/op 113.89 ms/op 1.03
BeaconState.balances vc 300000 - deserialize tree 25.141 ms/op 27.177 ms/op 0.93
BeaconState.balances vc 300000 - serialize tree 4.0258 ms/op 3.9874 ms/op 1.01
BeaconState.previousEpochParticipation vc 300000 - deserialize tree 912.78 us/op 886.19 us/op 1.03
BeaconState.previousEpochParticipation vc 300000 - serialize tree 323.38 us/op 330.15 us/op 0.98
BeaconState.currentEpochParticipation vc 300000 - deserialize tree 903.85 us/op 899.55 us/op 1.00
BeaconState.currentEpochParticipation vc 300000 - serialize tree 323.78 us/op 324.96 us/op 1.00
BeaconState.inactivityScores vc 300000 - deserialize tree 27.240 ms/op 21.141 ms/op 1.29
BeaconState.inactivityScores vc 300000 - serialize tree 4.8651 ms/op 4.1749 ms/op 1.17
hashTreeRoot Attestation - struct 80.739 us/op 71.456 us/op 1.13
hashTreeRoot Attestation - tree 77.379 us/op 71.776 us/op 1.08
hashTreeRoot SignedAggregateAndProof - struct 102.17 us/op 95.606 us/op 1.07
hashTreeRoot SignedAggregateAndProof - tree 109.56 us/op 99.882 us/op 1.10
hashTreeRoot SyncCommitteeMessage - struct 21.913 us/op 20.116 us/op 1.09
hashTreeRoot SyncCommitteeMessage - tree 22.319 us/op 21.953 us/op 1.02
hashTreeRoot SignedContributionAndProof - struct 63.689 us/op 60.466 us/op 1.05
hashTreeRoot SignedContributionAndProof - tree 71.021 us/op 67.736 us/op 1.05
hashTreeRoot SignedBeaconBlock - struct 5.7598 ms/op 5.7542 ms/op 1.00
hashTreeRoot SignedBeaconBlock - tree 6.4521 ms/op 6.2894 ms/op 1.03
hashTreeRoot Validator - struct 27.279 us/op 29.129 us/op 0.94
hashTreeRoot Validator - tree 31.665 us/op 31.444 us/op 1.01
BeaconState vc 300000 - hashTreeRoot tree 11.279 s/op 11.088 s/op 1.02
BeaconState vc 300000 - batchHashTreeRoot tree 6.7385 s/op 6.6171 s/op 1.02
BeaconState.historicalRoots vc 300000 - hashTreeRoot tree 4.6070 us/op 4.1100 us/op 1.12
BeaconState.validators vc 300000 - hashTreeRoot tree 11.456 s/op 11.269 s/op 1.02
BeaconState.balances vc 300000 - hashTreeRoot tree 293.32 ms/op 296.73 ms/op 0.99
BeaconState.previousEpochParticipation vc 300000 - hashTreeRoot tree 36.950 ms/op 35.222 ms/op 1.05
BeaconState.currentEpochParticipation vc 300000 - hashTreeRoot tree 36.174 ms/op 34.264 ms/op 1.06
BeaconState.inactivityScores vc 300000 - hashTreeRoot tree 296.68 ms/op 290.29 ms/op 1.02
hash64 x18 49.500 us/op 48.629 us/op 1.02
hashTwoObjects x18 68.062 us/op 65.000 us/op 1.05
hash64 x1740 4.6902 ms/op 4.6286 ms/op 1.01
hashTwoObjects x1740 6.3688 ms/op 6.3475 ms/op 1.00
hash64 x2700000 7.2784 s/op 7.0859 s/op 1.03
hashTwoObjects x2700000 9.8054 s/op 9.6717 s/op 1.01
get_exitEpoch - ContainerType 318.00 ns/op 317.00 ns/op 1.00
get_exitEpoch - ContainerNodeStructType 284.00 ns/op 278.00 ns/op 1.02
set_exitEpoch - ContainerType 305.00 ns/op 295.00 ns/op 1.03
set_exitEpoch - ContainerNodeStructType 291.00 ns/op 286.00 ns/op 1.02
get_pubkey - ContainerType 1.0330 us/op 1.0180 us/op 1.01
get_pubkey - ContainerNodeStructType 277.00 ns/op 268.00 ns/op 1.03
hashTreeRoot - ContainerType 490.00 ns/op 477.00 ns/op 1.03
hashTreeRoot - ContainerNodeStructType 487.00 ns/op 475.00 ns/op 1.03
createProof - ContainerType 4.8610 us/op 4.3340 us/op 1.12
createProof - ContainerNodeStructType 21.982 us/op 22.010 us/op 1.00
serialize - ContainerType 1.6060 us/op 1.5980 us/op 1.01
serialize - ContainerNodeStructType 1.3500 us/op 1.3600 us/op 0.99
set_exitEpoch_and_hashTreeRoot - ContainerType 12.033 us/op 11.909 us/op 1.01
set_exitEpoch_and_hashTreeRoot - ContainerNodeStructType 31.212 us/op 32.024 us/op 0.97
Array - for of 10.865 us/op 5.9600 us/op 1.82
Array - for(;;) 11.060 us/op 5.9850 us/op 1.85
basicListValue.readonlyValuesArray() 4.0207 ms/op 3.8948 ms/op 1.03
basicListValue.readonlyValuesArray() + loop all 4.1133 ms/op 4.1320 ms/op 1.00
compositeListValue.readonlyValuesArray() 28.257 ms/op 28.337 ms/op 1.00
compositeListValue.readonlyValuesArray() + loop all 30.467 ms/op 26.814 ms/op 1.14
Number64UintType - get balances list 4.5604 ms/op 4.5276 ms/op 1.01
Number64UintType - set balances list 9.9475 ms/op 10.018 ms/op 0.99
Number64UintType - get and increase 10 then set 36.607 ms/op 36.773 ms/op 1.00
Number64UintType - increase 10 using applyDelta 14.416 ms/op 16.021 ms/op 0.90
Number64UintType - increase 10 using applyDeltaInBatch 14.463 ms/op 16.113 ms/op 0.90
tree_newTreeFromUint64Deltas 22.324 ms/op 21.400 ms/op 1.04
unsafeUint8ArrayToTree 37.202 ms/op 37.965 ms/op 0.98
bitLength(50) 246.00 ns/op 234.00 ns/op 1.05
bitLengthStr(50) 243.00 ns/op 223.00 ns/op 1.09
bitLength(8000) 238.00 ns/op 228.00 ns/op 1.04
bitLengthStr(8000) 279.00 ns/op 263.00 ns/op 1.06
bitLength(250000) 236.00 ns/op 232.00 ns/op 1.02
bitLengthStr(250000) 331.00 ns/op 300.00 ns/op 1.10
floor - Math.floor (53) 1.2515 ns/op 1.2439 ns/op 1.01
floor - << 0 (53) 1.2431 ns/op 1.2437 ns/op 1.00
floor - Math.floor (512) 1.2614 ns/op 1.2459 ns/op 1.01
floor - << 0 (512) 1.2427 ns/op 1.2436 ns/op 1.00
fnIf(0) 1.5537 ns/op 1.5549 ns/op 1.00
fnSwitch(0) 2.1815 ns/op 2.1789 ns/op 1.00
fnObj(0) 1.5590 ns/op 1.5552 ns/op 1.00
fnArr(0) 1.5615 ns/op 1.5565 ns/op 1.00
fnIf(4) 2.1742 ns/op 2.1841 ns/op 1.00
fnSwitch(4) 2.1842 ns/op 2.1762 ns/op 1.00
fnObj(4) 1.5615 ns/op 1.5567 ns/op 1.00
fnArr(4) 1.5541 ns/op 1.5554 ns/op 1.00
fnIf(9) 3.1076 ns/op 3.1076 ns/op 1.00
fnSwitch(9) 2.1797 ns/op 2.1772 ns/op 1.00
fnObj(9) 1.5565 ns/op 1.5561 ns/op 1.00
fnArr(9) 1.5612 ns/op 1.5590 ns/op 1.00
Container {a,b,vec} - as struct x100000 125.01 us/op 124.65 us/op 1.00
Container {a,b,vec} - as tree x100000 528.40 us/op 532.25 us/op 0.99
Container {a,vec,b} - as struct x100000 157.70 us/op 155.70 us/op 1.01
Container {a,vec,b} - as tree x100000 497.19 us/op 497.67 us/op 1.00
get 2 props x1000000 - rawObject 314.35 us/op 312.22 us/op 1.01
get 2 props x1000000 - proxy 73.090 ms/op 73.277 ms/op 1.00
get 2 props x1000000 - customObj 313.37 us/op 311.96 us/op 1.00
Simple object binary -> struct 825.00 ns/op 717.00 ns/op 1.15
Simple object binary -> tree_backed 2.2010 us/op 1.9070 us/op 1.15
Simple object struct -> tree_backed 2.5700 us/op 2.4780 us/op 1.04
Simple object tree_backed -> struct 2.1320 us/op 1.8680 us/op 1.14
Simple object struct -> binary 1.1390 us/op 965.00 ns/op 1.18
Simple object tree_backed -> binary 1.8330 us/op 1.5930 us/op 1.15
aggregationBits binary -> struct 616.00 ns/op 651.00 ns/op 0.95
aggregationBits binary -> tree_backed 2.4460 us/op 2.5860 us/op 0.95
aggregationBits struct -> tree_backed 2.8070 us/op 3.0080 us/op 0.93
aggregationBits tree_backed -> struct 1.1190 us/op 1.1980 us/op 0.93
aggregationBits struct -> binary 782.00 ns/op 806.00 ns/op 0.97
aggregationBits tree_backed -> binary 984.00 ns/op 1.0390 us/op 0.95
List(uint8) 100000 binary -> struct 1.4115 ms/op 1.6596 ms/op 0.85
List(uint8) 100000 binary -> tree_backed 264.35 us/op 278.78 us/op 0.95
List(uint8) 100000 struct -> tree_backed 1.4091 ms/op 1.3719 ms/op 1.03
List(uint8) 100000 tree_backed -> struct 1.0709 ms/op 1.2182 ms/op 0.88
List(uint8) 100000 struct -> binary 1.1167 ms/op 1.1028 ms/op 1.01
List(uint8) 100000 tree_backed -> binary 105.77 us/op 108.97 us/op 0.97
List(uint64Number) 100000 binary -> struct 1.3842 ms/op 1.3834 ms/op 1.00
List(uint64Number) 100000 binary -> tree_backed 4.5624 ms/op 4.6133 ms/op 0.99
List(uint64Number) 100000 struct -> tree_backed 6.9680 ms/op 6.1871 ms/op 1.13
List(uint64Number) 100000 tree_backed -> struct 2.4225 ms/op 2.4196 ms/op 1.00
List(uint64Number) 100000 struct -> binary 1.5678 ms/op 1.4933 ms/op 1.05
List(uint64Number) 100000 tree_backed -> binary 1.0033 ms/op 967.37 us/op 1.04
List(Uint64Bigint) 100000 binary -> struct 3.8282 ms/op 3.7332 ms/op 1.03
List(Uint64Bigint) 100000 binary -> tree_backed 4.7627 ms/op 4.5182 ms/op 1.05
List(Uint64Bigint) 100000 struct -> tree_backed 7.2582 ms/op 7.4169 ms/op 0.98
List(Uint64Bigint) 100000 tree_backed -> struct 5.0317 ms/op 4.8968 ms/op 1.03
List(Uint64Bigint) 100000 struct -> binary 2.0639 ms/op 2.0658 ms/op 1.00
List(Uint64Bigint) 100000 tree_backed -> binary 1.2759 ms/op 1.1345 ms/op 1.12
Vector(Root) 100000 binary -> struct 34.597 ms/op 35.240 ms/op 0.98
Vector(Root) 100000 binary -> tree_backed 33.151 ms/op 38.393 ms/op 0.86
Vector(Root) 100000 struct -> tree_backed 45.255 ms/op 47.358 ms/op 0.96
Vector(Root) 100000 tree_backed -> struct 45.858 ms/op 48.068 ms/op 0.95
Vector(Root) 100000 struct -> binary 2.8829 ms/op 2.6666 ms/op 1.08
Vector(Root) 100000 tree_backed -> binary 7.3904 ms/op 6.2453 ms/op 1.18
List(Validator) 100000 binary -> struct 101.23 ms/op 98.818 ms/op 1.02
List(Validator) 100000 binary -> tree_backed 353.43 ms/op 313.77 ms/op 1.13
List(Validator) 100000 struct -> tree_backed 376.29 ms/op 353.56 ms/op 1.06
List(Validator) 100000 tree_backed -> struct 199.42 ms/op 195.03 ms/op 1.02
List(Validator) 100000 struct -> binary 29.938 ms/op 29.173 ms/op 1.03
List(Validator) 100000 tree_backed -> binary 94.427 ms/op 98.886 ms/op 0.95
List(Validator-NS) 100000 binary -> struct 114.45 ms/op 104.79 ms/op 1.09
List(Validator-NS) 100000 binary -> tree_backed 165.59 ms/op 161.51 ms/op 1.03
List(Validator-NS) 100000 struct -> tree_backed 204.59 ms/op 202.47 ms/op 1.01
List(Validator-NS) 100000 tree_backed -> struct 170.12 ms/op 169.91 ms/op 1.00
List(Validator-NS) 100000 struct -> binary 30.057 ms/op 29.279 ms/op 1.03
List(Validator-NS) 100000 tree_backed -> binary 35.580 ms/op 34.115 ms/op 1.04
get epochStatuses - MutableVector 100.76 us/op 117.63 us/op 0.86
get epochStatuses - ViewDU 173.79 us/op 209.15 us/op 0.83
set epochStatuses - ListTreeView 2.0878 ms/op 2.2574 ms/op 0.92
set epochStatuses - ListTreeView - set() 432.97 us/op 465.44 us/op 0.93
set epochStatuses - ListTreeView - commit() 806.29 us/op 770.34 us/op 1.05
bitstring 521.67 ns/op 518.88 ns/op 1.01
bit mask 13.669 ns/op 13.625 ns/op 1.00
struct - increase slot to 1000000 933.04 us/op 933.72 us/op 1.00
UintNumberType - increase slot to 1000000 26.817 ms/op 27.411 ms/op 0.98
UintBigintType - increase slot to 1000000 164.39 ms/op 172.25 ms/op 0.95
UintBigint8 x 100000 tree_deserialize 5.0301 ms/op 5.0606 ms/op 0.99
UintBigint8 x 100000 tree_serialize 626.51 us/op 646.77 us/op 0.97
UintBigint16 x 100000 tree_deserialize 4.8180 ms/op 5.7977 ms/op 0.83
UintBigint16 x 100000 tree_serialize 1.3607 ms/op 1.4088 ms/op 0.97
UintBigint32 x 100000 tree_deserialize 5.5314 ms/op 5.6522 ms/op 0.98
UintBigint32 x 100000 tree_serialize 1.6488 ms/op 1.7226 ms/op 0.96
UintBigint64 x 100000 tree_deserialize 6.3884 ms/op 6.2177 ms/op 1.03
UintBigint64 x 100000 tree_serialize 1.7171 ms/op 1.7364 ms/op 0.99
UintBigint8 x 100000 value_deserialize 438.16 us/op 439.86 us/op 1.00
UintBigint8 x 100000 value_serialize 752.57 us/op 795.15 us/op 0.95
UintBigint16 x 100000 value_deserialize 467.52 us/op 469.84 us/op 1.00
UintBigint16 x 100000 value_serialize 793.25 us/op 846.14 us/op 0.94
UintBigint32 x 100000 value_deserialize 497.71 us/op 504.85 us/op 0.99
UintBigint32 x 100000 value_serialize 841.38 us/op 879.62 us/op 0.96
UintBigint64 x 100000 value_deserialize 562.46 us/op 562.53 us/op 1.00
UintBigint64 x 100000 value_serialize 1.0396 ms/op 1.0869 ms/op 0.96
UintBigint8 x 100000 deserialize 3.4351 ms/op 3.3929 ms/op 1.01
UintBigint8 x 100000 serialize 1.6062 ms/op 1.6080 ms/op 1.00
UintBigint16 x 100000 deserialize 3.2951 ms/op 3.5103 ms/op 0.94
UintBigint16 x 100000 serialize 1.6033 ms/op 1.6711 ms/op 0.96
UintBigint32 x 100000 deserialize 3.2371 ms/op 3.5625 ms/op 0.91
UintBigint32 x 100000 serialize 2.9046 ms/op 2.8000 ms/op 1.04
UintBigint64 x 100000 deserialize 4.1830 ms/op 3.9562 ms/op 1.06
UintBigint64 x 100000 serialize 1.6444 ms/op 1.6447 ms/op 1.00
UintBigint128 x 100000 deserialize 5.7712 ms/op 5.2059 ms/op 1.11
UintBigint128 x 100000 serialize 15.132 ms/op 15.082 ms/op 1.00
UintBigint256 x 100000 deserialize 8.5631 ms/op 7.9912 ms/op 1.07
UintBigint256 x 100000 serialize 44.063 ms/op 43.068 ms/op 1.02
Slice from Uint8Array x25000 1.3448 ms/op 1.2996 ms/op 1.03
Slice from ArrayBuffer x25000 15.857 ms/op 16.254 ms/op 0.98
Slice from ArrayBuffer x25000 + new Uint8Array 16.798 ms/op 15.716 ms/op 1.07
Copy Uint8Array 100000 iterate 2.6636 ms/op 2.6506 ms/op 1.00
Copy Uint8Array 100000 slice 115.43 us/op 106.41 us/op 1.08
Copy Uint8Array 100000 Uint8Array.prototype.slice.call 112.90 us/op 109.54 us/op 1.03
Copy Buffer 100000 Uint8Array.prototype.slice.call 111.88 us/op 106.80 us/op 1.05
Copy Uint8Array 100000 slice + set 220.95 us/op 199.53 us/op 1.11
Copy Uint8Array 100000 subarray + set 115.92 us/op 100.87 us/op 1.15
Copy Uint8Array 100000 slice arrayBuffer 110.58 us/op 102.70 us/op 1.08
Uint64 deserialize 100000 - iterate Uint8Array 2.0885 ms/op 2.0130 ms/op 1.04
Uint64 deserialize 100000 - by Uint32A 2.0499 ms/op 1.9866 ms/op 1.03
Uint64 deserialize 100000 - by DataView.getUint32 x2 2.0187 ms/op 2.0096 ms/op 1.00
Uint64 deserialize 100000 - by DataView.getBigUint64 5.2791 ms/op 5.1570 ms/op 1.02
Uint64 deserialize 100000 - by byte 41.010 ms/op 40.643 ms/op 1.01

by benchmarkbot/action

@matthewkeil matthewkeil changed the title feat: non simd sha256 for incompatible systems feat!: non simd sha256 for incompatible systems Jan 6, 2025
Member

@wemeetagain wemeetagain left a comment

You should be able to use WebAssembly.validate to synchronously do "feature detection" and avoid a lot of complexity related to async initialization here.
(It appears that wasm-feature-detect is using WebAssembly.validate under the hood.)

eg:

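import {wasmCode} from "./wasmCode.js";
import {wasmSimdCode} from "./wasmSimdCode.js";

// Synchronous feature detection: validate() returns false if the engine rejects the SIMD build
const hasSimd = WebAssembly.validate(wasmSimdCode);

// Compile whichever build the engine supports
const _module = new WebAssembly.Module(hasSimd ? wasmSimdCode : wasmCode);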
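
For readers who want to try the detection idea without the generated wasmCode files, here is a minimal, self-contained sketch. The probe bytes are a hand-assembled module whose single function uses two v128 instructions; the names SIMD_PROBE, detectSimd and pickWasmBytes are hypothetical and not part of this PR or of as-sha256.

```typescript
// Hand-assembled probe module (assumption for illustration, not taken from this PR).
// Its only function returns a v128 built with i8x16.splat and i8x16.popcnt,
// so engines without SIMD support should fail to validate it.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, // "\0asm" magic + version 1
  1, 5, 1, 96, 0, 1, 123, // type section: () -> v128
  3, 2, 1, 0, // function section: one function of type 0
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11, // code: i32.const 0; i8x16.splat; i8x16.popcnt; end
]);

// Synchronous check: no compilation or instantiation, only validation.
function detectSimd(): boolean {
  return typeof WebAssembly === "object" && WebAssembly.validate(SIMD_PROBE);
}

// Hypothetical helper: choose between two prebuilt byte arrays.
function pickWasmBytes(simdBytes: Uint8Array, scalarBytes: Uint8Array): Uint8Array {
  return detectSimd() ? simdBytes : scalarBytes;
}
```

Because validation is synchronous, the choice can be made at module load time, which is the simplification the review comment is after.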

@matthewkeil matthewkeil changed the title feat!: non simd sha256 for incompatible systems feat: non simd sha256 for incompatible systems Jan 7, 2025
final,
digest,
digest64
} from "./common";
Member

Add a non-simd implementation of batchHash4UintArray64s and batchHash4HashObjectInputs here (making the interfaces of the simd and non-simd wasm modules the same). Then the JavaScript layer won't need runtime checks, and the only refactor needed there is the module reinitialization.
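
The suggestion above could look roughly like the following. This is a sketch in plain TypeScript rather than AssemblyScript, and only for the Uint8Array variant. The Digest64 type and the makeBatchHash4UintArray64s factory are hypothetical names, and the scalar digest is passed in as a parameter instead of being imported from "./common".

```typescript
// Hypothetical type: any scalar function mapping a 64-byte input to a 32-byte digest.
type Digest64 = (input: Uint8Array) => Uint8Array;

// Build a non-SIMD batch hasher on top of a scalar digest64.
// The returned function keeps the batch name from the review comment,
// but its body is an assumption: it simply hashes the four inputs one by one.
function makeBatchHash4UintArray64s(digest64: Digest64) {
  return function batchHash4UintArray64s(inputs: Uint8Array[]): Uint8Array[] {
    if (inputs.length !== 4) {
      throw new Error(`Expected 4 inputs, got ${inputs.length}`);
    }
    return inputs.map((input, i) => {
      if (input.length !== 64) {
        throw new Error(`Input ${i} must be 64 bytes, got ${input.length}`);
      }
      return digest64(input);
    });
  };
}
```

With both builds exposing the same function names, the caller can use the batch API unconditionally and only the choice of module differs.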

3 participants