Benchmark Data: 2.6 GHz Intel Core i7
Run on (8 X 2600 MHz CPU s)
2021-05-25 11:57:31
-----------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------
FloatAdd/1000 601 ns 600 ns 966998
FloatMul/1000 618 ns 616 ns 1162694
FloatMulAdd/1000 669 ns 669 ns 1010043
FloatDiv/1000 2076 ns 2066 ns 339113
FloatSqrt/1000 3771 ns 3767 ns 182594
FloatSin/1000 6681 ns 6678 ns 108583
FloatCos/1000 6659 ns 6656 ns 100638
FloatSinCos/1000 6991 ns 6989 ns 89871
FloatAtan2/1000 6938 ns 6936 ns 95564
FloatHypot/1000 3972 ns 3972 ns 176172
FloatFma/1000 3802 ns 3799 ns 187797
DoubleAdd/1000 597 ns 597 ns 1198774
DoubleMul/1000 617 ns 617 ns 1047167
DoubleMulAdd/1000 653 ns 653 ns 983657
DoubleDiv/1000 3984 ns 3984 ns 169834
DoubleSqrt/1000 5736 ns 5733 ns 111764
DoubleSin/1000 11150 ns 11145 ns 58619
DoubleCos/1000 11808 ns 11808 ns 57905
DoubleSinCos/1000 13204 ns 13203 ns 50423
DoubleAtan2/1000 24784 ns 24761 ns 28545
DoubleHypot/1000 4150 ns 4148 ns 170216
DoubleFma/1000 15133 ns 15127 ns 44164
AlmostEqual1/1000 2239 ns 2238 ns 298702
AlmostEqual2/1000 1518 ns 1518 ns 467458
AlmostEqual3/1000 1289 ns 1289 ns 541603
DiffSignsViaSignbit/1000 649 ns 649 ns 1105269
DiffSignsViaMul/1000 1136 ns 1051 ns 735070
ModuloViaTrunc/1000 2092 ns 2091 ns 326923
ModuloViaFmod/1000 7719 ns 7712 ns 77910
DotProduct/1000 900 ns 900 ns 770221
CrossProduct/1000 1422 ns 1422 ns 463254
LengthSquaredViaDotProduct/1000 691 ns 691 ns 965344
GetMagnitudeSquared/1000 692 ns 692 ns 952718
GetMagnitude/1000 2009 ns 2007 ns 322902
GetUnitVec1/1000 4265 ns 4264 ns 166011
GetUnitVec2/1000 4150 ns 4150 ns 169283
UnitVectorFromVector/1000 4293 ns 4292 ns 163087
UnitVectorFromVectorAndBack/1000 4325 ns 4324 ns 163285
UnitVecFromAngle/1000 9032 ns 9032 ns 68184
LessLength/1000 880 ns 880 ns 809249
LessFloat/1000 868 ns 867 ns 784850
LessDouble/1000 861 ns 860 ns 803296
LessEqualLength/1000 622 ns 622 ns 1035794
LessEqualFloat/1000 631 ns 631 ns 1088528
LessEqualDouble/1000 637 ns 637 ns 1052062
LesserLength/1000 574 ns 573 ns 1153099
LesserFloat/1000 586 ns 586 ns 1193928
LesserDouble/1000 606 ns 606 ns 1168692
LesserEqualLength/1000 1425 ns 1425 ns 475754
LesserEqualFloat/1000 1417 ns 1417 ns 461045
LesserEqualDouble/1000 1432 ns 1432 ns 466390
MinLength/1000 1496 ns 1496 ns 441095
MinFloat/1000 1483 ns 1483 ns 467040
MinDouble/1000 3268 ns 3268 ns 210583
IntervalIsIntersecting/1000 2815 ns 2814 ns 227327
LengthIntervalIsIntersecting/1000 2887 ns 2886 ns 233875
AabbTestOverlap/1000 5738 ns 5730 ns 105170
AabbContains/1000 5012 ns 5011 ns 137024
AABB/1000 10437 ns 10437 ns 60966
MaxSepBetweenRel4x4/10 433 ns 433 ns 1625201
MaxSepBetweenRel4x4/100 4283 ns 4283 ns 158430
MaxSepBetweenRel4x4/1000 56669 ns 56650 ns 12007
MaxSepBetweenRel4x4/10000 614754 ns 614097 ns 1027
MaxSepBetweenRelSquaresNoStop/10 495 ns 495 ns 1308607
MaxSepBetweenRelSquaresNoStop/100 5125 ns 5122 ns 137274
MaxSepBetweenRelSquaresNoStop/1000 61015 ns 60984 ns 10887
MaxSepBetweenRelSquaresNoStop/10000 656750 ns 656552 ns 1010
MaxSepBetweenRelSquares/10 506 ns 506 ns 1329383
MaxSepBetweenRelSquares/100 5149 ns 5147 ns 133257
MaxSepBetweenRelSquares/1000 62650 ns 62605 ns 10011
MaxSepBetweenRelSquares/10000 684676 ns 684469 ns 1010
ConstructAndAssignVC 38 ns 38 ns 18151926
SolveVC 43 ns 43 ns 16168821
ManifoldForTwoSquares1 141 ns 141 ns 4835891
ManifoldForTwoSquares2 136 ns 136 ns 4708858
AsyncFutureDeferred 279 ns 279 ns 2430513
AsyncFutureAsync 24201 ns 17601 ns 39889
ThreadCreateAndDestroy 28030 ns 19048 ns 35411
MultiThreadQD 8879 ns 5062 ns 154162
MultiThreadQDE 10372 ns 5971 ns 150675
MultiThreadQDA 184 ns 183 ns 3452460
MultiThreadQDAQ 576008 ns 575893 ns 10000
WorldStepPlayRho 74 ns 74 ns 8695544
WorldStepBox2D 404 ns 404 ns 1679547
WorldStepWithStatsStaticPlayRho/0 82 ns 82 ns 8284612
WorldStepWithStatsStaticPlayRho/1 91 ns 90 ns 7438895
WorldStepWithStatsStaticPlayRho/10 136 ns 136 ns 5125014
WorldStepWithStatsStaticPlayRho/100 500 ns 499 ns 1284899
WorldStepWithStatsStaticPlayRho/1000 5157 ns 5155 ns 129649
WorldStepWithStatsStaticPlayRho/10000 67606 ns 67591 ns 9848
WorldStepWithStatsStaticBox2D/0 408 ns 408 ns 1689895
WorldStepWithStatsStaticBox2D/1 419 ns 419 ns 1668637
WorldStepWithStatsStaticBox2D/10 478 ns 477 ns 1500748
WorldStepWithStatsStaticBox2D/100 1186 ns 1185 ns 578407
WorldStepWithStatsStaticBox2D/1000 19578 ns 19577 ns 33135
WorldStepWithStatsStaticBox2D/10000 329934 ns 329890 ns 2122
DropDisksPlayRho/0 71 ns 70 ns 9348041
DropDisksPlayRho/1 302 ns 302 ns 2297213
DropDisksPlayRho/10 2822 ns 2822 ns 238294
DropDisksPlayRho/100 34234 ns 34232 ns 19994
DropDisksPlayRho/1000 384976 ns 384971 ns 1811
DropDisksPlayRho/10000 3837606 ns 3837589 ns 180
DropDisksBox2D/0 414 ns 414 ns 1663063
DropDisksBox2D/1 816 ns 816 ns 864432
DropDisksBox2D/10 5823 ns 5821 ns 105382
DropDisksBox2D/100 77042 ns 77027 ns 9496
DropDisksBox2D/1000 874668 ns 874499 ns 1216
DropDisksBox2D/10000 6266863 ns 6264453 ns 139
TumblerAdd100SquaresPlus100Steps 36296585 ns 36294737 ns 19
TumblerAdd200SquaresPlus200Steps 163829359 ns 163817750 ns 4
AddPairStressTestPlayRho400/0 2357887 ns 2357514 ns 284
AddPairStressTestPlayRho400/10 711776 ns 711502 ns 973
AddPairStressTestPlayRho400/15 834130 ns 833867 ns 844
AddPairStressTestPlayRho400/16 14925682 ns 14924130 ns 46
AddPairStressTestPlayRho400/17 31478258 ns 31453870 ns 23
AddPairStressTestPlayRho400/18 48679814 ns 48679769 ns 13
AddPairStressTestPlayRho400/19 19552737 ns 19551800 ns 35
AddPairStressTestPlayRho400/20 13991285 ns 13991125 ns 48
AddPairStressTestPlayRho400/30 4326233 ns 4325400 ns 160
AddPairStressTestBox2D400/0 1460196 ns 1460134 ns 462
AddPairStressTestBox2D400/10 844184 ns 843859 ns 838
AddPairStressTestBox2D400/15 1178649 ns 1178236 ns 597
AddPairStressTestBox2D400/16 22055256 ns 22052233 ns 30
AddPairStressTestBox2D400/17 57269872 ns 57252900 ns 10
AddPairStressTestBox2D400/18 106326260 ns 106310143 ns 7
AddPairStressTestBox2D400/19 91117641 ns 91115875 ns 8
AddPairStressTestBox2D400/20 51121647 ns 51112286 ns 14
AddPairStressTestBox2D400/30 5493215 ns 5492070 ns 128
TilesRestComboGroundPlayRho/12 22889731 ns 22889759 ns 29
TilesRestComboGroundPlayRho/20 122340526 ns 122334667 ns 6
TilesRestComboGroundPlayRho/36 1418653021 ns 1417721000 ns 1
TilesRestOneGroundPlayRho/12 16751644 ns 16751650 ns 40
TilesRestOneGroundPlayRho/20 100691677 ns 100665167 ns 6
TilesRestOneGroundPlayRho/36 1195594385 ns 1195193000 ns 1
TilesRestComboGroundBox2D/12 18891740 ns 18889556 ns 36
TilesRestComboGroundBox2D/20 116972423 ns 116960000 ns 5
TilesRestComboGroundBox2D/36 1329953352 ns 1329579000 ns 1
TilesRestOneGroundBox2D/12 16681913 ns 16677878 ns 41
TilesRestOneGroundBox2D/20 112955151 ns 112930667 ns 6
TilesRestOneGroundBox2D/36 1445217889 ns 1444923000 ns 1
Program ended with exit code: 0
I'm elated to find PlayRho often faster than Box2D, and faster in the tests where I'd expect it to be. These numbers make more sense to me, and I believe they are more valid than the previously posted numbers.
Data from commit 00991d1ec:
Here's a run built with `BENCHMARK_BOX2D` defined, using Box2D 2.4.1:
Run on (8 X 2600 MHz CPU s)
2021-05-01 13:16:40
---------------------------------------------------------------------------
Benchmark Time CPU Iterations
---------------------------------------------------------------------------
FloatAdd/1000 622 ns 618 ns 846013
FloatMul/1000 605 ns 603 ns 1147296
FloatMulAdd/1000 675 ns 672 ns 1009140
FloatDiv/1000 2085 ns 2065 ns 342662
FloatSqrt/1000 2107 ns 2093 ns 337717
FloatSin/1000 7742 ns 7677 ns 91734
FloatCos/1000 6939 ns 6926 ns 96887
FloatSinCos/1000 7016 ns 7012 ns 101935
FloatAtan2/1000 6966 ns 6962 ns 92395
FloatHypot/1000 3950 ns 3949 ns 172518
FloatFma/1000 3727 ns 3727 ns 184470
DoubleAdd/1000 586 ns 586 ns 1137176
DoubleMul/1000 587 ns 586 ns 1197892
DoubleMulAdd/1000 651 ns 650 ns 1066147
DoubleDiv/1000 4112 ns 4108 ns 171234
DoubleSqrt/1000 4010 ns 4010 ns 168225
DoubleSin/1000 11376 ns 11371 ns 59348
DoubleCos/1000 12258 ns 12256 ns 52740
DoubleSinCos/1000 13577 ns 13571 ns 50350
DoubleAtan2/1000 24229 ns 24225 ns 27788
DoubleHypot/1000 3990 ns 3990 ns 165380
DoubleFma/1000 15068 ns 15065 ns 45233
AlmostEqual1/1000 2285 ns 2285 ns 303074
AlmostEqual2/1000 1516 ns 1514 ns 471682
AlmostEqual3/1000 1350 ns 1346 ns 506120
DiffSignsViaSignbit/1000 668 ns 666 ns 1051809
DiffSignsViaMul/1000 937 ns 935 ns 808445
ModuloViaTrunc/1000 2161 ns 2158 ns 339550
ModuloViaFmod/1000 7952 ns 7930 ns 87637
DotProduct/1000 886 ns 886 ns 743779
CrossProduct/1000 871 ns 871 ns 777104
LengthSquaredViaDotProduct/1000 690 ns 689 ns 975963
GetMagnitudeSquared/1000 719 ns 719 ns 967894
GetMagnitude/1000 2166 ns 2165 ns 322336
GetUnitVec1/1000 4404 ns 4394 ns 161637
GetUnitVec2/1000 4337 ns 4331 ns 163809
UnitVectorFromVector/1000 4401 ns 4399 ns 163214
UnitVectorFromVectorAndBack/1000 4498 ns 4454 ns 157033
UnitVecFromAngle/1000 9785 ns 9768 ns 68713
LessLength/1000 695 ns 694 ns 951294
LessFloat/1000 701 ns 701 ns 1010860
LessDouble/1000 698 ns 697 ns 956480
LessEqualLength/1000 635 ns 634 ns 1054550
LessEqualFloat/1000 634 ns 634 ns 1046823
LessEqualDouble/1000 650 ns 650 ns 1018123
LesserLength/1000 590 ns 589 ns 1125764
LesserFloat/1000 625 ns 624 ns 1070189
LesserDouble/1000 637 ns 636 ns 1066699
LesserEqualLength/1000 1218 ns 1216 ns 559293
LesserEqualFloat/1000 1196 ns 1194 ns 549244
LesserEqualDouble/1000 1232 ns 1229 ns 589037
MinLength/1000 2785 ns 2784 ns 227730
MinFloat/1000 1349 ns 1348 ns 517778
MinDouble/1000 4459 ns 4457 ns 139351
IntervalIsIntersecting/1000 2105 ns 2105 ns 325906
LengthIntervalIsIntersecting/1000 2085 ns 2085 ns 313209
AabbTestOverlap/1000 4244 ns 4244 ns 160388
AabbContains/1000 3679 ns 3678 ns 190501
AABB/1000 9957 ns 9955 ns 65454
MaxSepBetweenRel4x4/10 441 ns 441 ns 1669895
MaxSepBetweenRel4x4/100 4380 ns 4378 ns 166167
MaxSepBetweenRel4x4/1000 54811 ns 54773 ns 11786
MaxSepBetweenRel4x4/10000 577634 ns 577412 ns 1074
MaxSepBetweenRelSquaresNoStop/10 501 ns 501 ns 1367882
MaxSepBetweenRelSquaresNoStop/100 5167 ns 5164 ns 121936
MaxSepBetweenRelSquaresNoStop/1000 60220 ns 60196 ns 11319
MaxSepBetweenRelSquaresNoStop/10000 680793 ns 679824 ns 991
MaxSepBetweenRelSquares/10 519 ns 519 ns 1312434
MaxSepBetweenRelSquares/100 5421 ns 5418 ns 128163
MaxSepBetweenRelSquares/1000 63116 ns 63090 ns 9704
MaxSepBetweenRelSquares/10000 679235 ns 679021 ns 979
ConstructAndAssignVC 43 ns 43 ns 16457229
SolveVC 45 ns 45 ns 15547920
ManifoldForTwoSquares1 141 ns 141 ns 4951405
ManifoldForTwoSquares2 130 ns 130 ns 5346042
AsyncFutureDeferred 279 ns 279 ns 2476228
AsyncFutureAsync 24145 ns 17427 ns 40174
ThreadCreateAndDestroy 27356 ns 18522 ns 37777
MultiThreadQD 8090 ns 4699 ns 155937
MultiThreadQDE 8775 ns 5088 ns 161510
MultiThreadQDA 149 ns 149 ns 4333453
MultiThreadQDAQ 206255 ns 206155 ns 10000
WorldStep 82 ns 82 ns 7719623
WorldStepWithStatsStatic/0 95 ns 95 ns 7156147
WorldStepWithStatsStatic/1 104 ns 104 ns 6100324
WorldStepWithStatsStatic/10 136 ns 136 ns 4745570
WorldStepWithStatsStatic/100 508 ns 508 ns 1214729
WorldStepWithStatsStatic/1000 5287 ns 5283 ns 119409
WorldStepWithStatsStatic/10000 67756 ns 67689 ns 9307
DropDisks/0 84 ns 84 ns 7815204
DropDisks/1 546 ns 546 ns 1183952
DropDisks/10 4642 ns 4641 ns 155124
DropDisks/100 62558 ns 62535 ns 10431
DropDisks/1000 3424994 ns 3424604 ns 207
DropDisks/10000 303666432 ns 303545500 ns 2
TumblerAdd100SquaresPlus100Steps 33085123 ns 33083429 ns 21
TumblerAdd200SquaresPlus200Steps 207538292 ns 207516667 ns 3
AddPairStressTestPlayRho400/0 2308535 ns 2307908 ns 293
AddPairStressTestPlayRho400/10 699213 ns 698823 ns 978
AddPairStressTestPlayRho400/15 808748 ns 808394 ns 855
AddPairStressTestPlayRho400/16 1326534 ns 1325817 ns 541
AddPairStressTestPlayRho400/17 1117005 ns 1116516 ns 620
AddPairStressTestPlayRho400/18 1120420 ns 1119748 ns 636
AddPairStressTestPlayRho400/19 1112562 ns 1111968 ns 621
AddPairStressTestPlayRho400/20 1103333 ns 1102857 ns 642
AddPairStressTestPlayRho400/30 1037366 ns 1036777 ns 664
AddPairStressTestBox2D400/0 1484612 ns 1484356 ns 450
AddPairStressTestBox2D400/10 861231 ns 860852 ns 795
AddPairStressTestBox2D400/15 1191020 ns 1190585 ns 590
AddPairStressTestBox2D400/16 22549911 ns 22531419 ns 31
AddPairStressTestBox2D400/17 60308063 ns 60299000 ns 10
AddPairStressTestBox2D400/18 104630145 ns 104585667 ns 6
AddPairStressTestBox2D400/19 100533778 ns 100503286 ns 7
AddPairStressTestBox2D400/20 52011548 ns 52003667 ns 12
AddPairStressTestBox2D400/30 5212398 ns 5210756 ns 135
TilesRestPlayRho/12 15177255 ns 15161409 ns 44
TilesRestPlayRho/20 55322913 ns 55291500 ns 12
TilesRestPlayRho/36 404504642 ns 404429000 ns 2
TilesRestBox2D/12 19082540 ns 19080417 ns 36
TilesRestBox2D/20 119508016 ns 119468167 ns 6
TilesRestBox2D/36 1370350860 ns 1369586000 ns 1
Program ended with exit code: 0
Comparing the PlayRho timing data to the Box2D timing data looks surprising, if not downright odd. Are tests like `AddPairStressTestPlayRho400` and `AddPairStressTestBox2D400` supposed to be comparable? I thought they were. The data, however, don't appear comparable at all, at least for cases like `AddPairStressTestPlayRho400/18` compared to `AddPairStressTestBox2D400/18`: 1,120,420 ns (for PlayRho) vs. 104,630,145 ns (for Box2D), which makes PlayRho appear to be over 90 times faster. That'd be great news to me if so, but I'm not ready to bet on that being a proper interpretation of these results.
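To make the size of that gap concrete, here is a minimal sketch that computes the ratio between two rows of the 2021-05-01 run. The `speedup` helper and the constant names are made up for illustration; only the nanosecond values come from the table above. The ratio is plain arithmetic on the reported times and says nothing about whether the two benchmarks do equivalent work.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: ratio of a baseline time to a candidate time.
// A result above 1 means the candidate took less time than the baseline.
// Both arguments must use the same unit (nanoseconds here).
inline double speedup(std::int64_t baseline_ns, std::int64_t candidate_ns)
{
    assert(candidate_ns > 0);
    return static_cast<double>(baseline_ns) / static_cast<double>(candidate_ns);
}

// "Time" column values copied from the 2021-05-01 run above.
const std::int64_t kBox2D400_18_ns = 104630145;  // AddPairStressTestBox2D400/18
const std::int64_t kPlayRho400_18_ns = 1120420;  // AddPairStressTestPlayRho400/18
const std::int64_t kBox2D400_15_ns = 1191020;    // AddPairStressTestBox2D400/15
const std::int64_t kPlayRho400_15_ns = 808748;   // AddPairStressTestPlayRho400/15
```

Calling `speedup(kBox2D400_18_ns, kPlayRho400_18_ns)` gives a ratio a little above 93, while the `/15` rows give a ratio just under 1.5. A jump that large between neighboring arguments is the kind of thing that suggests checking whether both tests really exercise the same scenario.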
Data since the update to the benchmark code (to do 1000 operations per loop, etc., to increase the measured time and comparative sensitivity):
Run on (8 X 2600 MHz CPU s)
2017-11-13 20:46:45
-------------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------------
FloatAdd/1000 600 ns 600 ns 1126307
FloatMul/1000 592 ns 592 ns 1147673
FloatDiv/1000 1986 ns 1986 ns 343762
FloatSqrt/1000 2041 ns 2040 ns 351936
FloatSin/1000 6694 ns 6692 ns 105891
FloatCos/1000 7109 ns 7106 ns 91285
FloatSinCos/1000 6986 ns 6978 ns 100611
FloatAtan2/1000 6925 ns 6921 ns 100892
FloatHypot/1000 3996 ns 3995 ns 175358
DoubleAdd/1000 575 ns 575 ns 1159401
DoubleMul/1000 593 ns 593 ns 1209169
DoubleDiv/1000 4020 ns 4020 ns 175157
DoubleSqrt/1000 4004 ns 4003 ns 174910
DoubleSin/1000 11454 ns 11448 ns 59521
DoubleCos/1000 11833 ns 11832 ns 57608
DoubleSinCos/1000 12848 ns 12848 ns 52811
DoubleAtan2/1000 25093 ns 25088 ns 27938
DoubleHypot/1000 4868 ns 4867 ns 144953
AlmostEqual1/1000 3119 ns 3117 ns 222026
AlmostEqual2/1000 1249 ns 1249 ns 552827
DiffSignsViaSignbit/1000 897 ns 896 ns 805227
DiffSignsViaMul/1000 868 ns 868 ns 819864
ModuloViaTrunc/1000 1991 ns 1991 ns 331658
ModuloViaFmod/1000 6533 ns 6532 ns 108726
DotProduct/1000 1036 ns 1032 ns 663262
CrossProduct/1000 955 ns 948 ns 773917
LengthSquaredViaDotProduct/1000 942 ns 935 ns 690397
GetLengthSquared/1000 634 ns 632 ns 1112825
GetLength/1000 2065 ns 2063 ns 316267
UnitVectorFromVector/1000 4580 ns 4579 ns 152839
UnitVectorFromVectorAndBack/1000 4540 ns 4537 ns 155200
UnitVecFromAngle/1000 7351 ns 7344 ns 93266
AABB2D/1000 6301 ns 6300 ns 109764
ConstructAndAssignVC 26 ns 26 ns 26268585
SolveVC 60 ns 60 ns 12005008
MaxSepBetweenAbsRectangles 77 ns 77 ns 8871203
MaxSepBetweenRel4x4 86 ns 86 ns 8293249
MaxSepBetweenRel2_4x4 87 ns 87 ns 8216059
MaxSepBetweenRelRectanglesNoStop 98 ns 98 ns 7201720
MaxSepBetweenRelRectangles2NoStop 100 ns 100 ns 6962542
MaxSepBetweenRelRectangles 102 ns 102 ns 6847975
MaxSepBetweenRelRectangles2 102 ns 102 ns 6785247
ManifoldForTwoSquares1 143 ns 143 ns 4798070
ManifoldForTwoSquares2 145 ns 145 ns 4766769
AsyncFutureDeferred 274 ns 274 ns 2564920
AsyncFutureAsync 21610 ns 19293 ns 36708
ThreadCreateAndDestroy 24111 ns 18605 ns 37631
MultiThreadQD 13942 ns 7346 ns 98785
WorldStep 67 ns 66 ns 10270706
WorldStepWithStatsStatic/0 79 ns 79 ns 9132420
WorldStepWithStatsStatic/1 82 ns 82 ns 8655118
WorldStepWithStatsStatic/10 111 ns 111 ns 6095384
WorldStepWithStatsStatic/100 339 ns 339 ns 1992179
WorldStepWithStatsStatic/1000 5643 ns 5643 ns 114533
WorldStepWithStatsStatic/10000 90150 ns 90142 ns 7774
DropDisks/0 65 ns 65 ns 10901559
DropDisks/1 966 ns 966 ns 785334
DropDisks/10 9024 ns 9024 ns 74243
DropDisks/100 93724 ns 93722 ns 7073
DropDisks/1000 982572 ns 982564 ns 707
DropDisks/10000 10322862 ns 10322389 ns 72
TumblerAdd100SquaresPlus100Steps 46438637 ns 46422733 ns 15
TumblerAdd200SquaresPlus200Steps 207257355 ns 207238000 ns 3
AddPairStressTest400/0 2929408 ns 2926643 ns 241
AddPairStressTest400/10 1311338 ns 1310838 ns 524
AddPairStressTest400/15 1424575 ns 1424359 ns 485
AddPairStressTest400/16 19623392 ns 19617636 ns 33
AddPairStressTest400/17 46306119 ns 46284333 ns 15
AddPairStressTest400/18 73807674 ns 73802222 ns 9
AddPairStressTest400/19 26703249 ns 26701962 ns 26
AddPairStressTest400/20 21688294 ns 21677903 ns 31
AddPairStressTest400/30 7171648 ns 7171011 ns 94
TilesComesToRest/12 62321783 ns 62318909 ns 11
TilesComesToRest/20 368663703 ns 368614000 ns 2
TilesComesToRest/36 4188827956 ns 4188309000 ns 1
Program ended with exit code: 0
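The `/1000` rows in the dump above reflect the change to doing 1000 operations per timed loop. The older dumps further down report 2 ns for nearly every simple operation, which hides the difference between, say, an add and a divide. In the batched dump, `FloatAdd/1000` shows 600 ns while `FloatDiv/1000` shows 1986 ns. The following is only a sketch of the batching idea, with a plain `std::chrono` timer standing in for the benchmark harness; the `BatchResult` type and `time_float_add_batch` function are hypothetical names, not code from the benchmark application.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type for one timed batch.
struct BatchResult {
    double total_ns;  // elapsed wall-clock time for the whole batch
    float checksum;   // sum of all results, returned so the work is not discarded
};

// Times a single pass that computes a[i] + b[i] for every element.
// The clock is read once before and once after the batch, so timer
// overhead is paid per batch instead of per operation.
inline BatchResult time_float_add_batch(const std::vector<float>& a,
                                        const std::vector<float>& b)
{
    assert(a.size() == b.size());
    using clock = std::chrono::steady_clock;
    float checksum = 0.0f;
    const auto start = clock::now();
    for (std::size_t i = 0; i < a.size(); ++i) {
        checksum += a[i] + b[i];
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return BatchResult{elapsed.count(), checksum};
}
```

Passing two vectors of 1000 elements mirrors the `/1000` argument; dividing `total_ns` by the element count gives a per-operation estimate with finer resolution than timing a single operation.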
This dump includes some new benchmarks I'm trying out: `AsyncFutureDeferred`, `AsyncFutureAsync`, `ThreadCreateAndDestroy`, and `MultiThreadQD`.
Basically, the output shows 13942 ns (the `MultiThreadQD` time) to be a minimum delay incurred by synchronization via mutexes and condition variables. That's the equivalent of roughly 7,000 single-precision floating point divisions, going by the 1986 ns reported for `FloatDiv/1000`.
Presumably spin locks would be faster, but waiting threads would then spend a significant portion of their time spinning. That might well tank performance in other ways. I'll be looking into it though.
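For a feel of where that synchronization cost comes from, here is a minimal sketch of a mutex and condition variable handoff between a caller and one worker thread. It is not the `MultiThreadQD` benchmark itself, whose code isn't shown on this page; the function name `average_handoff_ns` and the flag names are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical measurement: average wall-clock time for the caller to signal
// a waiting worker thread and be signaled back, with no real work in between.
inline double average_handoff_ns(int round_trips)
{
    assert(round_trips > 0);
    std::mutex m;
    std::condition_variable cv;
    bool request = false;  // set by the caller, cleared by the worker
    bool done = false;     // set by the worker, cleared by the caller
    bool quit = false;     // tells the worker to exit

    std::thread worker([&] {
        std::unique_lock<std::mutex> lock(m);
        for (;;) {
            cv.wait(lock, [&] { return request || quit; });
            if (quit) {
                return;
            }
            request = false;
            done = true;
            cv.notify_one();  // only the caller can be waiting at this point
        }
    });

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < round_trips; ++i) {
        std::unique_lock<std::mutex> lock(m);
        request = true;
        cv.notify_one();
        cv.wait(lock, [&] { return done; });
        done = false;
    }
    const auto stop = std::chrono::steady_clock::now();

    {
        std::lock_guard<std::mutex> lock(m);
        quit = true;
    }
    cv.notify_one();
    worker.join();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / round_trips;
}
```

Each round trip forces two thread wake-ups through the operating system's scheduler, which is the part a spin lock would avoid at the price of a busy-waiting core. Swapping the two flags for `std::atomic<bool>` values polled in a loop would be one way to compare the trade-off on the same machine.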
Data From Commit 9cb82a5 Onward
Run on (8 X 2600 MHz CPU s)
2017-10-06 10:55:54
--------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------
FloatAdd 2 ns 2 ns 327078349
FloatMult 2 ns 2 ns 338966636
FloatDiv 2 ns 2 ns 356350160
FloatAlmostEqual1 2 ns 2 ns 347341104
FloatAlmostEqual2 2 ns 2 ns 331114864
DifferentSignsViaSignbit 2 ns 2 ns 366114531
DifferentSignsViaMultiplication 2 ns 2 ns 342141021
FloatPositiveDivTrunc 3 ns 3 ns 271865776
FloatPositiveDivModf 4 ns 4 ns 174288033
FloatPositiveFmod 7 ns 7 ns 107866554
FloatSqrt 2 ns 2 ns 348974016
FloatSin 8 ns 8 ns 93304720
FloatCos 6 ns 6 ns 114731528
FloatSinCos 7 ns 7 ns 106943702
FloatAtan2 7 ns 7 ns 93400582
AABB 2 ns 2 ns 342759212
DotProduct 2 ns 2 ns 355351595
CrossProduct 2 ns 2 ns 373612297
LengthSquaredViaDotProduct 2 ns 2 ns 345004337
GetLengthSquared 2 ns 2 ns 365437925
GetLength 2 ns 2 ns 345791715
hypot 4 ns 4 ns 172946263
UnitVectorFromVector 2 ns 2 ns 340183991
UnitVectorFromVectorAndBack 2 ns 2 ns 344381417
UnitVecFromAngle 8 ns 8 ns 85136401
TwoRandValues 13 ns 13 ns 54497186
FloatAddTwoRand 13 ns 13 ns 53383768
FloatMultTwoRand 13 ns 13 ns 53092647
FloatDivTwoRand 13 ns 13 ns 53561864
FloatSqrtTwoRand 14 ns 14 ns 52119786
FloatSinTwoRand 34 ns 34 ns 20443328
FloatCosTwoRand 34 ns 34 ns 20151192
FloatAtan2TwoRand 27 ns 27 ns 25378686
ThreeRandValues 19 ns 19 ns 35886946
FloatAlmostEqualThreeRand1 21 ns 21 ns 33355729
DoubleAddTwoRand 13 ns 13 ns 49690499
DoubleMultTwoRand 13 ns 13 ns 51431258
DoubleDivTwoRand 16 ns 16 ns 43607458
DoubleSqrtTwoRand 16 ns 16 ns 42215977
DoubleSinTwoRand 43 ns 43 ns 16010210
DoubleCosTwoRand 43 ns 43 ns 16229099
DoubleAtan2TwoRand 60 ns 60 ns 11848141
LengthSquaredViaDotProductTwoRand 13 ns 13 ns 53577442
GetLengthSquaredTwoRand 13 ns 13 ns 52739834
GetLengthTwoRand 13 ns 13 ns 53785335
hypotTwoRand 14 ns 14 ns 50245485
UnitVectorFromVectorTwoRand 18 ns 18 ns 37860759
UnitVectorFromVectorAndBackTwoRand 17 ns 17 ns 39193948
FourRandValues 25 ns 25 ns 27369837
DotProductFourRand 26 ns 26 ns 26983270
CrossProductFourRand 27 ns 27 ns 24968433
ConstructAndAssignVC 31 ns 31 ns 22246093
SolveVC 41 ns 41 ns 16855042
DefaultWorldStep 52 ns 52 ns 13615234
MaxSepBetweenAbsRectangles 58 ns 58 ns 12027698
MaxSepBetweenRel4x4 85 ns 85 ns 8312947
MaxSepBetweenRel2_4x4 85 ns 85 ns 8365601
MaxSepBetweenRelRectanglesNoStop 92 ns 92 ns 7409681
MaxSepBetweenRelRectangles2NoStop 92 ns 92 ns 7181034
MaxSepBetweenRelRectangles 96 ns 96 ns 7327848
MaxSepBetweenRelRectangles2 96 ns 96 ns 7373621
ManifoldForTwoSquares1 130 ns 130 ns 5288807
ManifoldForTwoSquares2 130 ns 130 ns 4988420
malloc_free_random_size 226 ns 226 ns 2999991
random_malloc_free_100 68614 ns 68594 ns 11846
TumblerAdd100Squares200Steps 41726668 ns 41725529 ns 17
AddPairStressTest/0 2922759 ns 2922646 ns 237
AddPairStressTest/10 1315100 ns 1314162 ns 537
AddPairStressTest/15 1464151 ns 1464025 ns 473
AddPairStressTest/16 21479481 ns 21477125 ns 32
AddPairStressTest/17 45215525 ns 45215750 ns 16
AddPairStressTest/18 74331938 ns 74332143 ns 7
AddPairStressTest/19 25070886 ns 25070143 ns 28
AddPairStressTest/20 20248769 ns 20248000 ns 34
AddPairStressTest/30 6922205 ns 6921480 ns 102
TilesComesToRest12 49319442 ns 49302077 ns 13
TilesComesToRest20 343918694 ns 343914000 ns 2
TilesComesToRest36 4089570599 ns 4089158000 ns 1
Data From Commits After 9d48f3d
Run on (8 X 2600 MHz CPU s)
2017-10-03 11:55:01
--------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------
FloatAdd 2 ns 2 ns 327775202
FloatMult 2 ns 2 ns 338419292
FloatDiv 2 ns 2 ns 389256520
FloatAlmostEqual1 2 ns 2 ns 341158868
FloatAlmostEqual2 2 ns 2 ns 327148666
DifferentSignsViaSignbit 2 ns 2 ns 376534378
DifferentSignsViaMultiplication 2 ns 2 ns 333703585
FloatSqrt 2 ns 2 ns 370772531
FloatSin 8 ns 8 ns 115403004
FloatCos 6 ns 6 ns 114162698
FloatSinCos 7 ns 7 ns 96168377
FloatAtan2 7 ns 7 ns 95628415
AABB 2 ns 2 ns 327607994
DotProduct 2 ns 2 ns 327808972
CrossProduct 2 ns 2 ns 340012143
LengthSquaredViaDotProduct 2 ns 2 ns 387496056
GetLengthSquared 2 ns 2 ns 346714877
GetLength 2 ns 2 ns 379294837
hypot 4 ns 4 ns 169325070
UnitVectorFromVector 2 ns 2 ns 343576831
UnitVectorFromVectorAndBack 2 ns 2 ns 337958537
UnitVecFromAngle 8 ns 8 ns 83961042
TwoRandValues 13 ns 13 ns 51214141
FloatAddTwoRand 14 ns 14 ns 48963375
FloatMultTwoRand 14 ns 14 ns 49527367
FloatDivTwoRand 13 ns 13 ns 51881059
FloatSqrtTwoRand 14 ns 14 ns 49899844
FloatSinTwoRand 36 ns 36 ns 19809770
FloatCosTwoRand 35 ns 35 ns 20208611
FloatAtan2TwoRand 28 ns 28 ns 25143317
ThreeRandValues 19 ns 19 ns 35288661
FloatAlmostEqualThreeRand1 21 ns 21 ns 31910868
DoubleAddTwoRand 13 ns 13 ns 52865698
DoubleMultTwoRand 13 ns 13 ns 52251284
DoubleDivTwoRand 16 ns 16 ns 43663219
DoubleSqrtTwoRand 16 ns 16 ns 41954102
DoubleSinTwoRand 45 ns 45 ns 15446196
DoubleCosTwoRand 43 ns 43 ns 16207606
DoubleAtan2TwoRand 60 ns 60 ns 11787290
LengthSquaredViaDotProductTwoRand 13 ns 13 ns 53052006
GetLengthSquaredTwoRand 13 ns 13 ns 51262898
GetLengthTwoRand 13 ns 13 ns 51666617
hypotTwoRand 14 ns 14 ns 49736044
UnitVectorFromVectorTwoRand 18 ns 18 ns 38235696
UnitVectorFromVectorAndBackTwoRand 18 ns 18 ns 38634118
FourRandValues 26 ns 26 ns 27281833
DotProductFourRand 26 ns 26 ns 27315687
CrossProductFourRand 26 ns 26 ns 26726228
ConstructAndAssignVC 31 ns 31 ns 22571544
SolveVC 41 ns 41 ns 16757918
MaxSepBetweenAbsRectangles 60 ns 60 ns 12110098
MaxSepBetweenRel4x4 84 ns 84 ns 8041356
MaxSepBetweenRel2_4x4 87 ns 87 ns 8324414
MaxSepBetweenRelRectanglesNoStop 92 ns 92 ns 7739425
MaxSepBetweenRelRectangles2NoStop 92 ns 92 ns 7472645
MaxSepBetweenRelRectangles 97 ns 97 ns 7165377
MaxSepBetweenRelRectangles2 96 ns 96 ns 6513931
ManifoldForTwoSquares1 137 ns 137 ns 5268744
ManifoldForTwoSquares2 129 ns 129 ns 5546971
malloc_free_random_size 242 ns 242 ns 3069435
random_malloc_free_100 56161 ns 56158 ns 12456
TumblerAdd100Squares200Steps 42207550 ns 42180875 ns 16
TilesComesToRest12 49431563 ns 49427308 ns 13
TilesComesToRest20 344701073 ns 344699500 ns 2
TilesComesToRest36 4127377348 ns 4126757000 ns 1
Program ended with exit code: 0
Data from commit(s) after 31c0890:
Run on (8 X 2600 MHz CPU s)
2017-09-29 10:20:33
--------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------
FloatAdd 2 ns 2 ns 330498912
FloatMult 2 ns 2 ns 352011747
FloatDiv 2 ns 2 ns 363795111
FloatAlmostEqual1 2 ns 2 ns 345177865
FloatAlmostEqual2 2 ns 2 ns 342243061
DifferentSignsViaSignbit 2 ns 2 ns 369779506
DifferentSignsViaMultiplication 2 ns 2 ns 334987534
FloatSqrt 2 ns 2 ns 362690542
FloatSin 7 ns 7 ns 106169993
FloatCos 6 ns 6 ns 113547885
FloatAtan2 7 ns 7 ns 94776463
AABB 2 ns 2 ns 326116834
DotProduct 2 ns 2 ns 345569795
CrossProduct 2 ns 2 ns 358981723
LengthSquaredViaDotProduct 2 ns 2 ns 341718453
GetLengthSquared 2 ns 2 ns 347983436
GetLength 2 ns 2 ns 342794460
hypot 4 ns 4 ns 171899365
UnitVectorFromVector 2 ns 2 ns 350966914
UnitVectorFromVectorAndBack 2 ns 2 ns 330336707
UnitVecFromAngle 9 ns 9 ns 81987374
TwoRandValues 13 ns 13 ns 51868373
FloatAddTwoRand 13 ns 13 ns 52402270
FloatMultTwoRand 13 ns 13 ns 54164893
FloatDivTwoRand 13 ns 13 ns 52201019
FloatSqrtTwoRand 13 ns 13 ns 53471442
FloatSinTwoRand 34 ns 34 ns 20580729
FloatCosTwoRand 34 ns 34 ns 20569904
FloatAtan2TwoRand 27 ns 27 ns 25373719
ThreeRandValues 19 ns 19 ns 35111655
FloatAlmostEqualThreeRand1 21 ns 21 ns 32363357
DoubleAddTwoRand 13 ns 13 ns 54338123
DoubleMultTwoRand 13 ns 13 ns 51393498
DoubleDivTwoRand 16 ns 16 ns 41824254
DoubleSqrtTwoRand 16 ns 16 ns 42307576
DoubleSinTwoRand 43 ns 43 ns 16201079
DoubleCosTwoRand 43 ns 43 ns 16096320
DoubleAtan2TwoRand 63 ns 63 ns 11147384
LengthSquaredViaDotProductTwoRand 13 ns 13 ns 50648298
GetLengthSquaredTwoRand 13 ns 13 ns 53703221
GetLengthTwoRand 13 ns 13 ns 52580185
hypotTwoRand 14 ns 14 ns 51035287
UnitVectorFromVectorTwoRand 18 ns 18 ns 38992001
UnitVectorFromVectorAndBackTwoRand 18 ns 18 ns 38305161
FourRandValues 25 ns 25 ns 27401228
DotProductFourRand 26 ns 26 ns 27014406
CrossProductFourRand 26 ns 26 ns 26520275
ConstructAndAssignVC 31 ns 31 ns 22693895
SolveVC 42 ns 41 ns 16630752
MaxSepBetweenAbsRectangles 62 ns 62 ns 11212558
MaxSepBetweenRel4x4 75 ns 75 ns 9509061
MaxSepBetweenRel2_4x4 74 ns 74 ns 9519535
MaxSepBetweenRelRectanglesNoStop 92 ns 92 ns 7637752
MaxSepBetweenRelRectangles2NoStop 93 ns 93 ns 7434549
MaxSepBetweenRelRectangles 97 ns 97 ns 7321869
MaxSepBetweenRelRectangles2 98 ns 98 ns 7262692
ManifoldForTwoSquares1 125 ns 125 ns 5393411
ManifoldForTwoSquares2 121 ns 121 ns 5813374
malloc_free_random_size 232 ns 232 ns 3142946
random_malloc_free_100 66334 ns 66280 ns 8739
TilesComesToRest12 50247628 ns 50212538 ns 13
TilesComesToRest20 344614178 ns 344206500 ns 2
TilesComesToRest36 4167957522 ns 4164765000 ns 1
Program ended with exit code: 0
Data from around commit 0599bd9:
Run on (8 X 2600 MHz CPU s)
2017-08-08 20:51:19
--------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------
FloatAdd 2 ns 2 ns 368982294
FloatMult 2 ns 2 ns 366233291
FloatDiv 2 ns 2 ns 358505544
FloatSqrt 2 ns 2 ns 382018915
FloatSin 7 ns 7 ns 113156916
FloatCos 6 ns 6 ns 110023105
FloatAtan2 7 ns 7 ns 99072960
DotProduct 2 ns 2 ns 373072680
CrossProduct 2 ns 2 ns 363119508
LengthSquaredViaDotProduct 2 ns 2 ns 351751724
GetLengthSquared 2 ns 2 ns 379617779
GetLength 2 ns 2 ns 384497078
hypot 4 ns 4 ns 175868792
UnitVectorFromVector 2 ns 2 ns 294130005
UnitVectorFromVectorAndBack 2 ns 2 ns 297693724
UnitVecFromAngle 8 ns 8 ns 80838877
TwoRandValues 13 ns 13 ns 54730686
FloatAddTwoRand 13 ns 13 ns 54026102
FloatMultTwoRand 13 ns 13 ns 53356505
FloatDivTwoRand 13 ns 13 ns 53571702
FloatSqrtTwoRand 13 ns 13 ns 53826280
FloatSinTwoRand 36 ns 36 ns 18716828
FloatCosTwoRand 35 ns 35 ns 20197240
FloatAtan2TwoRand 28 ns 28 ns 25564146
DoubleAddTwoRand 14 ns 14 ns 53771288
DoubleMultTwoRand 14 ns 14 ns 51642604
DoubleDivTwoRand 18 ns 17 ns 40536706
DoubleSqrtTwoRand 18 ns 18 ns 38975718
DoubleSinTwoRand 49 ns 49 ns 14750226
DoubleCosTwoRand 48 ns 48 ns 14213313
DoubleAtan2TwoRand 69 ns 69 ns 10425199
LengthSquaredViaDotProductTwoRand 14 ns 14 ns 50312657
GetLengthSquaredTwoRand 15 ns 15 ns 48167237
GetLengthTwoRand 14 ns 14 ns 50834780
hypotTwoRand 15 ns 15 ns 44679900
UnitVectorFromVectorTwoRand 17 ns 17 ns 43387444
UnitVectorFromVectorAndBackTwoRand 17 ns 17 ns 41706884
FourRandValues 28 ns 28 ns 24096717
DotProductFourRand 29 ns 29 ns 24427097
CrossProductFourRand 29 ns 29 ns 24291133
ConstructAndAssignVC 33 ns 33 ns 20916126
SolveVC 44 ns 44 ns 15413987
MaxSepBetweenAbsRectangles 74 ns 74 ns 9816020
MaxSepBetweenRel4x4 80 ns 80 ns 8796069
MaxSepBetweenRel2_4x4 82 ns 82 ns 8408913
MaxSepBetweenRelRectanglesNoStop 104 ns 104 ns 6918432
MaxSepBetweenRelRectangles2NoStop 103 ns 103 ns 6850119
MaxSepBetweenRelRectangles 104 ns 104 ns 6884275
MaxSepBetweenRelRectangles2 100 ns 100 ns 7195058
ManifoldForTwoSquares1 156 ns 155 ns 4617475
ManifoldForTwoSquares2 157 ns 157 ns 4675863
TilesComesToRest12 57637128 ns 57506769 ns 13
TilesComesToRest20 375657450 ns 375055500 ns 2
TilesComesToRest36 3974857636 ns 3973611000 ns 1
Program ended with exit code: 0