Average is a pointless ranking #4
Average is a problem ... I appreciate your looking into alternatives. One element that is important to me in any alternative is that people understand it reasonably well. People understand averages -- that doesn't mean people won't misinterpret the results, but I'm more interested in helping the people who won't. For libraries that skip a test, perhaps give them a value of 0. Another improvement would be to add a checkbox for each column and let people decide which tests are included in the average.

Another problem is that there is no way to differentiate in the results when a library has been optimized to write the result into one of its operands rather than allocating a new variable. When I use these libraries I sometimes want to create a new variable implicitly, and I get annoyed using glmatrix because I have to do that explicitly -- but of course that also means I pay more attention while writing my code to when I create another variable ... ;-)

One of the easiest results that I'm pretty confident in is using the benchmarks to compare different browsers. I'm on Mac OS X, and Safari is the slowest, then FF 3.6.15, then FF4, then Chrome 10, and right now FF Minefield is the fastest (the WebKit nightly is not very good either). Forgetting about the absolute values, the rate of increase in recent browser performance is astounding! If these results cause the Safari team to focus on increasing performance, that is a big win for everybody.
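A rough sketch of how the skipped-test-as-zero and checkbox ideas above could work, assuming each library's results are kept as an object mapping test names to ops/sec (the data shape and names here are made up for illustration, not the benchmark's actual API):

```js
// Hypothetical shape: results[testName] = ops/sec, or missing if the
// library skips that test.
function averageScore(results, selectedTests) {
  // Skipped tests count as 0, so a library can't improve its average
  // by leaving out a slow operation.
  const total = selectedTests.reduce(
    (sum, test) => sum + (results[test] || 0), 0);
  return total / selectedTests.length;
}

// Only the columns the user has checked go into the average.
const checked = ['multiply', 'translate', 'rotate'];
console.log(averageScore({ multiply: 1200, translate: 950 }, checked));
// 'rotate' is missing, so it contributes 0 -> (1200 + 950 + 0) / 3
```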
I've been using a bunch of quaternion math and want to add that to the benchmark as well ... so figuring out a practical way to represent a useful "average" when comparing libraries that do and don't implement an operation matters to me. The fact that the whole table is sortable by any column is helpful.
I'm not sure I know a better ranking, but the average is kind of pointless. It means a library that skips a test can score higher than one that implements it, if that test happens to be a slow one. It also means a library that's fast in 6 of 7 tests but slow in 1 might lose to a library that's slower in most of them but ends up with a higher average. The problem is that functions are called with vastly different frequencies in real code, so a function that is slow but rarely called (like inverse) is not nearly as important as a function like multiply, which is likely to be called far more often.
I don't know a better way, though. I tried 2 other ways just to see. One was "Wins", where the library that wins a particular test gets 1 point and all the other libraries get 0. At the end one library will have "won" in, say, 3 tests, another in 2 tests, etc. That's also not that useful, since again you could win at tests that really aren't important.
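For what it's worth, that "wins" tally could be computed with something like the sketch below (again, the scores object is a hypothetical stand-in for however the benchmark actually stores its numbers):

```js
// scores[lib][test] = ops/sec for each library on each test (hypothetical data).
function winCounts(scores, tests) {
  const wins = {};
  for (const lib of Object.keys(scores)) wins[lib] = 0;
  for (const test of tests) {
    // The fastest library on this test gets 1 point, everyone else gets 0.
    let best = null;
    for (const lib of Object.keys(scores)) {
      if (best === null || (scores[lib][test] || 0) > (scores[best][test] || 0)) {
        best = lib;
      }
    }
    wins[best] += 1;
  }
  return wins;
}
```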
Another way I tried was 'scoring', where for each test the library that came in first got N points, second got N-1, 3rd got N-2, etc., where N = the number of libraries minus 1 (so the slowest library gets zero points). That has the advantage over 'wins' that a library coming in second gets ranked ahead of one coming in 3rd. Even that's not great, though. It seems like it would be good to weight each test, with 'multiply' weighted the highest and inverse or vector transform weighted the lowest.
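A sketch of that ranked scheme, with an optional per-test weight so multiply can count for more than inverse (the weights and data here are illustrative guesses, not measurements):

```js
// Rank-based scoring: for each test, sort the libraries by ops/sec; first
// place gets (number of libraries - 1) points, last place gets 0. Each
// test's points can be scaled by a weight reflecting how often that
// operation tends to be called in real code.
function rankedScores(scores, tests, weights = {}) {
  const libs = Object.keys(scores);
  const points = {};
  for (const lib of libs) points[lib] = 0;
  for (const test of tests) {
    const weight = weights[test] ?? 1;
    const order = [...libs].sort(
      (a, b) => (scores[b][test] || 0) - (scores[a][test] || 0));
    order.forEach((lib, rank) => {
      points[lib] += (libs.length - 1 - rank) * weight;
    });
  }
  return points;
}

// Illustrative weights only: multiply counts most, inverse least.
// rankedScores(scores, ['multiply', 'inverse'], { multiply: 3, inverse: 1 });
```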