
Average is a pointless ranking #4

Open
greggman opened this issue May 3, 2011 · 2 comments

Comments

@greggman
Contributor

greggman commented May 3, 2011

I'm not sure I know a better ranking, but average is kind of pointless. Using the average means a library that skips a test can score higher than one that runs that test, if it's a slow test. It also means a library that's fast in 6 of 7 tests but slow in 1 test might lose to a library that's worse at every test but has a higher average. The problem is that functions get called at vastly different frequencies in real code, so a function that is slow but rarely called (like inverse) is not nearly as important as a function like multiply, which is likely to be called far more often.

I don't know a better way though. I tried 2 other ways just to see. One was "Wins", where the library that wins a particular test gets 1 point and all the other libraries get 0. At the end one library will have "won" in, say, 3 tests, another in 2 tests, etc. That's also not that useful, since again you could win at tests that really aren't important.

Another way I tried was 'scoring', where for each test the library that came in first got N points, second got N-1, 3rd got N-2, etc., where N = the number of libraries minus 1 (so the slowest library gets zero points). That has the advantage over 'wins' in that a library that comes in second gets ranked ahead of a library that comes in 3rd. Even that's not great though. It seems like it would be good to weight each test, with 'multiply' weighted the highest and inverse or vector transform weighted the lowest.
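A rough sketch of that weighted-scoring idea (the test names and weights here are made-up examples, not anything defined by the benchmark):

```js
// Hypothetical input: results[testName][libraryName] = ops/sec (higher is better).
// The weights are illustrative; multiply counts more because it runs most often.
var testWeights = { multiply: 4, translate: 2, inverse: 1, transform: 1 };

function weightedScores(results) {
  var scores = {};
  for (var test in results) {
    var weight = testWeights[test] || 1;
    // Rank the libraries that actually ran this test, fastest first.
    var libs = Object.keys(results[test]).sort(function (a, b) {
      return results[test][b] - results[test][a];
    });
    // First place gets (libs.length - 1) points, last place gets 0,
    // then each test's points are scaled by its weight.
    for (var i = 0; i < libs.length; i++) {
      scores[libs[i]] = (scores[libs[i]] || 0) + (libs.length - 1 - i) * weight;
    }
  }
  return scores;
}
```

A library that skips a test simply earns nothing for it under this scheme, so skipping can never raise its score.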

@stepheneb
Owner

Average is a problem ... I appreciate your looking into alternatives. One element that is important to me in any alternative is that people understand it reasonably well. People understand average -- that doesn't mean people won't misinterpret the results, but I'm more interested in helping people who won't. For libraries that skip a test, perhaps giving a value of 0 would work. Another improvement would be to add a checkbox for each column and let people decide which tests are included in the average.
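A minimal sketch of that checkbox idea, assuming results are stored per test per library and a skipped test is simply missing:

```js
// Hypothetical shapes: results[testName][libraryName] = ops/sec,
// selected = array of the test names the user has checked.
function selectedAverage(results, selected, lib) {
  var total = 0;
  for (var i = 0; i < selected.length; i++) {
    var value = results[selected[i]][lib];
    total += (value === undefined ? 0 : value); // a skipped test counts as 0
  }
  return total / selected.length;
}
```

Recomputing the averages whenever a checkbox changes keeps the familiar metric while letting readers drop the tests they don't care about.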

Another problem is that there is no way in the results to distinguish libraries that have been optimized to compute the result without allocating another variable (the result ends up in one of the operands).

When I use these libraries I sometimes want a new variable to be created implicitly, and I get annoyed using glmatrix because I have to do that explicitly -- but of course that also means I pay more attention to when my code creates another variable ... ;-)
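The two styles look roughly like this (toy 2x2 matrices and made-up function names, not the actual glmatrix API):

```js
// Allocating style: every call returns a fresh result.
function multiplyAlloc(a, b) {
  return [a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
          a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3]];
}

// In-place style: the caller supplies dest, which may be one of the operands.
function multiplyInto(a, b, dest) {
  var r0 = a[0] * b[0] + a[1] * b[2], r1 = a[0] * b[1] + a[1] * b[3],
      r2 = a[2] * b[0] + a[3] * b[2], r3 = a[2] * b[1] + a[3] * b[3];
  dest[0] = r0; dest[1] = r1; dest[2] = r2; dest[3] = r3;
  return dest;
}

var a = [1, 2, 3, 4], b = [5, 6, 7, 8];
var c = multiplyAlloc(a, b); // allocates a new array each call
multiplyInto(a, b, a);       // result ends up in a, no allocation
```

A fair comparison would need to note which style each library's benchmark code uses, since the in-place version avoids the allocation entirely.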

One of the simplest results I'm pretty confident in is using the benchmarks to compare performance across different browsers.

I'm on Mac OS X and Safari is the slowest, then FF 3.6.15, then FF4, then Chrome 10, and right now FF Minefield is the fastest (WebKit nightly is not very good either). Forgetting about the absolute values, the rate of increase in recent browser performance is astounding!

If these results cause the Safari team to focus on increasing performance that is a big win for everybody.

@stepheneb
Owner

I've been using a bunch of quaternion math and want to add that to the benchmark also ... so figuring out a practical way to represent a useful "average" when comparing libraries that do and don't perform an operation matters. The fact that the whole table is sortable by any column helps.
