
Add benchmarks to check performance #518

Open · 1 of 6 tasks
duffee opened this issue Jan 7, 2025 · 4 comments

Test PDL against Julia's loop fusion

Comparison with Programming Language Benchmark v2 (plb2) will require implementations of nqueens, sudoku and bedcov. I assume we can already do matmult out of the box. Rolling this out will go some way towards sparing our collective blushes over Perl being the slowest on the graph.

We already have timing code under Basic/examples/Benchmark/. I think the new examples could live there, unless there's a reason to build more infrastructure that doesn't belong in the core project. Also, follow up on the TODOs in the documentation for PDL::MatrixOps (determinant vs det) and all of PDL::Matrix, and try to address the benchmarking comment in PDL::Indexing.
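For reference, here's a minimal sketch of what I mean by matmult "out of the box": the overloaded `x` operator does the matrix multiply, so the benchmark body is essentially one line. The matrix size below is just an illustrative value, not whatever plb2 uses.

```perl
use strict;
use warnings;
use PDL;

my $n = 500;               # illustrative size only, not the plb2 setting
my $x = random($n, $n);    # $n x $n pdls of random doubles in [0,1)
my $y = random($n, $n);

my $z = $x x $y;           # the overloaded 'x' operator is PDL's matmult
print $z->info, "\n";      # e.g. "PDL: Double D [500,500]"
```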

Assign this to me. I've used Hyperfine before, but I'll be in touch for the more esoteric bits of C and XS.

TODO

Write PDL benchmarks for:

  • nqueens
  • matmult
  • sudoku
  • bedcov

and then

  • Identify functions with the longest dwell times
  • Write a guide for how we generate our benchmarks

Stretch goals

mohawk2 (Member) commented Jan 7, 2025

I'm assuming that last one is a Freudian slip? You put "pdl2" ;-)

There's an instant win in the one Perl implementation I looked at, nqueens. Near the start it first initialises some Perl arrays, then loops through initialising them again (even though they already have the right values). Just deleting that loop can only save time.

And yes, the whole point of making these benchmarks in PDL (that perform well) would be to PR them into that repo. I'm pretty sure with auto-pthreading, the PDL version would (for algorithms that can be parallelised, which my first glance at nqueens did not show) actually be faster than C. That was my finding with Fourmilab/floating_point_benchmarks#1
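(For anyone who wants to try it, a minimal sketch of switching auto-pthreading on. The thread target and size threshold below are just illustrative values, and only threadable operations, e.g. elementwise arithmetic, actually get split across threads.)

```perl
use strict;
use warnings;
use PDL;

set_autopthread_targ(4);   # aim to split threadable ops across 4 pthreads
set_autopthread_size(1);   # only for ops whose largest pdl has >= ~1M elements

my $x = random(2000, 2000);
my $y = sqrt($x) + sin($x);   # elementwise ops like these can be auto-pthreaded
printf "pthreads used for the last op: %d\n", get_autopthread_actual();
```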

See #451 for performance-measuring tools and techniques.

Another target: https://github.com/FPBench/FPBench

duffee (Author) commented Jan 21, 2025

The hard part will be to understand the fine details of the algorithms and brush up on my indexing.

I've got a good-enough matmult ready, and current benchmarking shows it over 60 times faster (still running the final timings).

I'm sure it could be improved.
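In case anyone wants to reproduce that kind of number, the comparison is roughly along these lines, using the core Benchmark module; the matrix size and data layout here are my own choices, not necessarily what plb2 uses:

```perl
use strict;
use warnings;
use PDL;
use Benchmark qw(cmpthese);

my $n  = 200;   # kept small so the pure-Perl loops finish quickly
my @pa = map { [ map { rand } 1 .. $n ] } 1 .. $n;
my @pb = map { [ map { rand } 1 .. $n ] } 1 .. $n;
my ($pa, $pb) = (pdl(\@pa), pdl(\@pb));   # same data as pdls

# Naive triple loop over Perl arrays-of-arrays
sub perl_matmult {
    my ($x, $y) = @_;
    my @c;
    for my $i (0 .. $n - 1) {
        for my $j (0 .. $n - 1) {
            my $s = 0;
            for my $k (0 .. $n - 1) {
                $s += $x->[$i][$k] * $y->[$k][$j];
            }
            $c[$i][$j] = $s;
        }
    }
    return \@c;
}

# Run each variant for ~3 CPU-seconds and print a comparison table
cmpthese(-3, {
    pure_perl => sub { perl_matmult(\@pa, \@pb) },
    pdl       => sub { my $c = $pa x $pb },
});
```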

mohawk2 (Member) commented Jan 21, 2025

Good enough is good enough! Open that PR :-D

mohawk2 (Member) commented Jan 22, 2025

The PR I meant was against plb2. However, I do think that expanding our benchmarks under examples is also a good idea. The direction of travel I'd like to see is to reach parity with plb2 and then PR the new PDL code into that repo. At that point, this issue could be closed, even if further benchmarks were added afterwards.

That further work could include bringing the existing benchmark under examples up to date, since it's not very current.
