ENH: Add py-rattler #1445

Open: wants to merge 11 commits into main

HaoZeke
Member

@HaoZeke HaoZeke commented Nov 25, 2024

As it says on the tin. Some notes:

  • It is much faster than both libmambapy and conda
  • It is more featureful than virtualenv (it supports installing Python versions)
  • It has a well-documented API
  • It supports Python >= 3.8

Regarding the version bump: asv itself is no longer installed into the project, only asv-runner, which actually drives the benchmarks. As I recall, asv-runner is still pure Python 3.7, so it should be fine.

@HaoZeke HaoZeke marked this pull request as draft November 25, 2024 14:26
@HaoZeke HaoZeke force-pushed the add_rattler branch 6 times, most recently from 4359fc3 to 981ee0a Compare December 2, 2024 05:41
@HaoZeke
Member Author

HaoZeke commented Dec 2, 2024

I'll be splitting this PR into issues and PRs. However, for the section remaining (implementing rattler):

mkdir tmp_asv; cd tmp_asv
pipx run asv quickstart # Choose the first defaults
pipx run pdm init # Also the defaults

We'll need to add the channels necessary for mamba, i.e.

diff --git c/asv.conf.json w/asv.conf.json
index 32946ee..2933f54 100644
--- c/asv.conf.json
+++ w/asv.conf.json
@@ -69,7 +69,7 @@
 
     // The list of conda channel names to be searched for benchmark
     // dependency packages in the specified order
-    // "conda_channels": ["conda-forge", "defaults"],
+    "conda_channels": ["conda-forge"],
 
     // A conda environment file that is used for environment creation.
     // "conda_environment_file": "environment.yml",

Timing data

This is a "pseudo-cold" cache, in that I've not removed local caches (e.g. ~), so it isn't as accurate as it ought to be.

hyperfine 'asv run -E rattler' 'asv run -E mamba' 'asv run -E virtualenv' 'asv run -E conda'
Benchmark 1: asv run -E rattler
  Time (mean ± σ):      3.668 s ±  5.100 s    [User: 3.057 s, System: 0.493 s]
  Range (min … max):    1.794 s … 18.177 s    10 runs
 
  Warning: The first benchmarking run for this command was significantly slower than the rest (18.177 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
 
Benchmark 2: asv run -E mamba
  Time (mean ± σ):      6.739 s ± 14.599 s    [User: 5.273 s, System: 0.632 s]
  Range (min … max):    1.870 s … 48.286 s    10 runs
 
  Warning: The first benchmarking run for this command was significantly slower than the rest (48.286 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
 
Benchmark 3: asv run -E virtualenv
  Time (mean ± σ):      3.495 s ±  4.561 s    [User: 2.895 s, System: 0.341 s]
  Range (min … max):    1.878 s … 16.470 s    10 runs
 
  Warning: The first benchmarking run for this command was significantly slower than the rest (16.470 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
 
Benchmark 4: asv run -E conda
  Time (mean ± σ):      6.268 s ± 13.187 s    [User: 5.160 s, System: 0.529 s]
  Range (min … max):    1.944 s … 43.799 s    10 runs
 
  Warning: The first benchmarking run for this command was significantly slower than the rest (43.799 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
 
Summary
  asv run -E virtualenv ran
    1.05 ± 2.00 times faster than asv run -E rattler
    1.79 ± 4.44 times faster than asv run -E conda
    1.93 ± 4.88 times faster than asv run -E mamba
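
As a back-of-the-envelope check on the warnings above: if the slowest run of each command is its first run (which is what hyperfine's warning reports), that run can be subtracted from the total to estimate the mean of the remaining nine. The sketch below does only this arithmetic on the numbers printed above. The helper name `mean_without_first` is made up for illustration and is not part of asv or hyperfine.

```python
# Mean and first-run (max) times in seconds, copied from the hyperfine output above.
RUNS = 10
timings = {
    "rattler": (3.668, 18.177),
    "mamba": (6.739, 48.286),
    "virtualenv": (3.495, 16.470),
    "conda": (6.268, 43.799),
}


def mean_without_first(mean: float, first: float, runs: int = RUNS) -> float:
    """Mean of the remaining runs once the first (slowest) run is dropped."""
    return (mean * runs - first) / (runs - 1)


for name, (mean, first) in timings.items():
    warm = mean_without_first(mean, first)
    # first - warm is a rough estimate of the one-off environment setup cost
    print(f"{name:<11} warm mean ~ {warm:.2f} s, setup ~ {first - warm:.1f} s")
```

Under that assumption the remaining runs all average between roughly 2.05 s and 2.12 s, and the first-run overhead is about 14 s for virtualenv and 16 s for rattler, against about 42 s for conda and 46 s for mamba.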

Once the environments are present, as expected, there's no real difference:

hyperfine 'asv run -E rattler' 'asv run -E mamba' 'asv run -E virtualenv' 'asv run -E conda'
Benchmark 1: asv run -E rattler
  Time (mean ± σ):      2.121 s ±  0.161 s    [User: 1.838 s, System: 0.190 s]
  Range (min … max):    1.900 s …  2.336 s    10 runs
 
Benchmark 2: asv run -E mamba
  Time (mean ± σ):      2.082 s ±  0.145 s    [User: 1.765 s, System: 0.191 s]
  Range (min … max):    1.849 s …  2.342 s    10 runs
 
Benchmark 3: asv run -E virtualenv
  Time (mean ± σ):      2.122 s ±  0.095 s    [User: 1.842 s, System: 0.188 s]
  Range (min … max):    1.975 s …  2.268 s    10 runs
 
Benchmark 4: asv run -E conda
  Time (mean ± σ):      2.089 s ±  0.109 s    [User: 1.805 s, System: 0.184 s]
  Range (min … max):    1.928 s …  2.266 s    10 runs
 
Summary
  asv run -E mamba ran
    1.00 ± 0.09 times faster than asv run -E conda
    1.02 ± 0.10 times faster than asv run -E rattler
    1.02 ± 0.08 times faster than asv run -E virtualenv
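
To put a rough number on "no real difference", a Welch's t statistic can be computed from the summary values alone. This is a minimal sketch, assuming the σ values hyperfine prints are sample standard deviations over 10 independent runs. The function name `welch_t` is illustrative only.

```python
import math


def welch_t(mean_a, sd_a, mean_b, sd_b, n_a=10, n_b=10):
    """Welch's t statistic and approximate degrees of freedom from summary stats."""
    var_a = sd_a**2 / n_a  # squared standard error of mean A
    var_b = sd_b**2 / n_b  # squared standard error of mean B
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (var_a + var_b) ** 2 / (var_a**2 / (n_a - 1) + var_b**2 / (n_b - 1))
    return t, df


# Warm-cache numbers from above: virtualenv (slowest mean) vs mamba (fastest mean).
t, df = welch_t(2.122, 0.095, 2.082, 0.145)
print(f"t = {t:.2f}, df = {df:.1f}")
```

For the largest gap in the warm runs, |t| comes out around 0.7 with about 15 degrees of freedom, well under the roughly 2.1 needed for significance at the 5% level.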

With the pseudo-cold cache, virtualenv is quicker by a hair (not really statistically significant), but rattler also provides more functionality (compared to the equivalent conda and mamba helpers).

On another benchmark (asv_samples):

hyperfine 'rm -rf .asv && asv run -E rattler' 'rm -rf .asv && asv run -E mamba' 'rm -rf .asv && asv run -E virtualenv' 'rm -rf .asv && asv run -E conda' 
Benchmark 1: rm -rf .asv && asv run -E rattler
  Time (mean ± σ):     40.837 s ±  2.582 s    [User: 31.366 s, System: 7.414 s]
  Range (min … max):   38.861 s … 47.478 s    10 runs
 
Benchmark 2: rm -rf .asv && asv run -E mamba
  Time (mean ± σ):     66.351 s ±  8.670 s    [User: 44.504 s, System: 7.894 s]
  Range (min … max):   59.911 s … 89.377 s    10 runs
 
Benchmark 3: rm -rf .asv && asv run -E virtualenv
  Time (mean ± σ):     37.522 s ±  0.880 s    [User: 28.702 s, System: 3.686 s]
  Range (min … max):   36.334 s … 39.181 s    10 runs
 
Benchmark 4: rm -rf .asv && asv run -E conda
  Time (mean ± σ):     53.831 s ±  8.633 s    [User: 41.895 s, System: 5.632 s]
  Range (min … max):   49.172 s … 77.869 s    10 runs
 
  Warning: The first benchmarking run for this command was significantly slower than the rest (77.869 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
 
Summary
  rm -rf .asv && asv run -E virtualenv ran
    1.09 ± 0.07 times faster than rm -rf .asv && asv run -E rattler
    1.43 ± 0.23 times faster than rm -rf .asv && asv run -E conda
    1.77 ± 0.23 times faster than rm -rf .asv && asv run -E mamba
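
The ± figures in the summary appear consistent with standard first-order error propagation for a ratio of two means. That is an observation from the numbers here, not a statement about hyperfine's internals. A minimal sketch, with `relative_speed` as a hypothetical helper name:

```python
import math


def relative_speed(mean_ref, sd_ref, mean_other, sd_other):
    """Ratio mean_other / mean_ref with first-order propagated uncertainty."""
    ratio = mean_other / mean_ref
    # Relative uncertainties of the two means add in quadrature
    rel = math.sqrt((sd_ref / mean_ref) ** 2 + (sd_other / mean_other) ** 2)
    return ratio, ratio * rel


# virtualenv (reference) vs rattler on asv_samples, numbers from the output above.
ratio, err = relative_speed(37.522, 0.880, 40.837, 2.582)
print(f"{ratio:.2f} ± {err:.2f}")
```

With these inputs the result rounds to the 1.09 ± 0.07 reported for rattler, so the uncertainty is small compared with the gap to conda and mamba but comparable to the gap between rattler and virtualenv.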

Note that these are all extremely trivial packages, with only a single Python version and numpy as requirements; larger, more complex projects are expected to benefit from this more.

@HaoZeke HaoZeke mentioned this pull request Dec 13, 2024
1 task
@HaoZeke HaoZeke requested a review from mattip December 13, 2024 17:58
@HaoZeke HaoZeke marked this pull request as ready for review December 13, 2024 17:59
@mattip
Contributor

mattip commented Dec 14, 2024

Is CI expected to fail on macOS (python3.8, pypy3.10)?

Since in any case asv_runner changes do not have anything to do with the
statistical checks run in ASV and validated against R
@HaoZeke
Member Author

HaoZeke commented Jan 5, 2025

Is CI expected to fail on macOS (python3.8, pypy3.10)?

Ah, no, that was an upstream CI action bug (r-lib/actions#950), so things should be fine now.
