enforce wheel size limits, README formatting in CI #114

Closed
wants to merge 7 commits into from
Changes from 5 commits
2 changes: 2 additions & 0 deletions ci/build_wheel.sh
@@ -21,5 +21,7 @@ sccache --show-adv-stats
# Exclude libcuda.so.1 because we only install a driver stub
python -m auditwheel repair --exclude libcuda.so.1 -w ./final_dist ./dist/*

./ci/validate_wheel.sh final_dist

rapids-logger "Upload Wheel"
RAPIDS_PY_WHEEL_NAME="pynvjitlink_${RAPIDS_PY_CUDA_SUFFIX}" rapids-upload-wheels-to-s3 python ./final_dist
24 changes: 24 additions & 0 deletions ci/validate_wheel.sh
@@ -0,0 +1,24 @@
#!/bin/bash
# Copyright (c) 2024, NVIDIA CORPORATION.

set -euo pipefail

wheel_dir_relative_path=$1

python -m pip install \
    'pydistcheck==0.8.*' \
    twine

pyenv rehash
@jameslamb (Member Author) commented on Nov 14, 2024:

At first, I was seeing CI failures like this in this PR:

./ci/validate_wheel.sh: line 11: pydistcheck: command not found

(build link)

Think I see the root cause.

pydistcheck is available from the rapidsai/ci-wheel Docker image: rapidsai/ci-imgs#205

This project does all of its CUDA 12 builds using CUDA 12.0.1.

export BUILD_MATRIX="
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.10', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.11', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.10', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.11', LINUX_VER: 'rockylinux8' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.12', LINUX_VER: 'rockylinux8' }
"

And therefore ends up pulling rapidsai/ci-wheel:cuda12.0.1-rockylinux8-py3.12 (build link).

We stopped building CUDA 12.0.1 ci-wheel images in rapidsai/ci-imgs#194... and so no new images with that tag were published with pydistcheck in them. The latest rapidsai/ci-wheel:cuda12.0.1-rockylinux8-py3.12 was pushed about a month ago: https://hub.docker.com/r/rapidsai/ci-wheel/tags?name=cuda12.0.1-rockylinux8-py3.12

(screenshot: Docker Hub tag listing for rapidsai/ci-wheel, captured 2024-11-13)

Assuming pynvjitlink needs to keep building on CUDA 12.0.1 for compatibility reasons, options I see:

For the purpose of these validation checks, just installing it at runtime should be fine... they're small libraries with few dependencies, that are quick to install.

But having this repo's CI rely on pulling images that are no longer getting updated risks it missing out on other updates (like security patches to libraries in the base image), and on the need to do more special-casing of this repo to use other things newly installed in ci-wheel images.

@bdice @KyleFromNVIDIA what do you think?

Contributor commented:

I’m not sure if it’s important for this repo to have size checks. The lowest friction thing is probably to update the CI image tags. I don’t think we’ll have issues from that, please give it a try.

@jameslamb (Member Author) commented:

I think it's helpful to have size checks for anything we're publishing to PyPI, even if we feel it's unlikely they'll grow much... they're cheap maintenance-wise, and help avoid surprises at release time.

PR to add back the 12.0.1 images: rapidsai/ci-imgs#206

Contributor commented:

Oh. I meant we could use something newer than 12.0.1 in this repo, not add back the 12.0.1 images.

@jameslamb (Member Author) commented:

Ohhhh, I'm sorry, I misunderstood what "update the CI image tags" meant! Sure, let me try that here.

@jameslamb (Member Author) commented:

Alright I tried updating the matrix to "all builds on CUDA 12.5.1, tests on a mix of CUDA 12.0.1 and 12.5.1". Looks like the 2 test jobs on CUDA 12.5.1 are failing:

FAIL: test_atomic_add_double_global_3 (numba.cuda.tests.cudapy.test_atomics.TestCudaAtomics.test_atomic_add_double_global_3)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/envs/test/lib/python3.11/site-packages/numba/cuda/tests/cudapy/test_atomics.py", line 669, in test_atomic_add_double_global_3
    self.assertCorrectFloat64Atomics(cuda_func, shared=False)
  File "/opt/conda/envs/test/lib/python3.11/site-packages/numba/cuda/tests/cudapy/test_atomics.py", line 572, in assertCorrectFloat64Atomics
    self.assertIn(f'{inst}.add.f64', asm)
AssertionError: 'red.add.f64' not found in '***BIG STRING***'

(build link)

@bdice could you help me with this? I'm not familiar with the compatibility story for this library, so not sure if failures like that are expected.

I hope we can get wheel size checks running here, but if this would take a lot of effort to debug and if you also want to avoid having a pip install twine pydistcheck on every wheel-building job (for reasons like rapidsai/shared-actions#22 (comment)), then I'd be fine with skipping this repo for the purpose of this release.

Contributor commented:

I don't think I have a good idea of where to start debugging that. @gmarkall may have to help. I think I would prefer to just skip this repository, I don't think we gain much from adding the size checks right now.

@jameslamb (Member Author) commented:

Alright yeah let's skip it, this seems like a good place to cut scope in favor of all the other things we're trying to land in this release.

Let's close this.

@jameslamb (Member Author) commented:

I put up an issue about it: #116

I still think it'd be good to return to at some point.


rapids-logger "validate packages with 'pydistcheck'"

pydistcheck \
    --inspect \
    "$(echo ${wheel_dir_relative_path}/*.whl)"

rapids-logger "validate packages with 'twine'"

twine check \
    --strict \
    "$(echo ${wheel_dir_relative_path}/*.whl)"
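One subtlety in the script above: `"$(echo ${wheel_dir_relative_path}/*.whl)"` collapses the glob into a single space-joined string, which only works because the directory holds exactly one wheel. A hedged sketch of a stricter equivalent lookup (the function name here is illustrative, not part of the PR):

```python
from glob import glob


def find_single_wheel(wheel_dir: str) -> str:
    """Return the one .whl file in wheel_dir, failing loudly otherwise.

    Mirrors the assumption baked into the shell glob: validation tools
    like pydistcheck and twine are pointed at a single wheel path.
    """
    wheels = sorted(glob(f"{wheel_dir}/*.whl"))
    if len(wheels) != 1:
        raise RuntimeError(
            f"expected exactly 1 wheel in {wheel_dir!r}, found {len(wheels)}"
        )
    return wheels[0]
```

If a second wheel ever lands in `final_dist`, the shell version would silently pass a mangled multi-path string to the tools, while a check like this fails with a clear message.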
8 changes: 8 additions & 0 deletions pyproject.toml
@@ -56,5 +56,13 @@ provider = "scikit_build_core.metadata.regex"
input = "pynvjitlink/VERSION"
regex = "(?P<value>.*)"

[tool.pydistcheck]
select = [
    "distro-too-large-compressed",
]

# PyPI limit is 100 MiB, fail CI before we get too close to that
max_allowed_size_compressed = '75M'

[tool.pytest.ini_options]
xfail_strict = true
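The effect of `max_allowed_size_compressed = '75M'` is that the check fails when the built wheel's on-disk (compressed) size crosses the limit, well before PyPI's hard cap. A minimal sketch of that logic, assuming '75M' denotes mebibytes — this is an illustration, not pydistcheck's actual implementation:

```python
import os

# Assumed binary units; pydistcheck's real parsing may differ.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(limit: str) -> int:
    """Turn a size string like '75M' into a byte count."""
    if limit[-1] in _UNITS:
        return int(float(limit[:-1]) * _UNITS[limit[-1]])
    return int(limit)  # bare number: already bytes


def check_compressed_size(wheel_path: str, limit: str = "75M") -> bool:
    """True if the wheel file's size is within the configured limit.

    The .whl is itself a zip archive, so its file size is the
    'compressed' size the config key refers to.
    """
    return os.path.getsize(wheel_path) <= parse_size(limit)
```

Keeping the limit at 75M rather than the 100 MiB PyPI cap leaves headroom, so a gradually growing wheel fails CI before it blocks a release upload.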