ensure dask_cuda.__git_commit__ is populated #1453

Merged
merged 1 commit into from
Feb 21, 2025

jameslamb
Member

Installing dask-cuda like this:

pip install \
  --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/ \
  'dask-cuda==25.4.*,>=0.0.0a0'

The __git_commit__ attribute on the main module isn't populated:

python -c "import dask_cuda; print(dask_cuda.__git_commit__)"

The way this should work is that rapids-build-backend writes a file dask_cuda/GIT_COMMIT which is then read by this code:

try:
    __git_commit__ = (
        importlib.resources.files(__package__)
        .joinpath("GIT_COMMIT")
        .read_text()
        .strip()
    )
except FileNotFoundError:
    __git_commit__ = ""
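
For anyone who wants to try this outside of dask-cuda, here is a minimal sketch of the same fallback pattern wrapped in a function. The helper name `read_git_commit` is hypothetical and not part of dask-cuda or rapids-build-backend; it only assumes the target package is importable and may or may not ship a top-level `GIT_COMMIT` file.

```python
import importlib.resources


def read_git_commit(package: str) -> str:
    """Return the contents of <package>/GIT_COMMIT, or "" if the file was not packaged.

    `read_git_commit` is an illustrative helper, not an API of any RAPIDS project.
    """
    try:
        return (
            importlib.resources.files(package)
            .joinpath("GIT_COMMIT")
            .read_text()
            .strip()
        )
    except FileNotFoundError:
        # The file is absent from the installed package, so fall back to an
        # empty string instead of failing at import time.
        return ""
```

If the file was left out of the wheel, this quietly returns an empty string, which matches the symptom described above.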

I think that what's happening here is this:

  • rapids-build-backend is writing that file
  • the file is not being packaged, because this project uses setuptools + a MANIFEST.in, and that MANIFEST.in does not include that file

This proposes the following:

  • add GIT_COMMIT to MANIFEST.in
  • update RAPIDS-specific pre-commit hooks to their latest versions (not related, but might as well, while we're using a CI run anyway)

Notes for Reviewers

Helpful reference for this... "Controlling files in the distribution" from the setuptools docs: https://setuptools.pypa.io/en/latest/userguide/miscellaneous.html

@jameslamb jameslamb added 2 - In Progress Currently a work in progress improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Feb 20, 2025

copy-pr-bot bot commented Feb 20, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jameslamb
Member Author

/ok to test

@TomAugspurger
Contributor

Here's a run from a previous PR: https://github.com/rapidsai/dask-cuda/actions/runs/13332958081/job/37252862171. There's no GIT_COMMIT in the build output.

The build output from this PR is at https://github.com/rapidsai/dask-cuda/actions/runs/13439129864/job/37548802662?pr=1453#step:11.

There's a warning about no files matching GIT_COMMIT, which is maybe understandable since it's generated during the build? But we do have dask_cuda/VERSION symlinked to VERSION https://github.com/rapidsai/dask-cuda/blob/branch-25.04/dask_cuda/VERSION.

Anyway, what we really care about is that the built wheel has it, and it looks like it does:

(base) toaugspurger@dgx12:/tmp/a$ curl -LO "https://downloads.rapids.ai/ci/dask-cuda/pull-request/1453/c2cd9ba/dask-cuda_wheel_python_dask-cuda.tar.gz"
...
(base) toaugspurger@dgx12:/tmp/a$ tar xvf dask-cuda_wheel_python_dask-cuda.tar.gz
./
./dask_cuda-25.4.0a23-py3-none-any.whl
(base) toaugspurger@dgx12:/tmp/a$ unzip -l dask_cuda-25.4.0a23-py3-none-any.whl | grep GIT_COMMIT
       41  2025-02-20 15:49   dask_cuda/GIT_COMMIT
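
The same check can be scripted. This is a rough sketch that relies only on the standard library's `zipfile` module (a wheel is a zip archive); the helper name `wheel_members_named` is made up for illustration and is not part of any packaging tool.

```python
import zipfile


def wheel_members_named(wheel_path, filename):
    """List (archive path, size in bytes) for wheel members whose basename is `filename`."""
    with zipfile.ZipFile(wheel_path) as whl:
        return [
            (info.filename, info.file_size)
            for info in whl.infolist()
            # Paths inside a zip archive always use forward slashes.
            if info.filename.rsplit("/", 1)[-1] == filename
        ]
```

Calling `wheel_members_named("dask_cuda-25.4.0a23-py3-none-any.whl", "GIT_COMMIT")` on the wheel extracted above should report the same entry that `unzip -l | grep` shows; an empty list would mean the file was not packaged.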

@jameslamb jameslamb marked this pull request as ready for review February 20, 2025 17:04
@jameslamb jameslamb requested a review from a team as a code owner February 20, 2025 17:04
@jameslamb jameslamb requested a review from AyodeAwe February 20, 2025 17:04
@jameslamb jameslamb removed the 2 - In Progress Currently a work in progress label Feb 20, 2025
@jameslamb
Member Author

Yeah thank you for looking carefully at that! I think I can explain what's happening there.

The key point is this from the setuptools docs:

You can think about the build process as two stages: first the sdist will be created and then the wheel will be produced from that sdist.

(docs link)

The first pass through, creating an sdist, is what emits these warnings:

reading manifest template 'MANIFEST.in'
warning: no files found matching 'dask_cuda/GIT_COMMIT'

(build link)

That happens around here in setuptools: https://github.com/pypa/setuptools/blob/ba243756233d0afe944db6e02ddcb70064dcd22c/setuptools/_distutils/command/sdist.py#L342

And at that point, that GIT_COMMIT file doesn't exist because it hasn't been created yet.

But then later on, during wheel-building, those rules in the manifest are executed one at a time.

writing manifest file 'dask_cuda.egg-info/SOURCES.txt'
copying dask_cuda/GIT_COMMIT -> build/lib/dask_cuda
copying dask_cuda/VERSION -> build/lib/dask_cuda

(build link)

And by then, the file exists and gets packaged.

Contributor

@TomAugspurger TomAugspurger left a comment

Looks great, thanks!

@jameslamb
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 30777de into rapidsai:branch-25.04 Feb 21, 2025
36 checks passed
@jameslamb jameslamb deleted the fix/git-commit branch February 21, 2025 21:02
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
4 participants