Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RELEASE] cucim v25.02 #827

Merged
merged 17 commits into from
Feb 13, 2025
Merged

[RELEASE] cucim v25.02 #827

merged 17 commits into from
Feb 13, 2025

Conversation

raydouglass
Copy link
Member

❄️ Code freeze for branch-25.02 and v25.02 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-25.02 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-25.02 into main for the release

raydouglass and others added 16 commits November 15, 2024 09:25
By default, CI runs on draft PRs. This leads to many CI runs that may be unnecessary.

With this PR's change to `.github/copy-pr-bot.yaml`, an `/ok to test` comment from a trusted user is required to trigger CI on draft PRs. Non-draft PRs will run CI by default, assuming that all commits are signed by trusted users. Otherwise an `/ok to test` is required (as before) -- see the `copy-pr-bot` docs at https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/ for more information.

Part of rapidsai/build-planning#123.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #798
Forward-merge branch-24.12 into branch-25.02
Update version references in breaking-change trigger workflow
Forward-merge branch-24.12 into branch-25.02
…809)

This MR is purely to fuse many separate kernels into a single elementwise kernel for each of the "color distance" functions. This is expected to result in substantial performance improvement by reduced kernel launch overhead and because a single pass through the image is much more memory efficient than many separate kernel calls.

There is not expected to be any change in behavior (existing tests must continue to pass)

## Benchmark Results

I added benchmarks for these functions and compared the results before and after this change.

The acceleration values are the relative speedup as compared to the scikit-image implementation

### For a pair of 32-bit LAB images of shape: (512, 512, 3)

function         | acceleration (old) | acceleration (new)
-----------------|--------------------|-------------------
deltaE_cie76     |   5.49             |   21.85
deltaE_ciede94   |   37.60            |   65.09
deltaE_ciede2000 |   12.54            |  109.71
deltaE_cmc       |   27.18            |   89.23

### For a pair of 32-bit LAB images of shape: (3840, 2160, 3)

function         | acceleration (old) | acceleration (new)
-----------------|--------------------|-------------------
deltaE_cie76     |   31.12            |  250.74
deltaE_ciede94   |   79.15            |  117.50
deltaE_ciede2000 |   71.36            |  132.73
deltaE_cmc       |   70.65            |  154.83

Authors:
  - Gregory Lee (https://github.com/grlee77)

Approvers:
  - Gigon Bae (https://github.com/gigony)

URL: #809
The upcoming CuPy 14.0 release has some breaking change in dtype handling, etc. due to changes made to mirror NumPy 2.0 behavior. 

Fortunately most of cuCIM was still working as is when I tested it locally. The changes made here are what was required for all tests to pass when using the current `main` branch of CuPy (14.0 is not yet released)

Authors:
  - Gregory Lee (https://github.com/grlee77)

Approvers:
  - Gigon Bae (https://github.com/gigony)

URL: #808
The pointer variable is used after releasing memory with `free()`, which is a bug that needs to be addressed.

Without this fix, the benchmark code may fail to build or run correctly, resulting in the following error ([link](https://github.com/rapidsai/cucim/actions/runs/12753155354/job/35544254443?pr=811))
```
In function 'void benchmark::DoNotOptimize(Tp&) [with Tp = char*]',
    inlined from 'void string_strcpy(benchmark::State&)' at /opt/conda/conda-bld/work/benchmarks/primitives.cpp:110:33:
/opt/conda/conda-bld/work/build-release/_deps/deps-googlebenchmark-src/src/../include/benchmark/benchmark.h:322:3: error: pointer used after 'void free(void*)' [-Werror=use-after-free]
  322 |   asm volatile("" : "+m,r"(value) : : "memory");
      |   ^~~
```
It seems that the updated version of GCC is now capable of detecting this potential issue.  

This commit resolves the issue by ensuring the memory is released only after `benchmark::DoNotOptimize()` is called.

This PR handles a build issue related to #811
- #811

Authors:
  - Gigon Bae (https://github.com/gigony)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/jakirkham

URL: #812
This MR updates for consistency with scikit-image v0.25 (released on Dec 13, 2024).

The main API change is deprecating footprint functions `square`, `rectangle` and `cube` in favor of a general `footprint_rectangle` method. 

The supported version range of various dependencies have also been bumped.

Authors:
  - Gregory Lee (https://github.com/grlee77)

Approvers:
  - Gigon Bae (https://github.com/gigony)
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/jakirkham

URL: #806
conda-forge is using GCC 13 for CUDA 12 builds. This PR updates CUDA 12 conda builds to use GCC 13, for alignment.

These PRs should be merged in a specific order, see rapidsai/build-planning#129 for details.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/jakirkham

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - https://github.com/jakirkham

URL: #811
This PR uses CUDA 12.8.0 to build and test.

xref: rapidsai/build-planning#139

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #815
`shellcheck` is a fast, static analysis tool for shell scripts. It's good at
flagging up unused variables, unintentional glob expansions, and other potential
execution and security headaches that arise from the wonders of `bash` (and
other shlangs).                   
                                                                                                              
This PR adds a `pre-commit` hook to run `shellcheck` on all of the `sh-lang`
files in the `ci/` directory, and the changes requested by `shellcheck` to make
the existing files pass the check.                              
                                                                                                              
xref: rapidsai/build-planning#135

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #814
This PR points the shared workflow branches back to the default 25.02 branches.

xref: rapidsai/build-planning#139

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #818
I contributed user-specified `axes` support to the scipy.ndimage API (for filtering and morphology functions). SciPy 0.15 is the first public release with these changes.

I originally developed this GPU port of the code in a local cuCIM branch, but submitted it upstream to CuPy and got it merged there before opening this MR. See cupy/cupy#8858 (will first appear in CuPy 14.0).

The test cases here were created before I created the CuPy MR and were later updated/integrated into CuPy's existing style in the MR there. I think we may as well keep the extra tests here as a sanity check that the vendored code continues to work as expected.

## Additional testing
A few miscellaneous other updates were copied over from CuPy so that the full `cupyx.scipy.ndimage`  test suite patches with the vendored ndimage from cuCIM when applying the following patch to CuPy's test suite (monkey-patches the module to use the one from cuCIM)

```diff
diff --git a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_distance_transform.py b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_distance_transform.py
index 97b041c0c..92abce3cb 100644
--- a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_distance_transform.py
+++ b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_distance_transform.py
@@ -6,6 +6,8 @@ import pytest
 import cupy
 from cupy import testing
 import cupyx.scipy.ndimage  # NOQA
+from cucim.skimage._vendored import ndimage as ndi
+cupyx.scipy.ndimage = ndi
 
 try:
     import scipy.ndimage  # NOQA
diff --git a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_filters.py b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_filters.py
index 78cb7c4d0..24726965f 100644
--- a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_filters.py
+++ b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_filters.py
@@ -8,6 +8,8 @@ from cupy.cuda import runtime
 from cupy import testing
 from cupy.exceptions import AxisError
 import cupyx.scipy.ndimage  # NOQA
+from cucim.skimage._vendored import ndimage as ndi
+cupyx.scipy.ndimage = ndi
 
 try:
     import scipy.ndimage  # NOQA
diff --git a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_fourier.py b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_fourier.py
index 5a897a4b8..361b612bb 100644
--- a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_fourier.py
+++ b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_fourier.py
@@ -8,6 +8,8 @@ from cupy.exceptions import AxisError
 import cupyx.scipy.fft  # NOQA
 import cupyx.scipy.fftpack  # NOQA
 import cupyx.scipy.ndimage  # NOQA
+from cucim.skimage._vendored import ndimage as ndi
+cupyx.scipy.ndimage = ndi
 
 try:
     # scipy.fft only available since SciPy 1.4.0
diff --git a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_interpolation.py b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_interpolation.py
index 3308b28c4..16668c53b 100644
--- a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_interpolation.py
+++ b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_interpolation.py
@@ -6,6 +6,8 @@ from cupy.cuda import runtime
 from cupy import testing
 import cupyx.scipy.ndimage
 from cupyx.scipy.ndimage import _util
+from cucim.skimage._vendored import ndimage as ndi
+cupyx.scipy.ndimage = ndi
 
 try:
     import scipy
diff --git a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_measurements.py b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_measurements.py
index e1c06db15..203c3fc29 100644
--- a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_measurements.py
+++ b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_measurements.py
@@ -10,6 +10,8 @@ from cupy import testing
 from cupy import _util
 from cupy._core import _accelerator
 import cupyx.scipy.ndimage  # NOQA
+from cucim.skimage._vendored import ndimage as ndi
+cupyx.scipy.ndimage = ndi
 
 try:
     import scipy
diff --git a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_morphology.py b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_morphology.py
index 8bbcc63d8..6ee5ec86c 100644
--- a/tests/cupyx_tests/scipy_tests/ndimage_tests/test_morphology.py
+++ b/tests/cupyx_tests/scipy_tests/ndimage_tests/test_morphology.py
@@ -4,6 +4,8 @@ import pytest
 
 from cupy import testing
 import cupyx.scipy.ndimage  # NOQA
+from cucim.skimage._vendored import ndimage as ndi
+cupyx.scipy.ndimage = ndi
 
 try:
     import scipy.ndimage  # NOQA
```

Authors:
  - Gregory Lee (https://github.com/grlee77)

Approvers:
  - Gigon Bae (https://github.com/gigony)

URL: #813
Uses a retry wrapper for `pip` commands to try to alleviate CI failures due to hash mismatches that result from network hiccups

xref rapidsai/build-planning#148

This will retry failures that show up in CI like:

```
   Collecting nvidia-cublas-cu12 (from libraft-cu12==25.2.*,>=0.0.0a0)
    Downloading https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.8.3.14-py3-none-manylinux_2_27_aarch64.whl (604.9 MB)
       ━━━━━━━━━━━━━━━━━━━━━                 350.2/604.9 MB 229.2 MB/s eta 0:00:02
  ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
      nvidia-cublas-cu12 from https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.8.3.14-py3-none-manylinux_2_27_aarch64.whl#sha256=93a4e0e386cc7f6e56c822531396de8170ed17068a1e18f987574895044cd8c3 (from libraft-cu12==25.2.*,>=0.0.0a0):
          Expected sha256 93a4e0e386cc7f6e56c822531396de8170ed17068a1e18f987574895044cd8c3
               Got        849c88d155cb4b4a3fdfebff9270fb367c58370b4243a2bdbcb1b9e7e940b7be
```

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - Mike Sarahan (https://github.com/msarahan)
  - Bradley Dice (https://github.com/bdice)

URL: #824
@raydouglass raydouglass requested review from a team as code owners February 7, 2025 19:31
@AyodeAwe AyodeAwe merged commit 2af27cc into main Feb 13, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants