From 140e974bee09a287406c84fde96d643909bdd5b8 Mon Sep 17 00:00:00 2001
From: Gil Forsyth
Date: Wed, 5 Feb 2025 14:49:07 -0500
Subject: [PATCH 01/11] Use `rapids-pip-retry` in CI jobs that might need retries (#2571)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Uses a retry wrapper for `pip` commands to try to alleviate CI failures due to
hash mismatches that result from network hiccups

xref rapidsai/build-planning#148

This will retry failures that show up in CI like:

```
Collecting nvidia-cublas-cu12 (from libraft-cu12==25.2.*,>=0.0.0a0)
  Downloading https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.8.3.14-py3-none-manylinux_2_27_aarch64.whl (604.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━ 350.2/604.9 MB 229.2 MB/s eta 0:00:02
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE.
If you have updated the package versions, please update the hashes. Otherwise,
examine the package contents carefully; someone may have tampered with them.
    nvidia-cublas-cu12 from https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.8.3.14-py3-none-manylinux_2_27_aarch64.whl#sha256=93a4e0e386cc7f6e56c822531396de8170ed17068a1e18f987574895044cd8c3 (from libraft-cu12==25.2.*,>=0.0.0a0):
        Expected sha256 93a4e0e386cc7f6e56c822531396de8170ed17068a1e18f987574895044cd8c3
             Got        849c88d155cb4b4a3fdfebff9270fb367c58370b4243a2bdbcb1b9e7e940b7be
```

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: https://github.com/rapidsai/raft/pull/2571
---
 ci/build_wheel.sh          | 2 +-
 ci/build_wheel_libraft.sh  | 2 +-
 ci/test_wheel_pylibraft.sh | 2 +-
 ci/test_wheel_raft_dask.sh | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/ci/build_wheel.sh b/ci/build_wheel.sh
index 976da98998..e2e8919b95 100755
--- a/ci/build_wheel.sh
+++ b/ci/build_wheel.sh
@@ -41,7 +41,7 @@ sccache --zero-stats
 rapids-logger "Building '${package_name}' wheel"
-python -m pip wheel \
+rapids-pip-retry wheel \
     -w dist \
     -v \
     --no-deps \

diff --git a/ci/build_wheel_libraft.sh b/ci/build_wheel_libraft.sh
index 8ff0da1e9a..10c69e1601 100755
--- a/ci/build_wheel_libraft.sh
+++ b/ci/build_wheel_libraft.sh
@@ -17,7 +17,7 @@ rapids-dependency-file-generator \
   | tee /tmp/requirements-build.txt

 rapids-logger "Installing build requirements"
-python -m pip install \
+rapids-pip-retry install \
     -v \
     --prefer-binary \
     -r /tmp/requirements-build.txt

diff --git a/ci/test_wheel_pylibraft.sh b/ci/test_wheel_pylibraft.sh
index 26f4da267f..0321e41bfb 100755
--- a/ci/test_wheel_pylibraft.sh
+++ b/ci/test_wheel_pylibraft.sh
@@ -10,7 +10,7 @@ RAPIDS_PY_WHEEL_NAME="pylibraft_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels
 # echo to expand wildcard before adding `[extra]` requires for pip
-python -m pip install \
+rapids-pip-retry install \
     ./local-libraft-dep/libraft*.whl \
     "$(echo ./dist/pylibraft*.whl)[test]"

diff --git a/ci/test_wheel_raft_dask.sh b/ci/test_wheel_raft_dask.sh
index c394314aac..da3b40b353
100755
--- a/ci/test_wheel_raft_dask.sh
+++ b/ci/test_wheel_raft_dask.sh
@@ -10,7 +10,7 @@ RAPIDS_PY_WHEEL_NAME="pylibraft_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels
 RAPIDS_PY_WHEEL_NAME="raft_dask_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 python ./dist

 # echo to expand wildcard before adding `[extra]` requires for pip
-python -m pip install -v \
+rapids-pip-retry install -v \
     ./local-libraft-dep/libraft*.whl \
     ./local-pylibraft-dep/pylibraft*.whl \
     "$(echo ./dist/raft_dask_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]"

From a436257f72416ea7f8f666aae14a55dfee111980 Mon Sep 17 00:00:00 2001
From: Michael Schellenberger Costa
Date: Thu, 6 Feb 2025 17:35:17 +0100
Subject: [PATCH 02/11] Take argument by `const&` as the input range is const (#2558)

Found breaking CCCL ci

Authors:
  - Michael Schellenberger Costa (https://github.com/miscco)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: https://github.com/rapidsai/raft/pull/2558
---
 cpp/tests/neighbors/ball_cover.cu | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cpp/tests/neighbors/ball_cover.cu b/cpp/tests/neighbors/ball_cover.cu
index ffd17d6d74..4c5b7ef1c1 100644
--- a/cpp/tests/neighbors/ball_cover.cu
+++ b/cpp/tests/neighbors/ball_cover.cu
@@ -78,7 +78,7 @@ RAFT_KERNEL count_discrepancies_kernel(value_idx* actual_idx,
 }

 struct is_nonzero {
-  __host__ __device__ bool operator()(uint32_t& i) { return i > 0; }
+  __host__ __device__ bool operator()(const uint32_t& i) { return i > 0; }
 };

 template

From ac827456119118438136333483faedcfb0d2dc09 Mon Sep 17 00:00:00 2001
From: Gil Forsyth
Date: Thu, 6 Feb 2025 14:26:26 -0500
Subject: [PATCH 03/11] Add build_type input field for `test.yaml` (#2573)

Exposes `build_type` as an input in `test.yaml` so that `test.yaml` can be
manually run against a specific branch/commit as needed.
The default value is still `nightly`, and without maintainer intervention,
that is what will run each night.

xref rapidsai/build-planning#147

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: https://github.com/rapidsai/raft/pull/2573
---
 .github/workflows/test.yaml | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index 3f234a29f2..c8546e0ed3 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -12,13 +12,16 @@ on:
       sha:
         required: true
         type: string
+      build_type:
+        type: string
+        default: nightly

 jobs:
   conda-cpp-checks:
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@nvks-runners
     with:
-      build_type: nightly
+      build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
       date: ${{ inputs.date }}
       sha: ${{ inputs.sha }}
@@ -27,7 +30,7 @@
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@nvks-runners
     with:
-      build_type: nightly
+      build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
       date: ${{ inputs.date }}
       sha: ${{ inputs.sha }}
@@ -35,7 +38,7 @@
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@nvks-runners
     with:
-      build_type: nightly
+      build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
       date: ${{ inputs.date }}
       sha: ${{ inputs.sha }}
@@ -43,7 +46,7 @@
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@nvks-runners
     with:
-      build_type: nightly
+      build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
       date: ${{ inputs.date }}
       sha: ${{ inputs.sha }}
@@ -52,7 +55,7 @@
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@nvks-runners
     with:
-      build_type: nightly
+      build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
       date: ${{
inputs.date }}
       sha: ${{ inputs.sha }}

From 307e9276d274e36e08c5c3729c9077a12829afac Mon Sep 17 00:00:00 2001
From: Bradley Dice
Date: Fri, 7 Feb 2025 09:19:11 -0800
Subject: [PATCH 04/11] Use shared-workflows branch-25.04 (#2576)

This completes the migration to NVKS runners now that all libraries have been
tested and https://github.com/rapidsai/shared-workflows/pull/273 has been
merged.

xref: https://github.com/rapidsai/build-infra/issues/184

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: https://github.com/rapidsai/raft/pull/2576
---
 .github/workflows/build.yaml                   | 20 ++++++-------
 .github/workflows/pr.yaml                      | 30 +++++++++----------
 .github/workflows/test.yaml                    | 10 +++----
 .../trigger-breaking-change-alert.yaml         |  2 +-
 4 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/.github/workflows/build.yaml b/.github/workflows/build.yaml
index 7421d849a0..d2aca1307a 100644
--- a/.github/workflows/build.yaml
+++ b/.github/workflows/build.yaml
@@ -28,7 +28,7 @@ concurrency:
 jobs:
   cpp-build:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}
@@ -37,7 +37,7 @@
   python-build:
     needs: [cpp-build]
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}
@@ -46,7 +46,7 @@
   upload-conda:
     needs: [cpp-build, python-build]
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-upload-packages.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type ||
'branch' }}
       branch: ${{ inputs.branch }}
@@ -56,7 +56,7 @@
     if: github.ref_type == 'branch'
     needs: python-build
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-25.04
     with:
       arch: "amd64"
       branch: ${{ inputs.branch }}
@@ -68,7 +68,7 @@
       sha: ${{ inputs.sha }}
   wheel-build-libraft:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}
@@ -80,7 +80,7 @@
   wheel-publish-libraft:
     needs: wheel-build-libraft
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}
@@ -91,7 +91,7 @@
   wheel-build-pylibraft:
     needs: wheel-build-libraft
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.04
    with:
       build_type: ${{ inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}
@@ -101,7 +101,7 @@
   wheel-publish-pylibraft:
     needs: wheel-build-pylibraft
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}
@@ -112,7 +112,7 @@
   wheel-build-raft-dask:
     needs: wheel-build-libraft
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.04
     with:
       build_type: ${{
inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}
@@ -122,7 +122,7 @@ jobs:
   wheel-publish-raft-dask:
     needs: wheel-build-raft-dask
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type || 'branch' }}
       branch: ${{ inputs.branch }}

diff --git a/.github/workflows/pr.yaml b/.github/workflows/pr.yaml
index a53d4d693d..67a0d06852 100644
--- a/.github/workflows/pr.yaml
+++ b/.github/workflows/pr.yaml
@@ -28,7 +28,7 @@ jobs:
       - wheel-tests-raft-dask
       - devcontainer
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/pr-builder.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/pr-builder.yaml@branch-25.04
     if: always()
     with:
       needs: ${{ toJSON(needs) }}
@@ -46,7 +46,7 @@
       repo: raft
   changed-files:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/changed-files.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/changed-files.yaml@branch-25.04
     with:
       files_yaml: |
         test_cpp:
@@ -70,47 +70,47 @@
           - '!thirdparty/LICENSES/**'
   checks:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/checks.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/checks.yaml@branch-25.04
     with:
       enable_check_generated_files: false
   conda-cpp-build:
     needs: checks
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@branch-25.04
     with:
       build_type: pull-request
       node_type: cpu16
   conda-cpp-tests:
     needs: [conda-cpp-build, changed-files]
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@branch-25.04
     if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_cpp
     with:
build_type: pull-request
   conda-cpp-checks:
     needs: conda-cpp-build
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@branch-25.04
     with:
       build_type: pull-request
       enable_check_symbols: true
   conda-python-build:
     needs: conda-cpp-build
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-python-build.yaml@branch-25.04
     with:
       build_type: pull-request
   conda-python-tests:
     needs: [conda-python-build, changed-files]
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@branch-25.04
     if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_python
     with:
       build_type: pull-request
   docs-build:
     needs: conda-python-build
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/custom-job.yaml@branch-25.04
     with:
       build_type: pull-request
       node_type: "gpu-l4-latest-1"
@@ -120,7 +120,7 @@
   wheel-build-libraft:
     needs: checks
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.04
     with:
       build_type: pull-request
       branch: ${{ inputs.branch }}
@@ -132,14 +132,14 @@
   wheel-build-pylibraft:
     needs: [checks, wheel-build-libraft]
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.04
     with:
       build_type: pull-request
       script: ci/build_wheel_pylibraft.sh
   wheel-tests-pylibraft:
     needs: [wheel-build-pylibraft, changed-files]
     secrets: inherit
-    uses:
rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-25.04
     if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_python
     with:
       build_type: pull-request
@@ -147,21 +147,21 @@
   wheel-build-raft-dask:
     needs: [checks, wheel-build-libraft]
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.04
     with:
       build_type: pull-request
       script: "ci/build_wheel_raft_dask.sh"
   wheel-tests-raft-dask:
     needs: [wheel-build-raft-dask, changed-files]
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-25.04
     if: fromJSON(needs.changed-files.outputs.changed_file_groups).test_python
     with:
       build_type: pull-request
       script: ci/test_wheel_raft_dask.sh
   devcontainer:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/build-in-devcontainer.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/build-in-devcontainer.yaml@branch-25.04
     with:
       arch: '["amd64"]'
       cuda: '["12.8"]'

diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
index c8546e0ed3..dcae418cc2 100644
--- a/.github/workflows/test.yaml
+++ b/.github/workflows/test.yaml
@@ -19,7 +19,7 @@ jobs:
   conda-cpp-checks:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-post-build-checks.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
@@ -28,7 +28,7 @@
       enable_check_symbols: true
   conda-cpp-tests:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@nvks-runners
+    uses:
rapidsai/shared-workflows/.github/workflows/conda-cpp-tests.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
@@ -36,7 +36,7 @@
       sha: ${{ inputs.sha }}
   conda-python-tests:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/conda-python-tests.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
@@ -44,7 +44,7 @@
       sha: ${{ inputs.sha }}
   wheel-tests-pylibraft:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}
@@ -53,7 +53,7 @@
       script: ci/test_wheel_pylibraft.sh
   wheel-tests-raft-dask:
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/wheels-test.yaml@branch-25.04
     with:
       build_type: ${{ inputs.build_type }}
       branch: ${{ inputs.branch }}

diff --git a/.github/workflows/trigger-breaking-change-alert.yaml b/.github/workflows/trigger-breaking-change-alert.yaml
index 7b5b4810fb..9764c62c15 100644
--- a/.github/workflows/trigger-breaking-change-alert.yaml
+++ b/.github/workflows/trigger-breaking-change-alert.yaml
@@ -12,7 +12,7 @@ jobs:
   trigger-notifier:
     if: contains(github.event.pull_request.labels.*.name, 'breaking')
     secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/breaking-change-alert.yaml@nvks-runners
+    uses: rapidsai/shared-workflows/.github/workflows/breaking-change-alert.yaml@branch-25.04
     with:
       sender_login: ${{ github.event.sender.login }}
       sender_avatar: ${{ github.event.sender.avatar_url }}

From 1aacf2cad143f48723117bbe11ad6a4348b4e20a Mon Sep 17 00:00:00 2001
From: Mike Sarahan
Date: Fri, 7 Feb 2025 16:40:18 -0600
Subject: [PATCH 05/11] update telemetry
 and retarget 25.04 (#2569)

Enables telemetry as a final step in the top-level workflow.

See draft docs at https://github.com/rapidsai/docs/pull/568 for more info.

Part of https://github.com/rapidsai/build-infra/issues/139

Authors:
  - Mike Sarahan (https://github.com/msarahan)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: https://github.com/rapidsai/raft/pull/2569
---
 .github/workflows/pr.yaml | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/.github/workflows/pr.yaml b/.github/workflows/pr.yaml
index 67a0d06852..7a1b535966 100644
--- a/.github/workflows/pr.yaml
+++ b/.github/workflows/pr.yaml
@@ -27,11 +27,23 @@ jobs:
       - wheel-build-raft-dask
       - wheel-tests-raft-dask
       - devcontainer
+      - telemetry-setup
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/pr-builder.yaml@branch-25.04
     if: always()
     with:
       needs: ${{ toJSON(needs) }}
+  telemetry-setup:
+    runs-on: ubuntu-latest
+    continue-on-error: true
+    env:
+      OTEL_SERVICE_NAME: "pr-raft"
+    steps:
+      - name: Telemetry setup
+        # This gate is here and not at the job level because we need the job to not be skipped,
+        # since other jobs depend on it.
+        if: ${{ vars.TELEMETRY_ENABLED == 'true' }}
+        uses: rapidsai/shared-actions/telemetry-dispatch-stash-base-env-vars@main
   check-nightly-ci:
     # Switch to ubuntu-latest once it defaults to a version of Ubuntu that
     # provides at least Python 3.11 (see
@@ -46,6 +58,7 @@
       repo: raft
   changed-files:
     secrets: inherit
+    needs: telemetry-setup
     uses: rapidsai/shared-workflows/.github/workflows/changed-files.yaml@branch-25.04
     with:
       files_yaml: |
@@ -70,9 +83,11 @@
           - '!thirdparty/LICENSES/**'
   checks:
     secrets: inherit
+    needs: telemetry-setup
     uses: rapidsai/shared-workflows/.github/workflows/checks.yaml@branch-25.04
     with:
       enable_check_generated_files: false
+      ignored_pr_jobs: telemetry-summarize
   conda-cpp-build:
     needs: checks
     secrets: inherit
@@ -160,6 +175,7 @@
       build_type: pull-request
       script: ci/test_wheel_raft_dask.sh
   devcontainer:
+    needs: telemetry-setup
     secrets: inherit
     uses: rapidsai/shared-workflows/.github/workflows/build-in-devcontainer.yaml@branch-25.04
     with:
@@ -169,3 +185,12 @@
         sccache -z;
         build-all -DBUILD_PRIMS_BENCH=ON --verbose;
         sccache -s;
+  telemetry-summarize:
+    # This job must use a self-hosted runner to record telemetry traces.
+    runs-on: linux-amd64-cpu4
+    needs: pr-builder
+    if: ${{ vars.TELEMETRY_ENABLED == 'true' && !cancelled() }}
+    continue-on-error: true
+    steps:
+      - name: Telemetry summarize
+        uses: rapidsai/shared-actions/telemetry-dispatch-summarize@main

From 1a8e38b4ca1e63fcfceb164d578c6ab63d2282ef Mon Sep 17 00:00:00 2001
From: Gil Forsyth
Date: Mon, 10 Feb 2025 09:38:07 -0500
Subject: [PATCH 06/11] Add `shellcheck` to pre-commit and fix warnings (#2575)

`shellcheck` is a fast, static analysis tool for shell scripts. It's good at
flagging up unused variables, unintentional glob expansions, and other
potential execution and security headaches that arise from the wonders of
`bash` (and other shlangs).
This PR adds a `pre-commit` hook to run `shellcheck` on all of the `sh-lang`
files in the `ci/` directory, and the changes requested by `shellcheck` to
make the existing files pass the check.

xref: rapidsai/build-planning#135

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: https://github.com/rapidsai/raft/pull/2575
---
 .pre-commit-config.yaml      |  6 ++++++
 ci/build_docs.sh             |  6 ++++--
 ci/build_python.sh           |  1 -
 ci/build_wheel.sh            |  4 ++--
 ci/build_wheel_libraft.sh    |  4 +---
 ci/build_wheel_pylibraft.sh  |  4 ++--
 ci/build_wheel_raft_dask.sh  |  4 ++--
 ci/check_style.sh            |  4 ++--
 ci/checks/black_lists.sh     | 18 +++++++++---------
 ci/release/update-version.sh | 13 ++++++------7
 ci/test_cpp.sh               |  2 +-
 ci/test_wheel_pylibraft.sh   |  2 +-
 ci/test_wheel_raft_dask.sh   |  4 ++--
 ci/validate_wheel.sh         |  6 ++----
 14 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 6dfcc72417..21dc20e776 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -127,6 +127,12 @@ repos:
     hooks:
       - id: rapids-dependency-file-generator
         args: ["--clean"]
+  - repo: https://github.com/shellcheck-py/shellcheck-py
+    rev: v0.10.0.1
+    hooks:
+      - id: shellcheck
+        args: ["--severity=warning"]
+        files: ^ci/

 default_language_version:
       python: python3

diff --git a/ci/build_docs.sh b/ci/build_docs.sh
index aff7674892..54dbb2b599 100755
--- a/ci/build_docs.sh
+++ b/ci/build_docs.sh
@@ -7,7 +7,8 @@ rapids-logger "Create test conda environment"
 .
/opt/conda/etc/profile.d/conda.sh

 RAPIDS_VERSION="$(rapids-version)"
-export RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"
+RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"
+export RAPIDS_VERSION_MAJOR_MINOR

 rapids-dependency-file-generator \
   --output conda \
@@ -31,7 +32,8 @@ rapids-mamba-retry install \
   "pylibraft=${RAPIDS_VERSION}" \
   "raft-dask=${RAPIDS_VERSION}"

-export RAPIDS_DOCS_DIR="$(mktemp -d)"
+RAPIDS_DOCS_DIR="$(mktemp -d)"
+export RAPIDS_DOCS_DIR

 rapids-logger "Build CPP docs"
 pushd cpp/doxygen

diff --git a/ci/build_python.sh b/ci/build_python.sh
index 7da665075f..131f99c212 100755
--- a/ci/build_python.sh
+++ b/ci/build_python.sh
@@ -18,7 +18,6 @@ rapids-logger "Begin py build"
 CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)

 version=$(rapids-generate-version)
-git_commit=$(git rev-parse HEAD)
 export RAPIDS_PACKAGE_VERSION=${version}
 echo "${version}" > VERSION

diff --git a/ci/build_wheel.sh b/ci/build_wheel.sh
index e2e8919b95..845ac128f2 100755
--- a/ci/build_wheel.sh
+++ b/ci/build_wheel.sh
@@ -15,7 +15,7 @@ rm -rf /usr/lib64/libuc*
 source rapids-configure-sccache
 source rapids-date-string

-RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
+RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen "${RAPIDS_CUDA_VERSION}")"

 rapids-generate-version > ./VERSION
@@ -53,4 +53,4 @@ sccache --show-adv-stats
 mkdir -p final_dist
 python -m auditwheel repair -w final_dist "${EXCLUDE_ARGS[@]}" dist/*

-RAPIDS_PY_WHEEL_NAME="${underscore_package_name}_${RAPIDS_PY_CUDA_SUFFIX}" rapids-upload-wheels-to-s3 ${package_type} final_dist
+RAPIDS_PY_WHEEL_NAME="${underscore_package_name}_${RAPIDS_PY_CUDA_SUFFIX}" rapids-upload-wheels-to-s3 "${package_type}" final_dist

diff --git a/ci/build_wheel_libraft.sh b/ci/build_wheel_libraft.sh
index 10c69e1601..4468da37cd 100755
--- a/ci/build_wheel_libraft.sh
+++ b/ci/build_wheel_libraft.sh
@@ -26,7 +26,5 @@
 # 0 really means "add --no-build-isolation" (ref:
https://github.com/pypa/pip/issues/5735)
 export PIP_NO_BUILD_ISOLATION=0

-RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
-
 ci/build_wheel.sh libraft ${package_dir} cpp
-ci/validate_wheel.sh ${package_dir} final_dist libraft
+ci/validate_wheel.sh ${package_dir} final_dist

diff --git a/ci/build_wheel_pylibraft.sh b/ci/build_wheel_pylibraft.sh
index 6f74e0e8c5..aed58446d2 100755
--- a/ci/build_wheel_pylibraft.sh
+++ b/ci/build_wheel_pylibraft.sh
@@ -5,7 +5,7 @@ set -euo pipefail
 package_dir="python/pylibraft"

-RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
+RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen "${RAPIDS_CUDA_VERSION}")"

 # Downloads libraft wheels from this current build,
 # then ensures 'pylibraft' wheel builds always use the 'libraft' just built in the same CI run.
@@ -17,4 +17,4 @@ echo "libraft-${RAPIDS_PY_CUDA_SUFFIX} @ file://$(echo /tmp/libraft_dist/libraft
 export PIP_CONSTRAINT="/tmp/constraints.txt"

 ci/build_wheel.sh pylibraft ${package_dir} python
-ci/validate_wheel.sh ${package_dir} final_dist pylibraft
+ci/validate_wheel.sh ${package_dir} final_dist

diff --git a/ci/build_wheel_raft_dask.sh b/ci/build_wheel_raft_dask.sh
index 0cacb6fe30..8241f7aff2 100755
--- a/ci/build_wheel_raft_dask.sh
+++ b/ci/build_wheel_raft_dask.sh
@@ -5,7 +5,7 @@ set -euo pipefail
 package_dir="python/raft-dask"

-RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
+RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen "${RAPIDS_CUDA_VERSION}")"

 # Downloads libraft wheels from this current build,
 # then ensures 'raft-dask' wheel builds always use the 'libraft' just built in the same CI run.
@@ -17,4 +17,4 @@ echo "libraft-${RAPIDS_PY_CUDA_SUFFIX} @ file://$(echo /tmp/libraft_dist/libraft
 export PIP_CONSTRAINT="/tmp/constraints.txt"

 ci/build_wheel.sh raft-dask ${package_dir} python
-ci/validate_wheel.sh ${package_dir} final_dist raft-dask
+ci/validate_wheel.sh ${package_dir} final_dist

diff --git a/ci/check_style.sh b/ci/check_style.sh
index e0c30a2d41..3505035af8 100755
--- a/ci/check_style.sh
+++ b/ci/check_style.sh
@@ -18,8 +18,8 @@ conda activate checks
 RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"
 FORMAT_FILE_URL="https://raw.githubusercontent.com/rapidsai/rapids-cmake/branch-${RAPIDS_VERSION_MAJOR_MINOR}/cmake-format-rapids-cmake.json"
 export RAPIDS_CMAKE_FORMAT_FILE=/tmp/rapids_cmake_ci/cmake-formats-rapids-cmake.json
-mkdir -p $(dirname ${RAPIDS_CMAKE_FORMAT_FILE})
-wget -O ${RAPIDS_CMAKE_FORMAT_FILE} ${FORMAT_FILE_URL}
+mkdir -p "$(dirname ${RAPIDS_CMAKE_FORMAT_FILE})"
+wget -O ${RAPIDS_CMAKE_FORMAT_FILE} "${FORMAT_FILE_URL}"

 # Run pre-commit checks
 pre-commit run --all-files --show-diff-on-failure

diff --git a/ci/checks/black_lists.sh b/ci/checks/black_lists.sh
index cf289c120c..df43b17b1b 100755
--- a/ci/checks/black_lists.sh
+++ b/ci/checks/black_lists.sh
@@ -6,7 +6,7 @@
 # PR_TARGET_BRANCH is set by the CI environment

-git checkout --quiet $PR_TARGET_BRANCH
+git checkout --quiet "$PR_TARGET_BRANCH"

 # Switch back to tip of PR branch
 git checkout --quiet current-pr-branch
@@ -20,16 +20,16 @@ set +H
 RETVAL=0

 for black_listed in cudaDeviceSynchronize cudaMalloc cudaMallocManaged cudaFree cudaMallocHost cudaHostAlloc cudaFreeHost; do
-    TMP=`git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$black_listed" $PR_TARGET_BRANCH | grep '^+' | grep -v '^+++' | grep "$black_listed"`
+    TMP=$(git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$black_listed" "$PR_TARGET_BRANCH" | grep '^+' | grep -v '^+++' | grep "$black_listed")
     if [ "$TMP" != "" ]; then
-        for filename in `git --no-pager diff --ignore-submodules -w --minimal
--name-only -S"$black_listed" $PR_TARGET_BRANCH`; do
+        for filename in $(git --no-pager diff --ignore-submodules -w --minimal --name-only -S"$black_listed" "$PR_TARGET_BRANCH"); do
             basefilename=$(basename -- "$filename")
             filext="${basefilename##*.}"
             if [ "$filext" != "md" ] && [ "$filext" != "sh" ]; then
-                TMP2=`git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$black_listed" $PR_TARGET_BRANCH -- $filename | grep '^+' | grep -v '^+++' | grep "$black_listed" | grep -vE "^\+[[:space:]]*/{2,}.*$black_listed"`
+                TMP2=$(git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$black_listed" "$PR_TARGET_BRANCH" -- "$filename" | grep '^+' | grep -v '^+++' | grep "$black_listed" | grep -vE "^\+[[:space:]]*/{2,}.*$black_listed")
                 if [ "$TMP2" != "" ]; then
                     echo "=== ERROR: black listed function call $black_listed added to $filename ==="
-                    git --no-pager diff --ignore-submodules -w --minimal -S"$black_listed" $PR_TARGET_BRANCH -- $filename
+                    git --no-pager diff --ignore-submodules -w --minimal -S"$black_listed" "$PR_TARGET_BRANCH" -- "$filename"
                     echo "=== END ERROR ==="
                     RETVAL=1
                 fi
@@ -39,17 +39,17 @@
 done

 for cond_black_listed in cudaMemcpy cudaMemset; do
-    TMP=`git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$cond_black_listed" $PR_TARGET_BRANCH | grep '^+' | grep -v '^+++' | grep -P "$cond_black_listed(?!Async)"`
+    TMP=$(git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$cond_black_listed" "$PR_TARGET_BRANCH" | grep '^+' | grep -v '^+++' | grep -P "$cond_black_listed(?!Async)")
     if [ "$TMP" != "" ]; then
-        for filename in `git --no-pager diff --ignore-submodules -w --minimal --name-only -S"$cond_black_listed" $PR_TARGET_BRANCH`; do
+        for filename in $(git --no-pager diff --ignore-submodules -w --minimal --name-only -S"$cond_black_listed" "$PR_TARGET_BRANCH"); do
             basefilename=$(basename -- "$filename")
             filext="${basefilename##*.}"
             if [ "$filext" != "md" ] && [
"$filext" != "sh" ]; then
-                TMP2=`git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$cond_black_listed" $PR_TARGET_BRANCH -- $filename | grep '^+' | grep -v '^+++' | grep -P "$cond_black_listed(?!Async)" | grep -vE "^\+[[:space:]]*/{2,}.*$cond_black_listed"`
+                TMP2=$(git --no-pager diff --ignore-submodules -w --minimal -U0 -S"$cond_black_listed" "$PR_TARGET_BRANCH" -- "$filename" | grep '^+' | grep -v '^+++' | grep -P "$cond_black_listed(?!Async)" | grep -vE "^\+[[:space:]]*/{2,}.*$cond_black_listed")
                 if [ "$TMP2" != "" ]; then
                     echo "=== ERROR: black listed function call $cond_black_listed added to $filename ==="
-                    git --no-pager diff --ignore-submodules -w --minimal -S"$cond_black_listed" $PR_TARGET_BRANCH -- $filename
+                    git --no-pager diff --ignore-submodules -w --minimal -S"$cond_black_listed" "$PR_TARGET_BRANCH" -- "$filename"
                     echo "=== END ERROR ==="
                     RETVAL=1
                 fi

diff --git a/ci/release/update-version.sh b/ci/release/update-version.sh
index 1ab9157b89..244f66e99a 100755
--- a/ci/release/update-version.sh
+++ b/ci/release/update-version.sh
@@ -13,16 +13,15 @@ NEXT_FULL_TAG=$1

 # Get current version
 CURRENT_TAG=$(git tag --merged HEAD | grep -xE '^v.*' | sort --version-sort | tail -n 1 | tr -d 'v')
-CURRENT_MAJOR=$(echo $CURRENT_TAG | awk '{split($0, a, "."); print a[1]}')
-CURRENT_MINOR=$(echo $CURRENT_TAG | awk '{split($0, a, "."); print a[2]}')
-CURRENT_PATCH=$(echo $CURRENT_TAG | awk '{split($0, a, "."); print a[3]}')
+CURRENT_MAJOR=$(echo "$CURRENT_TAG" | awk '{split($0, a, "."); print a[1]}')
+CURRENT_MINOR=$(echo "$CURRENT_TAG" | awk '{split($0, a, "."); print a[2]}')
 CURRENT_SHORT_TAG=${CURRENT_MAJOR}.${CURRENT_MINOR}

 # Get .
for next version -NEXT_MAJOR=$(echo $NEXT_FULL_TAG | awk '{split($0, a, "."); print a[1]}') -NEXT_MINOR=$(echo $NEXT_FULL_TAG | awk '{split($0, a, "."); print a[2]}') +NEXT_MAJOR=$(echo "$NEXT_FULL_TAG" | awk '{split($0, a, "."); print a[1]}') +NEXT_MINOR=$(echo "$NEXT_FULL_TAG" | awk '{split($0, a, "."); print a[2]}') NEXT_SHORT_TAG=${NEXT_MAJOR}.${NEXT_MINOR} -NEXT_UCXX_SHORT_TAG="$(curl -sL https://version.gpuci.io/rapids/${NEXT_SHORT_TAG})" +NEXT_UCXX_SHORT_TAG="$(curl -sL https://version.gpuci.io/rapids/"${NEXT_SHORT_TAG}")" # Need to distutils-normalize the original version NEXT_SHORT_TAG_PEP440=$(python -c "from packaging.version import Version; print(Version('${NEXT_SHORT_TAG}'))") @@ -32,7 +31,7 @@ echo "Preparing release $CURRENT_TAG => $NEXT_FULL_TAG" # Inplace sed replace; workaround for Linux and Mac function sed_runner() { - sed -i.bak ''"$1"'' $2 && rm -f ${2}.bak + sed -i.bak ''"$1"'' "$2" && rm -f "${2}".bak } sed_runner 's/'"find_and_configure_ucxx(VERSION .*"'/'"find_and_configure_ucxx(VERSION ${NEXT_UCXX_SHORT_TAG_PEP440}"'/g' python/raft-dask/cmake/thirdparty/get_ucxx.cmake diff --git a/ci/test_cpp.sh b/ci/test_cpp.sh index 9d0edc6b21..64400858ec 100755 --- a/ci/test_cpp.sh +++ b/ci/test_cpp.sh @@ -43,4 +43,4 @@ export GTEST_OUTPUT=xml:${RAPIDS_TESTS_DIR}/ ./ci/run_ctests.sh -j8 && EXITCODE=$? 
|| EXITCODE=$?; rapids-logger "Test script exiting with value: $EXITCODE" -exit ${EXITCODE} +exit "${EXITCODE}" diff --git a/ci/test_wheel_pylibraft.sh b/ci/test_wheel_pylibraft.sh index 0321e41bfb..aba0614767 100755 --- a/ci/test_wheel_pylibraft.sh +++ b/ci/test_wheel_pylibraft.sh @@ -4,7 +4,7 @@ set -euo pipefail mkdir -p ./dist -RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})" +RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen "${RAPIDS_CUDA_VERSION}")" RAPIDS_PY_WHEEL_NAME="libraft_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 cpp ./local-libraft-dep RAPIDS_PY_WHEEL_NAME="pylibraft_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 python ./dist diff --git a/ci/test_wheel_raft_dask.sh b/ci/test_wheel_raft_dask.sh index da3b40b353..e38b278d05 100755 --- a/ci/test_wheel_raft_dask.sh +++ b/ci/test_wheel_raft_dask.sh @@ -4,7 +4,7 @@ set -euo pipefail mkdir -p ./dist -RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})" +RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen "${RAPIDS_CUDA_VERSION}")" RAPIDS_PY_WHEEL_NAME="libraft_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 cpp ./local-libraft-dep RAPIDS_PY_WHEEL_NAME="pylibraft_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 python ./local-pylibraft-dep RAPIDS_PY_WHEEL_NAME="raft_dask_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 python ./dist @@ -13,7 +13,7 @@ RAPIDS_PY_WHEEL_NAME="raft_dask_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels rapids-pip-retry install -v \ ./local-libraft-dep/libraft*.whl \ ./local-pylibraft-dep/pylibraft*.whl \ - "$(echo ./dist/raft_dask_${RAPIDS_PY_CUDA_SUFFIX}*.whl)[test]" + "$(echo ./dist/raft_dask_"${RAPIDS_PY_CUDA_SUFFIX}"*.whl)[test]" test_dir="python/raft-dask/raft_dask/tests" diff --git a/ci/validate_wheel.sh b/ci/validate_wheel.sh index ec3867aa30..5c2c5f2cf6 100755 --- a/ci/validate_wheel.sh +++ b/ci/validate_wheel.sh @@ -5,9 +5,7 @@ set -euo pipefail package_dir=$1 
wheel_dir_relative_path=$2 -package_name=$3 -RAPIDS_CUDA_MAJOR="${RAPIDS_CUDA_VERSION%%.*}" cd "${package_dir}" @@ -15,10 +13,10 @@ rapids-logger "validate packages with 'pydistcheck'" pydistcheck \ --inspect \ - "$(echo ${wheel_dir_relative_path}/*.whl)" + "$(echo "${wheel_dir_relative_path}"/*.whl)" rapids-logger "validate packages with 'twine'" twine check \ --strict \ - "$(echo ${wheel_dir_relative_path}/*.whl)" + "$(echo "${wheel_dir_relative_path}"/*.whl)" From 1795caaa5305bd6cbaa96f742fc97890cc84534f Mon Sep 17 00:00:00 2001 From: Vyas Ramasubramani Date: Mon, 10 Feb 2025 20:54:23 -0800 Subject: [PATCH 07/11] Use new rapids-logger library (#2566) Contributes to https://github.com/rapidsai/build-planning/issues/104. Authors: - Vyas Ramasubramani (https://github.com/vyasr) - Divye Gala (https://github.com/divyegala) Approvers: - Corey J. Nolet (https://github.com/cjnolet) - Bradley Dice (https://github.com/bdice) - James Lamb (https://github.com/jameslamb) URL: https://github.com/rapidsai/raft/pull/2566 --- ci/build_cpp.sh | 3 +- ci/build_wheel.sh | 1 + .../all_cuda-118_arch-aarch64.yaml | 2 +- .../all_cuda-118_arch-x86_64.yaml | 2 +- .../all_cuda-128_arch-aarch64.yaml | 2 +- .../all_cuda-128_arch-x86_64.yaml | 2 +- conda/recipes/libraft/conda_build_config.yaml | 6 - conda/recipes/libraft/meta.yaml | 4 +- cpp/CMakeLists.txt | 23 ++-- cpp/bench/prims/CMakeLists.txt | 4 - cpp/cmake/thirdparty/get_spdlog.cmake | 24 ---- .../raft/cluster/detail/kmeans_balanced.cuh | 1 - cpp/include/raft/cluster/kmeans_types.hpp | 5 +- cpp/include/raft/core/logger-macros.hpp | 31 ----- cpp/include/raft/core/logger.hpp | 78 ++++++++++++ .../raft/neighbors/detail/ivf_flat_build.cuh | 1 - .../neighbors/detail/ivf_flat_search-inl.cuh | 1 - cpp/tests/CMakeLists.txt | 6 - cpp/tests/core/device_resources_manager.cpp | 3 +- cpp/tests/core/logger.cpp | 118 ------------------ dependencies.yaml | 14 ++- python/libraft/libraft/load.py | 9 ++ python/libraft/pyproject.toml | 3 + 23 files 
changed, 126 insertions(+), 217 deletions(-) delete mode 100644 cpp/cmake/thirdparty/get_spdlog.cmake delete mode 100644 cpp/include/raft/core/logger-macros.hpp create mode 100644 cpp/include/raft/core/logger.hpp delete mode 100644 cpp/tests/core/logger.cpp diff --git a/ci/build_cpp.sh b/ci/build_cpp.sh index 92586c7c0a..06bd2901b2 100755 --- a/ci/build_cpp.sh +++ b/ci/build_cpp.sh @@ -17,7 +17,8 @@ rapids-logger "Begin cpp build" sccache --zero-stats -RAPIDS_PACKAGE_VERSION=$(rapids-generate-version) rapids-conda-retry mambabuild conda/recipes/libraft +RAPIDS_PACKAGE_VERSION=$(rapids-generate-version) rapids-conda-retry mambabuild \ + conda/recipes/libraft sccache --show-adv-stats diff --git a/ci/build_wheel.sh b/ci/build_wheel.sh index 845ac128f2..74ddc11f4d 100755 --- a/ci/build_wheel.sh +++ b/ci/build_wheel.sh @@ -28,6 +28,7 @@ EXCLUDE_ARGS=( --exclude "libcusolver.so.*" --exclude "libcusparse.so.*" --exclude "libnvJitLink.so.*" + --exclude "librapids_logger.so" --exclude "libucp.so.*" ) diff --git a/conda/environments/all_cuda-118_arch-aarch64.yaml b/conda/environments/all_cuda-118_arch-aarch64.yaml index af9534925b..2cf400550f 100644 --- a/conda/environments/all_cuda-118_arch-aarch64.yaml +++ b/conda/environments/all_cuda-118_arch-aarch64.yaml @@ -47,12 +47,12 @@ dependencies: - pytest==7.* - rapids-build-backend>=0.3.0,<0.4.0.dev0 - rapids-dask-dependency==25.4.*,>=0.0.0a0 +- rapids-logger==0.1.*,>=0.0.0a0 - recommonmark - rmm==25.4.*,>=0.0.0a0 - scikit-build-core>=0.10.0 - scikit-learn - scipy -- spdlog>=1.14.1,<1.15 - sphinx-copybutton - sphinx-markdown-tables - sysroot_linux-aarch64==2.28 diff --git a/conda/environments/all_cuda-118_arch-x86_64.yaml b/conda/environments/all_cuda-118_arch-x86_64.yaml index 2b60e01fd0..9aa9e4ef62 100644 --- a/conda/environments/all_cuda-118_arch-x86_64.yaml +++ b/conda/environments/all_cuda-118_arch-x86_64.yaml @@ -47,12 +47,12 @@ dependencies: - pytest==7.* - rapids-build-backend>=0.3.0,<0.4.0.dev0 - 
rapids-dask-dependency==25.4.*,>=0.0.0a0 +- rapids-logger==0.1.*,>=0.0.0a0 - recommonmark - rmm==25.4.*,>=0.0.0a0 - scikit-build-core>=0.10.0 - scikit-learn - scipy -- spdlog>=1.14.1,<1.15 - sphinx-copybutton - sphinx-markdown-tables - sysroot_linux-64==2.28 diff --git a/conda/environments/all_cuda-128_arch-aarch64.yaml b/conda/environments/all_cuda-128_arch-aarch64.yaml index 36e87bad27..f180b766b3 100644 --- a/conda/environments/all_cuda-128_arch-aarch64.yaml +++ b/conda/environments/all_cuda-128_arch-aarch64.yaml @@ -43,12 +43,12 @@ dependencies: - pytest==7.* - rapids-build-backend>=0.3.0,<0.4.0.dev0 - rapids-dask-dependency==25.4.*,>=0.0.0a0 +- rapids-logger==0.1.*,>=0.0.0a0 - recommonmark - rmm==25.4.*,>=0.0.0a0 - scikit-build-core>=0.10.0 - scikit-learn - scipy -- spdlog>=1.14.1,<1.15 - sphinx-copybutton - sphinx-markdown-tables - sysroot_linux-aarch64==2.28 diff --git a/conda/environments/all_cuda-128_arch-x86_64.yaml b/conda/environments/all_cuda-128_arch-x86_64.yaml index 0843620f32..a098636ea6 100644 --- a/conda/environments/all_cuda-128_arch-x86_64.yaml +++ b/conda/environments/all_cuda-128_arch-x86_64.yaml @@ -43,12 +43,12 @@ dependencies: - pytest==7.* - rapids-build-backend>=0.3.0,<0.4.0.dev0 - rapids-dask-dependency==25.4.*,>=0.0.0a0 +- rapids-logger==0.1.*,>=0.0.0a0 - recommonmark - rmm==25.4.*,>=0.0.0a0 - scikit-build-core>=0.10.0 - scikit-learn - scipy -- spdlog>=1.14.1,<1.15 - sphinx-copybutton - sphinx-markdown-tables - sysroot_linux-64==2.28 diff --git a/conda/recipes/libraft/conda_build_config.yaml b/conda/recipes/libraft/conda_build_config.yaml index 11b16bc2a8..1386116d81 100644 --- a/conda/recipes/libraft/conda_build_config.yaml +++ b/conda/recipes/libraft/conda_build_config.yaml @@ -56,9 +56,3 @@ cuda11_cuda_profiler_api_host_version: cuda11_cuda_profiler_api_run_version: - ">=11.4.240,<12" - -spdlog_version: - - ">=1.14.1,<1.15" - -fmt_version: - - ">=11.0.2,<12" diff --git a/conda/recipes/libraft/meta.yaml 
b/conda/recipes/libraft/meta.yaml index dbde4e3971..9faa7b84ee 100644 --- a/conda/recipes/libraft/meta.yaml +++ b/conda/recipes/libraft/meta.yaml @@ -63,6 +63,7 @@ outputs: - cuda-cudart-dev {% endif %} - librmm ={{ minor_version }} + - rapids-logger =0.1 run: - {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }} {% if cuda_major == "11" %} @@ -71,8 +72,7 @@ outputs: - cuda-cudart {% endif %} - librmm ={{ minor_version }} - - spdlog {{ spdlog_version }} - - fmt {{ fmt_version }} + - rapids-logger =0.1 about: home: https://rapids.ai/ license: Apache-2.0 diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt index c38471bebd..436b120872 100644 --- a/cpp/CMakeLists.txt +++ b/cpp/CMakeLists.txt @@ -167,8 +167,8 @@ include(cmake/modules/ConfigureCUDA.cmake) rapids_cpm_init() include(${rapids-cmake-dir}/cpm/rapids_logger.cmake) -rapids_cpm_rapids_logger() -rapids_make_logger(raft LOGGER_HEADER_DIR include/raft/core EXPORT_SET raft-exports) +rapids_cpm_rapids_logger(BUILD_EXPORT_SET raft-exports INSTALL_EXPORT_SET raft-exports) +create_logger_macros(RAFT "raft::default_logger()" include/raft/core) # CCCL before rmm/cuco so we get the right version of CCCL include(cmake/thirdparty/get_cccl.cmake) @@ -194,13 +194,14 @@ add_library(raft INTERFACE) add_library(raft::raft ALIAS raft) target_include_directories( - raft INTERFACE "$" "$" + raft INTERFACE "$" + "$" "$" ) # Keep RAFT as lightweight as possible. Only CUDA libs and rmm should be used in global target. 
target_link_libraries( - raft INTERFACE rmm::rmm rmm::rmm_logger spdlog::spdlog_header_only cuco::cuco - nvidia::cutlass::cutlass CCCL::CCCL raft_logger + raft INTERFACE rapids_logger::rapids_logger rmm::rmm cuco::cuco nvidia::cutlass::cutlass + CCCL::CCCL ) target_compile_features(raft INTERFACE cxx_std_17 $) @@ -209,7 +210,7 @@ target_compile_options( --expt-relaxed-constexpr> ) target_compile_definitions( - raft INTERFACE "RAFT_LOG_ACTIVE_LEVEL=RAFT_LOG_LEVEL_${LIBRAFT_LOGGING_LEVEL}" + raft INTERFACE "RAFT_LOG_ACTIVE_LEVEL=RAPIDS_LOGGER_LOG_LEVEL_${LIBRAFT_LOGGING_LEVEL}" ) set(RAFT_CUSOLVER_DEPENDENCY CUDA::cusolver${_ctk_static_suffix}) @@ -311,13 +312,11 @@ if(RAFT_COMPILE_LIBRARY) "$<$:${RAFT_CUDA_FLAGS}>" ) - # Make sure not to add the rmm logger twice since it will be brought in as an interface source by - # the rmm::rmm_logger_impl target. - add_library(raft_lib SHARED $,EXCLUDE,rmm.*logger>) + add_library(raft_lib SHARED $) set(_raft_lib_targets raft_lib) if(NOT RAFT_COMPILE_DYNAMIC_ONLY) - add_library(raft_lib_static STATIC $,EXCLUDE,rmm.*logger>) + add_library(raft_lib_static STATIC $) list(APPEND _raft_lib_targets raft_lib_static) endif() @@ -344,10 +343,6 @@ if(RAFT_COMPILE_LIBRARY) # ensure CUDA symbols aren't relocated to the middle of the debug build binaries target_link_options(${target} PRIVATE "${CMAKE_CURRENT_BINARY_DIR}/fatbin.ld") endforeach() - target_link_libraries(raft_lib PRIVATE rmm::rmm_logger_impl raft_logger_impl) - if(NOT RAFT_COMPILE_DYNAMIC_ONLY) - target_link_libraries(raft_lib_static PRIVATE rmm::rmm_logger_impl raft_logger_impl) - endif() endif() if(TARGET raft_lib AND (NOT TARGET raft::raft_lib)) diff --git a/cpp/bench/prims/CMakeLists.txt b/cpp/bench/prims/CMakeLists.txt index edc1af4e02..cf03a36612 100644 --- a/cpp/bench/prims/CMakeLists.txt +++ b/cpp/bench/prims/CMakeLists.txt @@ -32,7 +32,6 @@ function(ConfigureBench) PRIVATE raft::raft raft_internal $<$:raft::compiled> - $<$>:bench_rmm_logger> 
${RAFT_CTK_MATH_DEPENDENCIES} benchmark::benchmark Threads::Threads @@ -74,9 +73,6 @@ function(ConfigureBench) endfunction() -add_library(bench_rmm_logger OBJECT) -target_link_libraries(bench_rmm_logger PRIVATE rmm::rmm_logger_impl) - if(BUILD_PRIMS_BENCH) ConfigureBench(NAME CORE_BENCH PATH core/bitset.cu core/copy.cu main.cpp) diff --git a/cpp/cmake/thirdparty/get_spdlog.cmake b/cpp/cmake/thirdparty/get_spdlog.cmake deleted file mode 100644 index b1ffbe246f..0000000000 --- a/cpp/cmake/thirdparty/get_spdlog.cmake +++ /dev/null @@ -1,24 +0,0 @@ -# ============================================================================= -# Copyright (c) 2021-2024, NVIDIA CORPORATION. -# -# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except -# in compliance with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software distributed under the License -# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express -# or implied. See the License for the specific language governing permissions and limitations under -# the License. 
-# ============================================================================= - -# Use CPM to find or clone speedlog -function(find_and_configure_spdlog) - - include(${rapids-cmake-dir}/cpm/spdlog.cmake) - rapids_cpm_spdlog(FMT_OPTION "EXTERNAL_FMT_HO" INSTALL_EXPORT_SET raft-exports) - rapids_export_package(BUILD spdlog raft-exports) - -endfunction() - -find_and_configure_spdlog() diff --git a/cpp/include/raft/cluster/detail/kmeans_balanced.cuh b/cpp/include/raft/cluster/detail/kmeans_balanced.cuh index 5dcd679bd5..0a5a3ba5aa 100644 --- a/cpp/include/raft/cluster/detail/kmeans_balanced.cuh +++ b/cpp/include/raft/cluster/detail/kmeans_balanced.cuh @@ -20,7 +20,6 @@ #include #include #include -#include #include #include #include diff --git a/cpp/include/raft/cluster/kmeans_types.hpp b/cpp/include/raft/cluster/kmeans_types.hpp index fbedd58417..9603898a37 100644 --- a/cpp/include/raft/cluster/kmeans_types.hpp +++ b/cpp/include/raft/cluster/kmeans_types.hpp @@ -14,10 +14,11 @@ * limitations under the License. */ #pragma once -#include #include #include +#include + namespace raft::cluster { /** Base structure for parameters that are common to all k-means algorithms */ @@ -82,7 +83,7 @@ struct KMeansParams : kmeans_base_params { /** * verbosity level. */ - level_enum verbosity = level_enum::info; + rapids_logger::level_enum verbosity = rapids_logger::level_enum::info; /** * Seed to the random number generator. diff --git a/cpp/include/raft/core/logger-macros.hpp b/cpp/include/raft/core/logger-macros.hpp deleted file mode 100644 index e32440dcce..0000000000 --- a/cpp/include/raft/core/logger-macros.hpp +++ /dev/null @@ -1,31 +0,0 @@ -/* - * Copyright (c) 2022-2023, NVIDIA CORPORATION. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. 
- * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -#pragma once - -#include - -#if (RAFT_LOG_ACTIVE_LEVEL <= RAFT_LOG_LEVEL_TRACE) -#define RAFT_LOG_TRACE_VEC(ptr, len) \ - do { \ - std::stringstream ss; \ - ss << raft::detail::format("%s:%d ", __FILE__, __LINE__); \ - print_vector(#ptr, ptr, len, ss); \ - raft::logger::get(RAFT_NAME).log(RAFT_LEVEL_TRACE, ss.str().c_str()); \ - RAFT_LOGGER_CALL(raft::default_logger(), raft::level_enum::trace, __VA_ARGS__) \ - } while (0) -#else -#define RAFT_LOG_TRACE_VEC(ptr, len) void(0) -#endif diff --git a/cpp/include/raft/core/logger.hpp b/cpp/include/raft/core/logger.hpp new file mode 100644 index 0000000000..b7f838d359 --- /dev/null +++ b/cpp/include/raft/core/logger.hpp @@ -0,0 +1,78 @@ +/* + * Copyright (c) 2025, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#pragma once + +#include + +#include + +#include + +namespace raft { + +/** + * @brief Returns the default sink for the global logger. + * + * If the environment variable `RAFT_DEBUG_LOG_FILE` is defined, the default sink is a sink to that + * file. 
Otherwise, the default is to dump to stderr. + * + * @return sink_ptr The sink to use + */ +inline rapids_logger::sink_ptr default_sink() +{ + auto* filename = std::getenv("RAFT_DEBUG_LOG_FILE"); + if (filename != nullptr) { + return std::make_shared(filename, true); + } + return std::make_shared(); +} + +/** + * @brief Returns the default log pattern for the global logger. + * + * @return std::string The default log pattern. + */ +inline std::string default_pattern() { return "[%6t][%H:%M:%S:%f][%-6l] %v"; } + +/** + * @brief Get the default logger. + * + * @return logger& The default logger + */ +inline rapids_logger::logger& default_logger() +{ + static rapids_logger::logger logger_ = [] { + rapids_logger::logger logger_{"RAFT", {default_sink()}}; + logger_.set_pattern(default_pattern()); + return logger_; + }(); + return logger_; +} + +} // namespace raft + +#if (RAFT_LOG_ACTIVE_LEVEL <= RAPIDS_LOGGER_LOG_LEVEL_TRACE) +#define RAFT_LOG_TRACE_VEC(ptr, len) \ + do { \ + std::stringstream ss; \ + ss << raft::detail::format("%s:%d ", __FILE__, __LINE__); \ + print_vector(#ptr, ptr, len, ss); \ + raft::default_logger().log(RAPIDS_LOGGER_LOG_LEVEL_TRACE, ss.str().c_str()); \ + } while (0) +#else +#define RAFT_LOG_TRACE_VEC(ptr, len) void(0) +#endif diff --git a/cpp/include/raft/neighbors/detail/ivf_flat_build.cuh b/cpp/include/raft/neighbors/detail/ivf_flat_build.cuh index 0e00ef571f..55184cc615 100644 --- a/cpp/include/raft/neighbors/detail/ivf_flat_build.cuh +++ b/cpp/include/raft/neighbors/detail/ivf_flat_build.cuh @@ -17,7 +17,6 @@ #pragma once #include -#include #include #include #include diff --git a/cpp/include/raft/neighbors/detail/ivf_flat_search-inl.cuh b/cpp/include/raft/neighbors/detail/ivf_flat_search-inl.cuh index 44d55c36de..9d30ef1ed5 100644 --- a/cpp/include/raft/neighbors/detail/ivf_flat_search-inl.cuh +++ b/cpp/include/raft/neighbors/detail/ivf_flat_search-inl.cuh @@ -16,7 +16,6 @@ #pragma once -#include #include #include #include // 
raft::resources diff --git a/cpp/tests/CMakeLists.txt b/cpp/tests/CMakeLists.txt index a1e699376e..34bea67fbe 100644 --- a/cpp/tests/CMakeLists.txt +++ b/cpp/tests/CMakeLists.txt @@ -55,7 +55,6 @@ function(ConfigureTest) ${RAFT_CTK_MATH_DEPENDENCIES} $ $ - raft_test_logger ) set_target_properties( ${TEST_NAME} @@ -88,10 +87,6 @@ function(ConfigureTest) ) endfunction() -# Create an object library for the logger so that we don't have to recompile it. -add_library(raft_test_logger OBJECT) -target_link_libraries(raft_test_logger PRIVATE raft_logger_impl) - # ################################################################################################## # test sources ################################################################################## # ################################################################################################## @@ -104,7 +99,6 @@ if(BUILD_TESTS) core/bitset.cu core/device_resources_manager.cpp core/device_setter.cpp - core/logger.cpp core/math_device.cu core/math_host.cpp core/operators_device.cu diff --git a/cpp/tests/core/device_resources_manager.cpp b/cpp/tests/core/device_resources_manager.cpp index 007b57378f..dde02c64c6 100644 --- a/cpp/tests/core/device_resources_manager.cpp +++ b/cpp/tests/core/device_resources_manager.cpp @@ -89,7 +89,8 @@ TEST(DeviceResourcesManager, ObeysSetters) // Suppress the many warnings from testing use of setters after initial // get_device_resources call - auto scoped_log_level = log_level_setter{level_enum::error}; + auto scoped_log_level = + rapids_logger::log_level_setter{default_logger(), rapids_logger::level_enum::error}; omp_set_dynamic(0); #pragma omp parallel for num_threads(5) diff --git a/cpp/tests/core/logger.cpp b/cpp/tests/core/logger.cpp deleted file mode 100644 index 10adb71dda..0000000000 --- a/cpp/tests/core/logger.cpp +++ /dev/null @@ -1,118 +0,0 @@ -/* - * Copyright (c) 2020-2024, NVIDIA CORPORATION. 
- * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -// We set RAFT_LOG_ACTIVE_LEVEL to a value that would enable testing trace and debug logs -// (otherwise trace and debug logs are desabled by default). -#undef RAFT_LOG_ACTIVE_LEVEL -#define RAFT_LOG_ACTIVE_LEVEL RAFT_LOG_LEVEL_TRACE - -#include - -#include - -#include - -namespace raft { - -TEST(logger, Test) -{ - RAFT_LOG_CRITICAL("This is a critical message"); - RAFT_LOG_ERROR("This is an error message"); - RAFT_LOG_WARN("This is a warning message"); - RAFT_LOG_INFO("This is an info message"); - - default_logger().set_level(raft::level_enum::warn); - ASSERT_EQ(raft::level_enum::warn, default_logger().level()); - default_logger().set_level(raft::level_enum::info); - ASSERT_EQ(raft::level_enum::info, default_logger().level()); - - ASSERT_FALSE(default_logger().should_log(raft::level_enum::trace)); - ASSERT_FALSE(default_logger().should_log(raft::level_enum::debug)); - ASSERT_TRUE(default_logger().should_log(raft::level_enum::info)); - ASSERT_TRUE(default_logger().should_log(raft::level_enum::warn)); -} - -std::string logged = ""; -void exampleCallback(int lvl, const char* msg) { logged = std::string(msg); } - -int flushCount = 0; -void exampleFlush() { ++flushCount; } - -class loggerTest : public ::testing::Test { - protected: - void SetUp() override - { - flushCount = 0; - logged = ""; - default_logger().set_level(raft::level_enum::trace); - } - - void TearDown() override - { - 
default_logger().sinks().pop_back(); - default_logger().set_level(raft::level_enum::info); - } -}; - -// The logging macros depend on `RAFT_LOG_ACTIVE_LEVEL` as well as the logger verbosity; -// The verbosity is set to `RAFT_LOG_LEVEL_TRACE`, but `RAFT_LOG_ACTIVE_LEVEL` is set outside of -// here. -auto check_if_logged(const std::string& msg, raft::level_enum log_level_def) -> bool -{ - bool actually_logged = logged.find(msg) != std::string::npos; - bool should_be_logged = RAFT_LOG_ACTIVE_LEVEL <= static_cast(log_level_def); - return actually_logged == should_be_logged; -} - -TEST_F(loggerTest, callback) -{ - std::string testMsg; - default_logger().sinks().push_back(std::make_shared(exampleCallback)); - - testMsg = "This is a critical message"; - RAFT_LOG_CRITICAL(testMsg.c_str()); - ASSERT_TRUE(check_if_logged(testMsg, raft::level_enum::critical)); - - testMsg = "This is an error message"; - RAFT_LOG_ERROR(testMsg.c_str()); - ASSERT_TRUE(check_if_logged(testMsg, raft::level_enum::error)); - - testMsg = "This is a warning message"; - RAFT_LOG_WARN(testMsg.c_str()); - ASSERT_TRUE(check_if_logged(testMsg, raft::level_enum::warn)); - - testMsg = "This is an info message"; - RAFT_LOG_INFO(testMsg.c_str()); - ASSERT_TRUE(check_if_logged(testMsg, raft::level_enum::info)); - - testMsg = "This is a debug message"; - RAFT_LOG_DEBUG(testMsg.c_str()); - ASSERT_TRUE(check_if_logged(testMsg, raft::level_enum::debug)); - - testMsg = "This is a trace message"; - RAFT_LOG_TRACE(testMsg.c_str()); - ASSERT_TRUE(check_if_logged(testMsg, raft::level_enum::trace)); -} - -TEST_F(loggerTest, flush) -{ - default_logger().sinks().push_back( - std::make_shared(exampleCallback, exampleFlush)); - default_logger().flush(); - ASSERT_EQ(1, flushCount); -} - -} // namespace raft diff --git a/dependencies.yaml b/dependencies.yaml index 71a69ecfae..225103391f 100644 --- a/dependencies.yaml +++ b/dependencies.yaml @@ -15,6 +15,7 @@ files: - depends_on_cupy - depends_on_distributed_ucxx - 
depends_on_rmm + - depends_on_rapids_logger - develop - docs - rapids_build_skbuild @@ -65,6 +66,7 @@ files: includes: - build_common - depends_on_librmm + - depends_on_rapids_logger py_run_libraft: output: pyproject pyproject_dir: python/libraft @@ -72,6 +74,8 @@ files: table: project includes: - cuda_wheels + - depends_on_librmm + - depends_on_rapids_logger py_build_pylibraft: output: pyproject pyproject_dir: python/pylibraft @@ -178,7 +182,6 @@ dependencies: - cxx-compiler - libucxx==0.43.*,>=0.0.0a0 - nccl>=2.19 - - spdlog>=1.14.1,<1.15 specific: - output_types: conda matrices: @@ -400,6 +403,15 @@ dependencies: - cupy-cuda11x>=12.0.0 - {matrix: null, packages: [cupy-cuda11x>=12.0.0]} + depends_on_rapids_logger: + common: + - output_types: [conda, requirements, pyproject] + packages: + - rapids-logger==0.1.*,>=0.0.0a0 + - output_types: requirements + packages: + # pip recognizes the index as a global option for the requirements.txt file + - --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple test_libraft: common: - output_types: [conda] diff --git a/python/libraft/libraft/load.py b/python/libraft/libraft/load.py index ad3db9e09c..8e6bc2ef6d 100644 --- a/python/libraft/libraft/load.py +++ b/python/libraft/libraft/load.py @@ -45,6 +45,15 @@ def _load_wheel_installation(soname: str): def load_library(): """Dynamically load libraft.so and its dependencies""" + try: + import librmm + import rapids_logger + + librmm.load_library() + rapids_logger.load_library() + except ModuleNotFoundError: + pass + prefer_system_installation = ( os.getenv("RAPIDS_LIBRAFT_PREFER_SYSTEM_LIBRARY", "false").lower() != "false" diff --git a/python/libraft/pyproject.toml b/python/libraft/pyproject.toml index 846c6c328b..8ef419282c 100644 --- a/python/libraft/pyproject.toml +++ b/python/libraft/pyproject.toml @@ -31,10 +31,12 @@ authors = [ license = { text = "Apache 2.0" } requires-python = ">=3.10" dependencies = [ + "librmm==25.4.*,>=0.0.0a0", "nvidia-cublas", 
"nvidia-curand", "nvidia-cusolver", "nvidia-cusparse", + "rapids-logger==0.1.*,>=0.0.0a0", ] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`. classifiers = [ "Intended Audience :: Developers", @@ -104,6 +106,7 @@ requires = [ "cmake>=3.26.4,!=3.30.0", "librmm==25.4.*,>=0.0.0a0", "ninja", + "rapids-logger==0.1.*,>=0.0.0a0", ] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`. dependencies-file = "../../dependencies.yaml" matrix-entry = "cuda_suffixed=true;use_cuda_wheels=true" From 9f85570f9239642985d5b082cda502a49fb34927 Mon Sep 17 00:00:00 2001 From: Ben Frederickson Date: Tue, 11 Feb 2025 15:19:57 -0800 Subject: [PATCH 08/11] `#include <numeric>` for `std::iota` (#2578) I'm seeing CI failures in cuvs over ``` /home/coder/.conda/envs/rapids/include/raft/comms/nccl_clique.hpp(68): error: namespace "std" has no member "iota" std::iota(device_ids_.begin(), device_ids_.end(), 0); ``` This seems to be because we're using `std::iota` without including the header. Fix. Authors: - Ben Frederickson (https://github.com/benfred) Approvers: - Bradley Dice (https://github.com/bdice) - William Hicks (https://github.com/wphicks) URL: https://github.com/rapidsai/raft/pull/2578 --- cpp/bench/prims/core/copy.cu | 2 +- cpp/include/raft/comms/nccl_clique.hpp | 2 ++ cpp/include/raft/neighbors/detail/nn_descent.cuh | 1 + 3 files changed, 4 insertions(+), 1 deletion(-) diff --git a/cpp/bench/prims/core/copy.cu index 4898f560cb..804ebd6c02 100644 --- a/cpp/bench/prims/core/copy.cu +++ b/cpp/bench/prims/core/copy.cu @@ -13,7 +13,6 @@ * See the License for the specific language governing permissions and * limitations under the License.
*/ - #include #include @@ -26,6 +25,7 @@ #include #include +#include <numeric> namespace raft::bench::core { diff --git a/cpp/include/raft/comms/nccl_clique.hpp b/cpp/include/raft/comms/nccl_clique.hpp index c6520af753..08499c9c02 100644 --- a/cpp/include/raft/comms/nccl_clique.hpp +++ b/cpp/include/raft/comms/nccl_clique.hpp @@ -21,6 +21,8 @@ #include +#include <numeric> + /** * @brief Error checking macro for NCCL runtime API functions. * diff --git a/cpp/include/raft/neighbors/detail/nn_descent.cuh b/cpp/include/raft/neighbors/detail/nn_descent.cuh index 64e4a3ea7a..3568bce7f4 100644 --- a/cpp/include/raft/neighbors/detail/nn_descent.cuh +++ b/cpp/include/raft/neighbors/detail/nn_descent.cuh @@ -50,6 +50,7 @@ #include #include +#include <numeric> #include #include #include From 934586010f9114f595e37dade0639783701d25b9 Mon Sep 17 00:00:00 2001 From: Victor Lafargue Date: Wed, 12 Feb 2025 01:46:48 +0100 Subject: [PATCH 09/11] Allow some of the sparse utility functions to handle larger matrices (#2541) Answers https://github.com/rapidsai/cuml/issues/6204 Authors: - Victor Lafargue (https://github.com/viclafargue) - Corey J. Nolet (https://github.com/cjnolet) - Dante Gama Dessavre (https://github.com/dantegd) - Divye Gala (https://github.com/divyegala) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - Corey J.
Nolet (https://github.com/cjnolet)

URL: https://github.com/rapidsai/raft/pull/2541
---
 .../raft/cluster/detail/connectivities.cuh    |  20 +--
 cpp/include/raft/cluster/single_linkage.cuh   |   8 +-
 cpp/include/raft/linalg/detail/lanczos.cuh    |  19 +-
 cpp/include/raft/sparse/convert/coo.cuh       |   6 +-
 cpp/include/raft/sparse/convert/csr.cuh       |   8 +-
 .../raft/sparse/convert/detail/coo.cuh        |   4 +-
 .../raft/sparse/convert/detail/csr.cuh        |  14 +-
 cpp/include/raft/sparse/coo.hpp               |   4 +-
 cpp/include/raft/sparse/detail/coo.cuh        |  29 ++--
 cpp/include/raft/sparse/detail/utils.h        |   6 +-
 cpp/include/raft/sparse/linalg/degree.cuh     |  16 +-
 .../raft/sparse/linalg/detail/degree.cuh      |  36 ++--
 .../raft/sparse/linalg/detail/norm.cuh        |  24 +--
 .../raft/sparse/linalg/detail/spectral.cuh    |  25 ++-
 .../raft/sparse/linalg/detail/symmetrize.cuh  |  48 +++---
 cpp/include/raft/sparse/linalg/norm.cuh       |   8 +-
 cpp/include/raft/sparse/linalg/spectral.cuh   |   6 +-
 cpp/include/raft/sparse/linalg/symmetrize.cuh |  10 +-
 .../neighbors/detail/cross_component_nn.cuh   |  14 +-
 .../sparse/neighbors/detail/knn_graph.cuh     |  12 +-
 .../raft/sparse/neighbors/knn_graph.cuh       |   8 +-
 cpp/include/raft/sparse/op/detail/filter.cuh  | 120 +++++++------
 cpp/include/raft/sparse/op/detail/reduce.cuh  |  24 +--
 cpp/include/raft/sparse/op/detail/sort.h      |  15 +-
 cpp/include/raft/sparse/op/filter.cuh         |  35 ++--
 cpp/include/raft/sparse/op/reduce.cuh         |  10 +-
 cpp/include/raft/sparse/op/sort.cuh           |  15 +-
 .../raft/sparse/solver/detail/lanczos.cuh     | 163 +++++++++---------
 cpp/include/raft/sparse/solver/lanczos.cuh    |   8 +-
 .../raft/spectral/detail/matrix_wrappers.hpp  |  98 +++++------
 .../raft/spectral/detail/partition.hpp        |  16 +-
 .../raft/spectral/detail/spectral_util.cuh    |  21 +--
 cpp/include/raft/spectral/eigen_solvers.cuh   |   4 +-
 cpp/include/raft/spectral/partition.cuh       |  16 +-
 cpp/tests/linalg/eigen_solvers.cu             |  10 +-
 cpp/tests/sparse/reduce.cu                    |  12 +-
 cpp/tests/sparse/solver/lanczos.cu            |  12 +-
 cpp/tests/sparse/spectral_matrix.cu           |  19 +-
 cpp/tests/sparse/symmetrize.cu                |  20 +--
 39 files changed, 496 insertions(+), 447 deletions(-)

diff --git a/cpp/include/raft/cluster/detail/connectivities.cuh b/cpp/include/raft/cluster/detail/connectivities.cuh
index c527b754c3..86bae07711 100644
--- a/cpp/include/raft/cluster/detail/connectivities.cuh
+++ b/cpp/include/raft/cluster/detail/connectivities.cuh
@@ -43,8 +43,8 @@ template & indptr, rmm::device_uvector& indices,
@@ -61,8 +61,8 @@ template struct distance_graph_impl {
   void run(raft::resources const& handle,
            const value_t* X,
-           size_t m,
-           size_t n,
+           value_idx m,
+           value_idx n,
            raft::distance::DistanceType metric,
            rmm::device_uvector& indptr,
            rmm::device_uvector& indices,
@@ -130,8 +130,8 @@ RAFT_KERNEL fill_indices2(value_idx* indices, size_t m, size_t nnz)
 template
 void pairwise_distances(const raft::resources& handle,
                         const value_t* X,
-                        size_t m,
-                        size_t n,
+                        value_idx m,
+                        value_idx n,
                         raft::distance::DistanceType metric,
                         value_idx* indptr,
                         value_idx* indices,
@@ -178,8 +178,8 @@ template struct distance_graph_impl {
   void run(const raft::resources& handle,
            const value_t* X,
-           size_t m,
-           size_t n,
+           value_idx m,
+           value_idx n,
            raft::distance::DistanceType metric,
            rmm::device_uvector& indptr,
            rmm::device_uvector& indices,
@@ -216,8 +216,8 @@ struct distance_graph_impl
 void get_distance_graph(raft::resources const& handle,
                         const value_t* X,
-                        size_t m,
-                        size_t n,
+                        value_idx m,
+                        value_idx n,
                         raft::distance::DistanceType metric,
                         rmm::device_uvector& indptr,
                         rmm::device_uvector& indices,
diff --git a/cpp/include/raft/cluster/single_linkage.cuh b/cpp/include/raft/cluster/single_linkage.cuh
index 067445c542..de56386b96 100644
--- a/cpp/include/raft/cluster/single_linkage.cuh
+++ b/cpp/include/raft/cluster/single_linkage.cuh
@@ -52,8 +52,8 @@ template
 [[deprecated("Use cuVS instead")]] void single_linkage(raft::resources const& handle,
                                                        const value_t* X,
-                                                       size_t m,
-                                                       size_t n,
+                                                       value_idx m,
+                                                       value_idx n,
                                                        raft::distance::DistanceType metric,
                                                        linkage_output* out,
                                                        int c,
@@ -103,8 +103,8 @@ template (
     handle,
     X.data_handle(),
-    static_cast(X.extent(0)),
-    static_cast(X.extent(1)),
+    X.extent(0),
+    X.extent(1),
     metric,
     &out_arrs,
     c.has_value() ? c.value() : DEFAULT_CONST_C,
diff --git a/cpp/include/raft/linalg/detail/lanczos.cuh b/cpp/include/raft/linalg/detail/lanczos.cuh
index 134ef3ef36..06c3cb1357 100644
--- a/cpp/include/raft/linalg/detail/lanczos.cuh
+++ b/cpp/include/raft/linalg/detail/lanczos.cuh
@@ -745,10 +745,10 @@ static int lanczosRestart(raft::resources const& handle,
  * @param seed random seed.
  * @return error flag.
  */
-template
+template
 int computeSmallestEigenvectors(
   raft::resources const& handle,
-  spectral::matrix::sparse_matrix_t const* A,
+  spectral::matrix::sparse_matrix_t const* A,
   index_type_t nEigVecs,
   index_type_t maxIter,
   index_type_t restartIter,
@@ -986,10 +986,10 @@ int computeSmallestEigenvectors(
   return 0;
 }
 
-template
+template
 int computeSmallestEigenvectors(
   raft::resources const& handle,
-  spectral::matrix::sparse_matrix_t const& A,
+  spectral::matrix::sparse_matrix_t const& A,
   index_type_t nEigVecs,
   index_type_t maxIter,
   index_type_t restartIter,
@@ -1004,7 +1004,8 @@ int computeSmallestEigenvectors(
   index_type_t n = A.nrows_;
 
   // Check that parameters are valid
-  RAFT_EXPECTS(nEigVecs > 0 && nEigVecs <= n, "Invalid number of eigenvectors.");
+  RAFT_EXPECTS(nEigVecs > 0 && static_cast(nEigVecs) <= n,
+               "Invalid number of eigenvectors.");
   RAFT_EXPECTS(restartIter > 0, "Invalid restartIter.");
   RAFT_EXPECTS(tol > 0, "Invalid tolerance.");
   RAFT_EXPECTS(maxIter >= nEigVecs, "Invalid maxIter.");
@@ -1089,10 +1090,10 @@ int computeSmallestEigenvectors(
  * @param seed random seed.
  * @return error flag.
 */
-template
+template
 int computeLargestEigenvectors(
   raft::resources const& handle,
-  spectral::matrix::sparse_matrix_t const* A,
+  spectral::matrix::sparse_matrix_t const* A,
   index_type_t nEigVecs,
   index_type_t maxIter,
   index_type_t restartIter,
@@ -1333,10 +1334,10 @@ int computeLargestEigenvectors(
   return 0;
 }
 
-template
+template
 int computeLargestEigenvectors(
   raft::resources const& handle,
-  spectral::matrix::sparse_matrix_t const& A,
+  spectral::matrix::sparse_matrix_t const& A,
   index_type_t nEigVecs,
   index_type_t maxIter,
   index_type_t restartIter,
diff --git a/cpp/include/raft/sparse/convert/coo.cuh b/cpp/include/raft/sparse/convert/coo.cuh
index ba3efc7ff0..69f7864602 100644
--- a/cpp/include/raft/sparse/convert/coo.cuh
+++ b/cpp/include/raft/sparse/convert/coo.cuh
@@ -32,11 +32,11 @@ namespace convert {
  * @param nnz: size of output COO row array
  * @param stream: cuda stream to use
  */
-template
+template
 void csr_to_coo(
-  const value_idx* row_ind, value_idx m, value_idx* coo_rows, value_idx nnz, cudaStream_t stream)
+  const value_idx* row_ind, value_idx m, value_idx* coo_rows, nnz_t nnz, cudaStream_t stream)
 {
-  detail::csr_to_coo(row_ind, m, coo_rows, nnz, stream);
+  detail::csr_to_coo(row_ind, m, coo_rows, nnz, stream);
 }
 
 };  // end NAMESPACE convert
diff --git a/cpp/include/raft/sparse/convert/csr.cuh b/cpp/include/raft/sparse/convert/csr.cuh
index 73d099a719..62137d04b7 100644
--- a/cpp/include/raft/sparse/convert/csr.cuh
+++ b/cpp/include/raft/sparse/convert/csr.cuh
@@ -54,8 +54,8 @@ void coo_to_csr(raft::resources const& handle,
  * @param m: number of rows in dense matrix
  * @param stream: cuda stream to use
  */
-template
-void sorted_coo_to_csr(const T* rows, int nnz, T* row_ind, int m, cudaStream_t stream)
+template
+void sorted_coo_to_csr(const T* rows, nnz_type nnz, outT* row_ind, int m, cudaStream_t stream)
 {
   detail::sorted_coo_to_csr(rows, nnz, row_ind, m, stream);
 }
@@ -67,8 +67,8 @@ void sorted_coo_to_csr(const T* rows, int nnz, T* row_ind, int m, cudaStream_t s
  * @param row_ind: output row indices array
  * @param stream: cuda stream to use
  */
-template
-void sorted_coo_to_csr(COO* coo, int* row_ind, cudaStream_t stream)
+template
+void sorted_coo_to_csr(COO* coo, outT* row_ind, cudaStream_t stream)
 {
   detail::sorted_coo_to_csr(coo->rows(), coo->nnz, row_ind, coo->n_rows, stream);
 }
diff --git a/cpp/include/raft/sparse/convert/detail/coo.cuh b/cpp/include/raft/sparse/convert/detail/coo.cuh
index 469dac3c86..009762b4df 100644
--- a/cpp/include/raft/sparse/convert/detail/coo.cuh
+++ b/cpp/include/raft/sparse/convert/detail/coo.cuh
@@ -60,9 +60,9 @@ RAFT_KERNEL csr_to_coo_kernel(const value_idx* row_ind,
  * @param nnz: size of output COO row array
  * @param stream: cuda stream to use
  */
-template
+template
 void csr_to_coo(
-  const value_idx* row_ind, value_idx m, value_idx* coo_rows, value_idx nnz, cudaStream_t stream)
+  const value_idx* row_ind, value_idx m, value_idx* coo_rows, nnz_t nnz, cudaStream_t stream)
 {
   // @TODO: Use cusparse for this.
   dim3 grid(raft::ceildiv(m, (value_idx)TPB_X), 1, 1);
diff --git a/cpp/include/raft/sparse/convert/detail/csr.cuh b/cpp/include/raft/sparse/convert/detail/csr.cuh
index a5d7de9a07..64ed1bbeea 100644
--- a/cpp/include/raft/sparse/convert/detail/csr.cuh
+++ b/cpp/include/raft/sparse/convert/detail/csr.cuh
@@ -84,18 +84,18 @@ void coo_to_csr(raft::resources const& handle,
  * @param m: number of rows in dense matrix
  * @param stream: cuda stream to use
  */
-template
-void sorted_coo_to_csr(const T* rows, int nnz, T* row_ind, int m, cudaStream_t stream)
+template
+void sorted_coo_to_csr(const T* rows, nnz_t nnz, outT* row_ind, int m, cudaStream_t stream)
 {
-  rmm::device_uvector row_counts(m, stream);
-
-  RAFT_CUDA_TRY(cudaMemsetAsync(row_counts.data(), 0, m * sizeof(T), stream));
+  rmm::device_uvector row_counts(m, stream);
+  RAFT_CUDA_TRY(
+    cudaMemsetAsync(row_counts.data(), 0, static_cast(m) * sizeof(outT), stream));
 
   linalg::coo_degree(rows, nnz, row_counts.data(), stream);
 
   // create csr compressed row index from row counts
-  thrust::device_ptr row_counts_d = thrust::device_pointer_cast(row_counts.data());
-  thrust::device_ptr c_ind_d      = thrust::device_pointer_cast(row_ind);
+  thrust::device_ptr row_counts_d = thrust::device_pointer_cast(row_counts.data());
+  thrust::device_ptr c_ind_d      = thrust::device_pointer_cast(row_ind);
 
   exclusive_scan(rmm::exec_policy(stream), row_counts_d, row_counts_d + m, c_ind_d);
 }
diff --git a/cpp/include/raft/sparse/coo.hpp b/cpp/include/raft/sparse/coo.hpp
index a176fefc3e..1c61117dab 100644
--- a/cpp/include/raft/sparse/coo.hpp
+++ b/cpp/include/raft/sparse/coo.hpp
@@ -39,8 +39,8 @@ namespace sparse {
  * @tparam value_idx: the type of index array
  *
 */
-template
-using COO = detail::COO;
+template
+using COO = detail::COO;
 
 };  // namespace sparse
 };  // namespace raft
diff --git a/cpp/include/raft/sparse/detail/coo.cuh b/cpp/include/raft/sparse/detail/coo.cuh
index 9a38c11a07..5e1514d228 100644
--- a/cpp/include/raft/sparse/detail/coo.cuh
+++ b/cpp/include/raft/sparse/detail/coo.cuh
@@ -44,7 +44,7 @@ namespace detail {
  * @tparam Index_Type: the type of index array
  *
 */
-template
+template
 class COO {
  protected:
  rmm::device_uvector rows_arr;
@@ -52,7 +52,11 @@ class COO {
  rmm::device_uvector vals_arr;
 
 public:
-  Index_Type nnz;
+  using value_t = T;
+  using index_t = Index_Type;
+  using nnz_t   = nnz_type;
+
+  nnz_type nnz;
  Index_Type n_rows;
  Index_Type n_cols;
@@ -75,7 +79,7 @@ class COO {
   COO(rmm::device_uvector& rows,
       rmm::device_uvector& cols,
       rmm::device_uvector& vals,
-      Index_Type nnz,
+      nnz_type nnz,
       Index_Type n_rows = 0,
       Index_Type n_cols = 0)
     : rows_arr(rows), cols_arr(cols), vals_arr(vals), nnz(nnz), n_rows(n_rows), n_cols(n_cols)
@@ -90,7 +94,7 @@ class COO {
    * @param init: initialize arrays with zeros
   */
   COO(cudaStream_t stream,
-      Index_Type nnz,
+      nnz_type nnz,
       Index_Type n_rows = 0,
       Index_Type n_cols = 0,
       bool init = true)
@@ -121,7 +125,7 @@ class COO {
   */
   bool validate_size() const
   {
-    if (this->nnz < 0 || n_rows < 0 || n_cols < 0) return false;
+    if (this->nnz <= 0 || n_rows <= 0 || n_cols <= 0) return false;
     return true;
   }
@@ -156,7 +160,7 @@ class COO {
   /**
    * @brief Send human-readable state information to output stream
   */
-  friend std::ostream& operator<<(std::ostream& out, const COO& c)
+  friend std::ostream& operator<<(std::ostream& out, const COO& c)
   {
     if (c.validate_size() && c.validate_mem()) {
       cudaStream_t stream;
@@ -204,7 +208,7 @@ class COO {
    * @param init: should values be initialized to 0?
    * @param stream: CUDA stream to use
   */
-  void allocate(Index_Type nnz, bool init, cudaStream_t stream)
+  void allocate(nnz_type nnz, bool init, cudaStream_t stream)
   {
     this->allocate(nnz, 0, init, stream);
   }
@@ -216,7 +220,7 @@ class COO {
    * @param init: should values be initialized to 0?
    * @param stream: CUDA stream to use
   */
-  void allocate(Index_Type nnz, Index_Type size, bool init, cudaStream_t stream)
+  void allocate(nnz_type nnz, Index_Type size, bool init, cudaStream_t stream)
   {
     this->allocate(nnz, size, size, init, stream);
   }
@@ -229,16 +233,15 @@ class COO {
    * @param init: should values be initialized to 0?
    * @param stream: stream to use for init
   */
-  void allocate(
-    Index_Type nnz, Index_Type n_rows, Index_Type n_cols, bool init, cudaStream_t stream)
+  void allocate(nnz_type nnz, Index_Type n_rows, Index_Type n_cols, bool init, cudaStream_t stream)
   {
     this->n_rows = n_rows;
     this->n_cols = n_cols;
     this->nnz    = nnz;
-    this->rows_arr.resize(this->nnz, stream);
-    this->cols_arr.resize(this->nnz, stream);
-    this->vals_arr.resize(this->nnz, stream);
+    this->rows_arr.resize(nnz, stream);
+    this->cols_arr.resize(nnz, stream);
+    this->vals_arr.resize(nnz, stream);
 
     if (init) init_arrays(stream);
   }
diff --git a/cpp/include/raft/sparse/detail/utils.h b/cpp/include/raft/sparse/detail/utils.h
index 3eed74f3b4..16db863a2d 100644
--- a/cpp/include/raft/sparse/detail/utils.h
+++ b/cpp/include/raft/sparse/detail/utils.h
@@ -103,10 +103,10 @@ void iota_fill(value_idx* indices, value_idx nrows, value_idx ncols, cudaStream_
   iota_fill_block_kernel<<>>(indices, ncols);
 }
 
-template
-__device__ int get_stop_idx(T row, T m, T nnz, const T* ind)
+template
+__device__ indT get_stop_idx(T row, T m, indT nnz, const indT* ind)
 {
-  int stop_idx = 0;
+  indT stop_idx = 0;
   if (row < (m - 1))
     stop_idx = ind[row + 1];
   else
diff --git a/cpp/include/raft/sparse/linalg/degree.cuh b/cpp/include/raft/sparse/linalg/degree.cuh
index 8ac97259da..dde811ee2d 100644
--- a/cpp/include/raft/sparse/linalg/degree.cuh
+++ b/cpp/include/raft/sparse/linalg/degree.cuh
@@ -33,8 +33,8 @@ namespace linalg {
  * @param results: output result array
  * @param stream: cuda stream to use
 */
-template
-void coo_degree(const T* rows, int nnz, T* results, cudaStream_t stream)
+template
+void coo_degree(const T* rows, nnz_type nnz, outT* results, cudaStream_t stream)
 {
   detail::coo_degree<64, T>(rows, nnz, results, stream);
 }
@@ -47,8 +47,8 @@ void coo_degree(const T* rows, int nnz, T* results, cudaStream_t stream)
  * @param results: output array with row counts (size=in->n_rows)
  * @param stream: cuda stream to use
 */
-template
-void coo_degree(COO* in, int* results, cudaStream_t stream)
+template
+void coo_degree(COO* in, outT* results, cudaStream_t stream)
 {
   coo_degree(in->rows(), in->nnz, results, stream);
 }
@@ -64,9 +64,9 @@ void coo_degree(COO* in, int* results, cudaStream_t stream)
  * @param results: output row counts
  * @param stream: cuda stream to use
 */
-template
+template
 void coo_degree_scalar(
-  const int* rows, const T* vals, int nnz, T scalar, int* results, cudaStream_t stream = 0)
+  const int* rows, const T* vals, nnz_type nnz, T scalar, outT* results, cudaStream_t stream = 0)
 {
   detail::coo_degree_scalar<64>(rows, vals, nnz, scalar, results, stream);
 }
@@ -80,8 +80,8 @@ void coo_degree_scalar(
  * @param results: output row counts
  * @param stream: cuda stream to use
 */
-template
-void coo_degree_scalar(COO* in, T scalar, int* results, cudaStream_t stream)
+template
+void coo_degree_scalar(COO* in, T scalar, outT* results, cudaStream_t stream)
 {
   coo_degree_scalar(in->rows(), in->vals(), in->nnz, scalar, results, stream);
 }
diff --git a/cpp/include/raft/sparse/linalg/detail/degree.cuh b/cpp/include/raft/sparse/linalg/detail/degree.cuh
index df31192cf7..d51188c54c 100644
--- a/cpp/include/raft/sparse/linalg/detail/degree.cuh
+++ b/cpp/include/raft/sparse/linalg/detail/degree.cuh
@@ -39,11 +39,11 @@ namespace detail {
  * @param nnz the size of the rows array
  * @param results array to place results
 */
-template
-RAFT_KERNEL coo_degree_kernel(const T* rows, int nnz, T* results)
+template
+RAFT_KERNEL coo_degree_kernel(const T* rows, nnz_t nnz, outT* results)
 {
-  int row = (blockIdx.x * TPB_X) + threadIdx.x;
-  if (row < nnz) { atomicAdd(results + rows[row], (T)1); }
+  nnz_t row = (blockIdx.x * static_cast(TPB_X)) + threadIdx.x;
+  if (row < nnz) { atomicAdd(results + rows[row], (outT)1); }
 }
 
 /**
@@ -54,29 +54,29 @@ RAFT_KERNEL coo_degree_kernel(const T* rows, int nnz, T* results)
  * @param results: output result array
  * @param stream: cuda stream to use
 */
-template
-void coo_degree(const T* rows, int nnz, T* results, cudaStream_t stream)
+template
+void coo_degree(const T* rows, nnz_t nnz, outT* results, cudaStream_t stream)
 {
-  dim3 grid_rc(raft::ceildiv(nnz, TPB_X), 1, 1);
+  dim3 grid_rc(raft::ceildiv((nnz_t)nnz, (nnz_t)TPB_X), 1, 1);
   dim3 blk_rc(TPB_X, 1, 1);
 
   coo_degree_kernel<<>>(rows, nnz, results);
   RAFT_CUDA_TRY(cudaGetLastError());
 }
 
-template
-RAFT_KERNEL coo_degree_nz_kernel(const int* rows, const T* vals, int nnz, int* results)
+template
+RAFT_KERNEL coo_degree_nz_kernel(const int* rows, const T* vals, nnz_t nnz, int* results)
 {
   int row = (blockIdx.x * TPB_X) + threadIdx.x;
   if (row < nnz && vals[row] != 0.0) { raft::myAtomicAdd(results + rows[row], 1); }
 }
 
-template
+template
 RAFT_KERNEL coo_degree_scalar_kernel(
-  const int* rows, const T* vals, int nnz, T scalar, int* results)
+  const int* rows, const T* vals, nnz_t nnz, T scalar, outT* results)
 {
-  int row = (blockIdx.x * TPB_X) + threadIdx.x;
-  if (row < nnz && vals[row] != scalar) { raft::myAtomicAdd(results + rows[row], 1); }
+  nnz_t row = (blockIdx.x * static_cast(TPB_X)) + threadIdx.x;
+  if (row < nnz && vals[row] != scalar) { raft::myAtomicAdd((outT*)results + rows[row], (outT)1); }
 }
 
 /**
@@ -90,11 +90,11 @@ RAFT_KERNEL coo_degree_scalar_kernel(
  * @param results: output row counts
  * @param stream: cuda stream to use
 */
-template
+template
 void coo_degree_scalar(
-  const int* rows, const T* vals, int nnz, T scalar, int* results, cudaStream_t stream = 0)
+  const int* rows, const T* vals, nnz_t nnz, T scalar, outT* results, cudaStream_t stream = 0)
 {
-  dim3 grid_rc(raft::ceildiv(nnz, TPB_X), 1, 1);
+  dim3 grid_rc(raft::ceildiv(nnz, static_cast(TPB_X)), 1, 1);
   dim3 blk_rc(TPB_X, 1, 1);
 
   coo_degree_scalar_kernel
    <<>>(rows, vals, nnz, scalar, results);
@@ -110,8 +110,8 @@ void coo_degree_scalar(
  * @param results: output row counts
  * @param stream: cuda stream to use
 */
-template
-void coo_degree_nz(const int* rows, const T* vals, int nnz, int* results, cudaStream_t stream)
+template
+void coo_degree_nz(const int* rows, const T* vals, nnz_t nnz, int* results, cudaStream_t stream)
 {
   dim3 grid_rc(raft::ceildiv(nnz, TPB_X), 1, 1);
   dim3 blk_rc(TPB_X, 1, 1);
diff --git a/cpp/include/raft/sparse/linalg/detail/norm.cuh b/cpp/include/raft/sparse/linalg/detail/norm.cuh
index 2619048388..0390fb5f69 100644
--- a/cpp/include/raft/sparse/linalg/detail/norm.cuh
+++ b/cpp/include/raft/sparse/linalg/detail/norm.cuh
@@ -40,15 +40,15 @@ namespace sparse {
 namespace linalg {
 namespace detail {
 
-template
+template
 RAFT_KERNEL csr_row_normalize_l1_kernel(
   // @TODO: This can be done much more parallel by
   // having threads in a warp compute the sum in parallel
   // over each row and then divide the values in parallel.
-  const int* ia,   // csr row ex_scan (sorted by row)
+  const indT* ia,  // csr row ex_scan (sorted by row)
   const T* vals,
-  int nnz,   // array of values and number of non-zeros
-  int m,     // num rows in csr
+  indT nnz,  // array of values and number of non-zeros
+  int m,     // num rows in csr
   T* result)
 {  // output array
@@ -57,19 +57,19 @@ RAFT_KERNEL csr_row_normalize_l1_kernel(
   // sum all vals_arr for row and divide each val by sum
   if (row < m) {
-    int start_idx = ia[row];
-    int stop_idx  = 0;
+    indT start_idx = ia[row];
+    indT stop_idx  = 0;
     if (row < m - 1) {
       stop_idx = ia[row + 1];
     } else
       stop_idx = nnz;
 
     T sum = T(0.0);
-    for (int j = start_idx; j < stop_idx; j++) {
+    for (indT j = start_idx; j < stop_idx; j++) {
       sum = sum + fabs(vals[j]);
     }
 
-    for (int j = start_idx; j < stop_idx; j++) {
+    for (indT j = start_idx; j < stop_idx; j++) {
       if (sum != 0.0) {
         T val     = vals[j];
         result[j] = val / sum;
@@ -90,11 +90,11 @@ RAFT_KERNEL csr_row_normalize_l1_kernel(
  * @param result: l1 normalized data array
  * @param stream: cuda stream to use
 */
-template
-void csr_row_normalize_l1(const int* ia,   // csr row ex_scan (sorted by row)
+template
+void csr_row_normalize_l1(const indT* ia,  // csr row ex_scan (sorted by row)
                           const T* vals,
-                          int nnz,   // array of values and number of non-zeros
-                          int m,     // num rows in csr
+                          indT nnz,  // array of values and number of non-zeros
+                          int m,     // num rows in csr
                           T* result,
                           cudaStream_t stream)
 {  // output array
diff --git a/cpp/include/raft/sparse/linalg/detail/spectral.cuh b/cpp/include/raft/sparse/linalg/detail/spectral.cuh
index a1642d1455..7b8cb545cf 100644
--- a/cpp/include/raft/sparse/linalg/detail/spectral.cuh
+++ b/cpp/include/raft/sparse/linalg/detail/spectral.cuh
@@ -30,13 +30,13 @@ namespace sparse {
 namespace spectral {
 namespace detail {
 
-template
+template
 void fit_embedding(raft::resources const& handle,
                    int* rows,
                    int* cols,
                    T* vals,
-                   int nnz,
-                   int n,
+                   nnz_t nnz,
+                   IndT n,
                    int n_components,
                    T* out,
                    unsigned long long seed = 1234567)
@@ -45,8 +45,15 @@ void fit_embedding(raft::resources const& handle,
   rmm::device_uvector src_offsets(n + 1, stream);
   rmm::device_uvector dst_cols(nnz, stream);
   rmm::device_uvector dst_vals(nnz, stream);
-  convert::coo_to_csr(
-    handle, rows, cols, vals, nnz, n, src_offsets.data(), dst_cols.data(), dst_vals.data());
+  convert::coo_to_csr(handle,
+                      rows,
+                      cols,
+                      vals,
+                      static_cast(nnz),
+                      static_cast(n),
+                      src_offsets.data(),
+                      dst_cols.data(),
+                      dst_vals.data());
 
   rmm::device_uvector eigVals(n_components + 1, stream);
   rmm::device_uvector eigVecs(n * (n_components + 1), stream);
@@ -64,20 +71,20 @@ void fit_embedding(raft::resources const& handle,
   index_type* ci = dst_cols.data();
   value_type* vs = dst_vals.data();
 
-  raft::spectral::matrix::sparse_matrix_t const r_csr_m{
-    handle, ro, ci, vs, n, nnz};
+  raft::spectral::matrix::sparse_matrix_t const r_csr_m{
+    handle, ro, ci, vs, static_cast(n), nnz};
 
   index_type neigvs       = n_components + 1;
   index_type maxiter      = 4000;  // default reset value (when set to 0);
   value_type tol          = 0.01;
   index_type restart_iter = 15 + neigvs;  // what cugraph is using
-  raft::spectral::eigen_solver_config_t cfg{
+  raft::spectral::eigen_solver_config_t cfg{
     neigvs, maxiter, restart_iter, tol};
 
   cfg.seed = seed;
 
-  raft::spectral::lanczos_solver_t eig_solver{cfg};
+  raft::spectral::lanczos_solver_t eig_solver{cfg};
 
   // cluster computation here is irrelevant,
   // hence define a no-op such solver to
diff --git a/cpp/include/raft/sparse/linalg/detail/symmetrize.cuh b/cpp/include/raft/sparse/linalg/detail/symmetrize.cuh
index d343bcbf66..b248de855d 100644
--- a/cpp/include/raft/sparse/linalg/detail/symmetrize.cuh
+++ b/cpp/include/raft/sparse/linalg/detail/symmetrize.cuh
@@ -47,8 +47,8 @@ namespace detail {
 
 // TODO: value_idx param needs to be used for this once FAISS is updated to use float32
 // for indices so that the index types can be uniform
-template
-RAFT_KERNEL coo_symmetrize_kernel(int* row_ind,
+template
+RAFT_KERNEL coo_symmetrize_kernel(nnz_t* row_ind,
                                   int* rows,
                                   int* cols,
                                   T* vals,
@@ -56,31 +56,31 @@ RAFT_KERNEL coo_symmetrize_kernel(int* row_ind,
                                   int* ocols,
                                   T* ovals,
                                   int n,
-                                  int cnnz,
+                                  nnz_t cnnz,
                                   Lambda reduction_op)
 {
   int row = (blockIdx.x * TPB_X) + threadIdx.x;
 
   if (row < n) {
-    int start_idx = row_ind[row];  // each thread processes one row
-    int stop_idx  = get_stop_idx(row, n, cnnz, row_ind);
+    nnz_t start_idx = row_ind[row];  // each thread processes one row
+    nnz_t stop_idx  = get_stop_idx(row, n, cnnz, row_ind);
 
-    int row_nnz       = 0;
-    int out_start_idx = start_idx * 2;
+    nnz_t row_nnz       = 0;
+    nnz_t out_start_idx = start_idx * 2;
 
-    for (int idx = 0; idx < stop_idx - start_idx; idx++) {
-      int cur_row = rows[idx + start_idx];
-      int cur_col = cols[idx + start_idx];
-      T cur_val   = vals[idx + start_idx];
+    for (nnz_t idx = 0; idx < stop_idx - start_idx; idx++) {
+      int cur_row = rows[start_idx + idx];
+      int cur_col = cols[start_idx + idx];
+      T cur_val   = vals[start_idx + idx];
 
       int lookup_row = cur_col;
-      int t_start    = row_ind[lookup_row];  // Start at
-      int t_stop     = get_stop_idx(lookup_row, n, cnnz, row_ind);
+      nnz_t t_start = row_ind[lookup_row];  // Start at
+      nnz_t t_stop  = get_stop_idx(lookup_row, n, cnnz, row_ind);
 
       T transpose      = 0.0;
       bool found_match = false;
-      for (int t_idx = t_start; t_idx < t_stop; t_idx++) {
+      for (nnz_t t_idx = t_start; t_idx < t_stop; t_idx++) {
         // If we find a match, let's get out of the loop. We won't
         // need to modify the transposed value, since that will be
        // done in a different thread.
@@ -131,9 +131,9 @@ RAFT_KERNEL coo_symmetrize_kernel(int* row_ind,
  * @param reduction_op: a custom reduction function
  * @param stream: cuda stream to use
 */
-template
-void coo_symmetrize(COO* in,
-                    COO* out,
+template
+void coo_symmetrize(COO* in,
+                    COO* out,
                     Lambda reduction_op,  // two-argument reducer
                     cudaStream_t stream)
 {
@@ -142,7 +142,7 @@ void coo_symmetrize(COO* in,
   ASSERT(!out->validate_mem(), "Expecting unallocated COO for output");
 
-  rmm::device_uvector in_row_ind(in->n_rows, stream);
+  rmm::device_uvector in_row_ind(in->n_rows, stream);
 
   convert::sorted_coo_to_csr(in, in_row_ind.data(), stream);
 
@@ -324,15 +324,15 @@ void from_knn_symmetrize_matrix(const value_idx* __restrict__ knn_indices,
 /**
  * Symmetrizes a COO matrix
 */
-template
+template
 void symmetrize(raft::resources const& handle,
                 const value_idx* rows,
                 const value_idx* cols,
                 const value_t* vals,
-                size_t m,
-                size_t n,
-                size_t nnz,
-                raft::sparse::COO& out)
+                value_idx m,
+                value_idx n,
+                nnz_t nnz,
+                raft::sparse::COO& out)
 {
   auto stream = resource::get_cuda_stream(handle);
 
@@ -352,7 +352,7 @@ void symmetrize(raft::resources const& handle,
   // sort COO
   raft::sparse::op::coo_sort((value_idx)m,
                              (value_idx)n,
-                             (value_idx)nnz * 2,
+                             static_cast(nnz) * 2,
                              symm_rows.data(),
                              symm_cols.data(),
                              symm_vals.data(),
diff --git a/cpp/include/raft/sparse/linalg/norm.cuh b/cpp/include/raft/sparse/linalg/norm.cuh
index 7adf245abc..f90d088ee6 100644
--- a/cpp/include/raft/sparse/linalg/norm.cuh
+++ b/cpp/include/raft/sparse/linalg/norm.cuh
@@ -36,11 +36,11 @@ namespace linalg {
  * @param result: l1 normalized data array
  * @param stream: cuda stream to use
 */
-template
-void csr_row_normalize_l1(const int* ia,   // csr row ex_scan (sorted by row)
+template
+void csr_row_normalize_l1(const indT* ia,  // csr row ex_scan (sorted by row)
                           const T* vals,
-                          int nnz,   // array of values and number of non-zeros
-                          int m,     // num rows in csr
+                          indT nnz,  // array of values and number of non-zeros
+                          int m,     // num rows in csr
                           T* result,
                           cudaStream_t stream)
 {  // output array
diff --git a/cpp/include/raft/sparse/linalg/spectral.cuh b/cpp/include/raft/sparse/linalg/spectral.cuh
index 276a64c125..c63d551bf2 100644
--- a/cpp/include/raft/sparse/linalg/spectral.cuh
+++ b/cpp/include/raft/sparse/linalg/spectral.cuh
@@ -23,13 +23,13 @@ namespace raft {
 namespace sparse {
 namespace spectral {
 
-template
+template
 void fit_embedding(raft::resources const& handle,
                    int* rows,
                    int* cols,
                    T* vals,
-                   int nnz,
-                   int n,
+                   nnz_t nnz,
+                   IndT n,
                    int n_components,
                    T* out,
                    unsigned long long seed = 1234567)
diff --git a/cpp/include/raft/sparse/linalg/symmetrize.cuh b/cpp/include/raft/sparse/linalg/symmetrize.cuh
index 8ee53cd3ae..64bab11233 100644
--- a/cpp/include/raft/sparse/linalg/symmetrize.cuh
+++ b/cpp/include/raft/sparse/linalg/symmetrize.cuh
@@ -148,15 +148,15 @@ void from_knn_symmetrize_matrix(const value_idx* __restrict__ knn_indices,
 /**
  * Symmetrizes a COO matrix
 */
-template
+template
 void symmetrize(raft::resources const& handle,
                 const value_idx* rows,
                 const value_idx* cols,
                 const value_t* vals,
-                size_t m,
-                size_t n,
-                size_t nnz,
-                raft::sparse::COO& out)
+                value_idx m,
+                value_idx n,
+                nnz_t nnz,
+                raft::sparse::COO& out)
 {
   detail::symmetrize(handle, rows, cols, vals, m, n, nnz, out);
 }
diff --git a/cpp/include/raft/sparse/neighbors/detail/cross_component_nn.cuh b/cpp/include/raft/sparse/neighbors/detail/cross_component_nn.cuh
index a47d5a6f34..3fea5cb330 100644
--- a/cpp/include/raft/sparse/neighbors/detail/cross_component_nn.cuh
+++ b/cpp/include/raft/sparse/neighbors/detail/cross_component_nn.cuh
@@ -448,10 +448,10 @@ void min_components_by_color(raft::sparse::COO& coo,
  * is done
  * @param[in] metric distance metric
 */
-template
+template
 void cross_component_nn(
   raft::resources const& handle,
-  raft::sparse::COO& out,
+  raft::sparse::COO& out,
   const value_t* X,
   const value_idx* orig_colors,
   size_t n_rows,
@@ -534,8 +534,14 @@ void cross_component_nn(
   /**
    * Symmetrize resulting edge list
   */
-  raft::sparse::linalg::symmetrize(
-    handle, min_edges.rows(), min_edges.cols(), min_edges.vals(), n_rows, n_rows, size, out);
+  raft::sparse::linalg::symmetrize(handle,
+                                   min_edges.rows(),
+                                   min_edges.cols(),
+                                   min_edges.vals(),
+                                   (value_idx)n_rows,
+                                   (value_idx)n_rows,
+                                   (nnz_t)size,
+                                   out);
 }
 
 };  // end namespace raft::sparse::neighbors::detail
diff --git a/cpp/include/raft/sparse/neighbors/detail/knn_graph.cuh b/cpp/include/raft/sparse/neighbors/detail/knn_graph.cuh
index 4e46904c83..ba007c6bb1 100644
--- a/cpp/include/raft/sparse/neighbors/detail/knn_graph.cuh
+++ b/cpp/include/raft/sparse/neighbors/detail/knn_graph.cuh
@@ -92,20 +92,20 @@ void conv_indices(in_t* inds, out_t* out, size_t size, cudaStream_t stream)
  * @param[out] out output edge list
  * @param c
 */
-template
+template
 void knn_graph(raft::resources const& handle,
                const value_t* X,
-               size_t m,
-               size_t n,
+               value_idx m,
+               value_idx n,
                raft::distance::DistanceType metric,
-               raft::sparse::COO& out,
+               raft::sparse::COO& out,
                int c = 15)
 {
   size_t k = build_k(m, c);
 
   auto stream = resource::get_cuda_stream(handle);
 
-  size_t nnz = m * k;
+  nnz_t nnz = m * k;
 
   rmm::device_uvector rows(nnz, stream);
   rmm::device_uvector indices(nnz, stream);
@@ -142,7 +142,7 @@ void knn_graph(raft::resources const& handle,
   conv_indices(int64_indices.data(), indices.data(), nnz, stream);
 
   raft::sparse::linalg::symmetrize(
-    handle, rows.data(), indices.data(), data.data(), m, k, nnz, out);
+    handle, rows.data(), indices.data(), data.data(), m, static_cast(k), nnz, out);
 }
 
 };  // namespace raft::sparse::neighbors::detail
diff --git a/cpp/include/raft/sparse/neighbors/knn_graph.cuh b/cpp/include/raft/sparse/neighbors/knn_graph.cuh
index 8257afc16f..6f318e7991 100644
--- a/cpp/include/raft/sparse/neighbors/knn_graph.cuh
+++ b/cpp/include/raft/sparse/neighbors/knn_graph.cuh
@@ -40,13 +40,13 @@ namespace raft::sparse::neighbors {
  * @param[out] out output edge list
  * @param c
 */
-template
+template
 void knn_graph(raft::resources const& handle,
                const value_t* X,
-               std::size_t m,
-               std::size_t n,
+               value_idx m,
+               value_idx n,
               raft::distance::DistanceType metric,
-               raft::sparse::COO& out,
+               raft::sparse::COO& out,
                int c = 15)
 {
   detail::knn_graph(handle, X, m, n, metric, out, c);
diff --git a/cpp/include/raft/sparse/op/detail/filter.cuh b/cpp/include/raft/sparse/op/detail/filter.cuh
index 3df85e6871..db2e3b858b 100644
--- a/cpp/include/raft/sparse/op/detail/filter.cuh
+++ b/cpp/include/raft/sparse/op/detail/filter.cuh
@@ -42,31 +42,31 @@ namespace sparse {
 namespace op {
 namespace detail {
 
-template
+template
 RAFT_KERNEL coo_remove_scalar_kernel(const int* rows,
                                      const int* cols,
                                      const T* vals,
-                                     int nnz,
-                                     int* crows,
-                                     int* ccols,
-                                     T* cvals,
-                                     int* ex_scan,
-                                     int* cur_ex_scan,
+                                     nnz_t nnz,
+                                     int* out_rows,
+                                     int* out_cols,
+                                     T* out_vals,
+                                     nnz_t* ex_scan,
+                                     nnz_t* cur_ex_scan,
                                      int m,
                                      T scalar)
 {
   int row = (blockIdx.x * TPB_X) + threadIdx.x;
 
   if (row < m) {
-    int start       = cur_ex_scan[row];
-    int stop        = get_stop_idx(row, m, nnz, cur_ex_scan);
-    int cur_out_idx = ex_scan[row];
+    nnz_t start       = cur_ex_scan[row];
+    nnz_t stop        = get_stop_idx(row, m, nnz, cur_ex_scan);
+    nnz_t cur_out_idx = ex_scan[row];
 
-    for (int idx = start; idx < stop; idx++) {
+    for (nnz_t idx = start; idx < stop; idx++) {
       if (vals[idx] != scalar) {
-        crows[cur_out_idx] = rows[idx];
-        ccols[cur_out_idx] = cols[idx];
-        cvals[cur_out_idx] = vals[idx];
+        out_rows[cur_out_idx] = rows[idx];
+        out_cols[cur_out_idx] = cols[idx];
+        out_vals[cur_out_idx] = vals[idx];
         ++cur_out_idx;
       }
     }
@@ -90,33 +90,33 @@ RAFT_KERNEL coo_remove_scalar_kernel(const int* rows,
  * @param d_alloc device allocator for temporary buffers
  * @param stream: cuda stream to use
 */
-template
-void coo_remove_scalar(const int* rows,
-                       const int* cols,
+template
+void coo_remove_scalar(const idx_t* rows,
+                       const idx_t* cols,
                        const T* vals,
-                       int nnz,
-                       int* crows,
-                       int* ccols,
+                       nnz_t nnz,
+                       idx_t* crows,
+                       idx_t* ccols,
                        T* cvals,
-                       int* cnnz,
-                       int* cur_cnnz,
+                       nnz_t* cnnz,
+                       nnz_t* cur_cnnz,
                        T scalar,
-                       int n,
+                       idx_t n,
                        cudaStream_t stream)
 {
-  rmm::device_uvector ex_scan(n, stream);
-  rmm::device_uvector cur_ex_scan(n, stream);
-
-  RAFT_CUDA_TRY(cudaMemsetAsync(ex_scan.data(), 0, n * sizeof(int), stream));
-  RAFT_CUDA_TRY(cudaMemsetAsync(cur_ex_scan.data(), 0, n * sizeof(int), stream));
-
-  thrust::device_ptr dev_cnnz    = thrust::device_pointer_cast(cnnz);
-  thrust::device_ptr dev_ex_scan = thrust::device_pointer_cast(ex_scan.data());
+  rmm::device_uvector ex_scan(n, stream);
+  rmm::device_uvector cur_ex_scan(n, stream);
+  RAFT_CUDA_TRY(cudaMemsetAsync(ex_scan.data(), 0, static_cast(n) * sizeof(nnz_t), stream));
+  RAFT_CUDA_TRY(
+    cudaMemsetAsync(cur_ex_scan.data(), 0, static_cast(n) * sizeof(nnz_t), stream));
+
+  thrust::device_ptr dev_cnnz    = thrust::device_pointer_cast(cnnz);
+  thrust::device_ptr dev_ex_scan = thrust::device_pointer_cast(ex_scan.data());
   thrust::exclusive_scan(rmm::exec_policy(stream), dev_cnnz, dev_cnnz + n, dev_ex_scan);
   RAFT_CUDA_TRY(cudaPeekAtLastError());
 
-  thrust::device_ptr dev_cur_cnnz    = thrust::device_pointer_cast(cur_cnnz);
-  thrust::device_ptr dev_cur_ex_scan = thrust::device_pointer_cast(cur_ex_scan.data());
+  thrust::device_ptr dev_cur_cnnz    = thrust::device_pointer_cast(cur_cnnz);
+  thrust::device_ptr dev_cur_ex_scan = thrust::device_pointer_cast(cur_ex_scan.data());
   thrust::exclusive_scan(rmm::exec_policy(stream), dev_cur_cnnz, dev_cur_cnnz + n, dev_cur_ex_scan);
   RAFT_CUDA_TRY(cudaPeekAtLastError());
 
@@ -145,39 +145,45 @@ void coo_remove_scalar(const int* rows,
  * @param scalar: scalar to remove from arrays
  * @param stream: cuda stream to use
 */
-template
-void coo_remove_scalar(COO* in, COO* out, T scalar, cudaStream_t stream)
+template
+void coo_remove_scalar(COO* in,
+                       COO* out,
+                       T scalar,
+                       cudaStream_t stream)
 {
-  rmm::device_uvector row_count_nz(in->n_rows, stream);
-  rmm::device_uvector row_count(in->n_rows, stream);
+  rmm::device_uvector row_count_nz(in->n_rows, stream);
+  rmm::device_uvector row_count(in->n_rows, stream);
 
-  RAFT_CUDA_TRY(cudaMemsetAsync(row_count_nz.data(), 0, in->n_rows * sizeof(int), stream));
-  RAFT_CUDA_TRY(cudaMemsetAsync(row_count.data(), 0, in->n_rows * sizeof(int), stream));
+  RAFT_CUDA_TRY(cudaMemsetAsync(
+    row_count_nz.data(), 0, static_cast(in->n_rows) * sizeof(nnz_t), stream));
+  RAFT_CUDA_TRY(
+    cudaMemsetAsync(row_count.data(), 0, static_cast(in->n_rows) * sizeof(nnz_t), stream));
 
   linalg::coo_degree(in->rows(), in->nnz, row_count.data(), stream);
   RAFT_CUDA_TRY(cudaPeekAtLastError());
 
-  linalg::coo_degree_scalar(in->rows(), in->vals(), in->nnz, scalar, row_count_nz.data(), stream);
+  linalg::coo_degree_scalar(
+    in->rows(), in->vals(), in->nnz, scalar, (unsigned long long int*)row_count_nz.data(), stream);
   RAFT_CUDA_TRY(cudaPeekAtLastError());
 
-  thrust::device_ptr d_row_count_nz = thrust::device_pointer_cast(row_count_nz.data());
-  int out_nnz =
+  thrust::device_ptr d_row_count_nz = thrust::device_pointer_cast(row_count_nz.data());
+  nnz_t out_nnz =
    thrust::reduce(rmm::exec_policy(stream), d_row_count_nz, d_row_count_nz + in->n_rows);
 
   out->allocate(out_nnz, in->n_rows, in->n_cols, false, stream);
 
-  coo_remove_scalar(in->rows(),
-                    in->cols(),
-                    in->vals(),
-                    in->nnz,
-                    out->rows(),
-                    out->cols(),
-                    out->vals(),
-                    row_count_nz.data(),
-                    row_count.data(),
-                    scalar,
-                    in->n_rows,
-                    stream);
+  coo_remove_scalar(in->rows(),
+                    in->cols(),
+                    in->vals(),
+                    in->nnz,
+                    out->rows(),
+                    out->cols(),
+                    out->vals(),
+                    row_count_nz.data(),
+                    row_count.data(),
+                    scalar,
+                    in->n_rows,
+                    stream);
   RAFT_CUDA_TRY(cudaPeekAtLastError());
 }
 
@@ -188,10 +194,10 @@ void coo_remove_scalar(COO* in, COO* out, T scalar, cudaStream_t stream)
  * @param out: output COO matrix
  * @param stream: cuda stream to use
 */
-template
-void coo_remove_zeros(COO* in, COO* out, cudaStream_t stream)
+template
+void coo_remove_zeros(COO* in, COO* out, cudaStream_t stream)
 {
-  coo_remove_scalar(in, out, T(0.0), stream);
+  coo_remove_scalar(in, out, T(0.0), stream);
 }
 
 };  // namespace detail
diff --git a/cpp/include/raft/sparse/op/detail/reduce.cuh b/cpp/include/raft/sparse/op/detail/reduce.cuh
index 1e5dd87958..2359628b78 100644
--- a/cpp/include/raft/sparse/op/detail/reduce.cuh
+++ b/cpp/include/raft/sparse/op/detail/reduce.cuh
@@ -44,13 +44,13 @@ namespace sparse {
 namespace op {
 namespace detail {
 
-template
+template
 RAFT_KERNEL compute_duplicates_diffs_kernel(const value_idx* rows,
                                             const value_idx* cols,
                                             value_idx* diff,
-                                            size_t nnz)
+                                            nnz_t nnz)
 {
-  size_t tid = blockDim.x * blockIdx.x + threadIdx.x;
+  nnz_t tid = blockDim.x * blockIdx.x + threadIdx.x;
   if (tid >= nnz) return;
 
   value_idx d = 1;
@@ -98,13 +98,13 @@ RAFT_KERNEL max_duplicates_kernel(const value_idx* src_rows,
 * @param[in] nnz number of nonzeros in input arrays
 * @param[in] stream cuda ops will be ordered wrt this stream
 */
-template
+template
 void compute_duplicates_mask(
-  value_idx* mask, const value_idx* rows, const value_idx* cols, size_t nnz, cudaStream_t stream)
+  value_idx* mask, const value_idx* rows, const value_idx* cols, nnz_t nnz, cudaStream_t stream)
 {
   RAFT_CUDA_TRY(cudaMemsetAsync(mask, 0, nnz * sizeof(value_idx), stream));
 
-  compute_duplicates_diffs_kernel<<>>(
+  compute_duplicates_diffs_kernel<<>>(
    rows, cols, mask, nnz);
 }
 
@@ -124,15 +124,15 @@ void compute_duplicates_mask(
 * @param[in] n number of columns in COO input matrix
 * @param[in] stream cuda ops will be ordered wrt this stream
 */
-template
+template
 void max_duplicates(raft::resources const& handle,
-                    raft::sparse::COO& out,
+                    raft::sparse::COO& out,
                     const value_idx* rows,
                     const value_idx* cols,
                     const value_t* vals,
-                    size_t nnz,
-                    size_t m,
-                    size_t n)
+                    nnz_t nnz,
+                    value_idx m,
+                    value_idx n)
 {
   auto stream        = resource::get_cuda_stream(handle);
   auto thrust_policy = resource::get_thrust_policy(handle);
@@ -153,7 +153,7 @@ void max_duplicates(raft::resources const& handle,
   out.allocate(size, m, n, true, stream);
 
   // perform reduce
-  max_duplicates_kernel<<>>(
max_duplicates_kernel<<>>( rows, cols, vals, diff.data() + 1, out.rows(), out.cols(), out.vals(), nnz); } diff --git a/cpp/include/raft/sparse/op/detail/sort.h b/cpp/include/raft/sparse/op/detail/sort.h index 02287c2367..2c5337bf0e 100644 --- a/cpp/include/raft/sparse/op/detail/sort.h +++ b/cpp/include/raft/sparse/op/detail/sort.h @@ -68,8 +68,8 @@ struct TupleComp { * @param vals vals array from coo matrix * @param stream: cuda stream to use */ -template -void coo_sort(IdxT m, IdxT n, IdxT nnz, IdxT* rows, IdxT* cols, T* vals, cudaStream_t stream) +template +void coo_sort(IdxT m, IdxT n, nnz_t nnz, IdxT* rows, IdxT* cols, T* vals, cudaStream_t stream) { auto coo_indices = thrust::make_zip_iterator(thrust::make_tuple(rows, cols)); @@ -83,10 +83,11 @@ void coo_sort(IdxT m, IdxT n, IdxT nnz, IdxT* rows, IdxT* cols, T* vals, cudaStr * @param in: COO to sort by row * @param stream: the cuda stream to use */ -template -void coo_sort(COO* const in, cudaStream_t stream) +template +void coo_sort(COO* const in, cudaStream_t stream) { - coo_sort(in->n_rows, in->n_cols, in->nnz, in->rows(), in->cols(), in->vals(), stream); + coo_sort( + in->n_rows, in->n_cols, in->nnz, in->rows(), in->cols(), in->vals(), stream); } /** @@ -99,9 +100,9 @@ void coo_sort(COO* const in, cudaStream_t stream) * @param[in] nnz number of edges in edge list * @param[in] stream cuda stream for which to order cuda operations */ -template +template void coo_sort_by_weight( - value_idx* rows, value_idx* cols, value_t* data, value_idx nnz, cudaStream_t stream) + value_idx* rows, value_idx* cols, value_t* data, nnz_t nnz, cudaStream_t stream) { thrust::device_ptr t_data = thrust::device_pointer_cast(data); diff --git a/cpp/include/raft/sparse/op/filter.cuh b/cpp/include/raft/sparse/op/filter.cuh index 4b329325ca..c257585c0e 100644 --- a/cpp/include/raft/sparse/op/filter.cuh +++ b/cpp/include/raft/sparse/op/filter.cuh @@ -42,21 +42,21 @@ namespace op { * @param n: number of rows in dense matrix * @param 
stream: cuda stream to use */ -template -void coo_remove_scalar(const int* rows, - const int* cols, +template +void coo_remove_scalar(const idx_t* rows, + const idx_t* cols, const T* vals, - int nnz, - int* crows, - int* ccols, + nnz_t nnz, + idx_t* crows, + idx_t* ccols, T* cvals, - int* cnnz, - int* cur_cnnz, + nnz_t* cnnz, + nnz_t* cur_cnnz, T scalar, - int n, + idx_t n, cudaStream_t stream) { - detail::coo_remove_scalar<128, T>( + detail::coo_remove_scalar<128, T, idx_t, nnz_t>( rows, cols, vals, nnz, crows, ccols, cvals, cnnz, cur_cnnz, scalar, n, stream); } @@ -68,10 +68,13 @@ void coo_remove_scalar(const int* rows, * @param scalar: scalar to remove from arrays * @param stream: cuda stream to use */ -template -void coo_remove_scalar(COO* in, COO* out, T scalar, cudaStream_t stream) +template +void coo_remove_scalar(COO* in, + COO* out, + T scalar, + cudaStream_t stream) { - detail::coo_remove_scalar<128, T>(in, out, scalar, stream); + detail::coo_remove_scalar<128, T, idx_t, nnz_t>(in, out, scalar, stream); } /** @@ -81,10 +84,10 @@ void coo_remove_scalar(COO* in, COO* out, T scalar, cudaStream_t stream) * @param out: output COO matrix * @param stream: cuda stream to use */ -template -void coo_remove_zeros(COO* in, COO* out, cudaStream_t stream) +template +void coo_remove_zeros(COO* in, COO* out, cudaStream_t stream) { - coo_remove_scalar(in, out, T(0.0), stream); + coo_remove_scalar(in, out, T(0.0), stream); } }; // namespace op diff --git a/cpp/include/raft/sparse/op/reduce.cuh b/cpp/include/raft/sparse/op/reduce.cuh index b03192f111..102e864943 100644 --- a/cpp/include/raft/sparse/op/reduce.cuh +++ b/cpp/include/raft/sparse/op/reduce.cuh @@ -68,15 +68,15 @@ void compute_duplicates_mask( * @param[in] m number of rows in COO input matrix * @param[in] n number of columns in COO input matrix */ -template +template void max_duplicates(raft::resources const& handle, - raft::sparse::COO& out, + raft::sparse::COO& out, const value_idx* rows, const value_idx* cols, 
const value_t* vals, - size_t nnz, - size_t m, - size_t n) + nnz_t nnz, + value_idx m, + value_idx n) { detail::max_duplicates(handle, out, rows, cols, vals, nnz, m, n); } diff --git a/cpp/include/raft/sparse/op/sort.cuh b/cpp/include/raft/sparse/op/sort.cuh index 5b8a792429..62231e561e 100644 --- a/cpp/include/raft/sparse/op/sort.cuh +++ b/cpp/include/raft/sparse/op/sort.cuh @@ -37,8 +37,8 @@ namespace op { * @param vals vals array from coo matrix * @param stream: cuda stream to use */ -template -void coo_sort(IdxT m, IdxT n, IdxT nnz, IdxT* rows, IdxT* cols, T* vals, cudaStream_t stream) +template +void coo_sort(IdxT m, IdxT n, nnz_t nnz, IdxT* rows, IdxT* cols, T* vals, cudaStream_t stream) { detail::coo_sort(m, n, nnz, rows, cols, vals, stream); } @@ -49,10 +49,11 @@ void coo_sort(IdxT m, IdxT n, IdxT nnz, IdxT* rows, IdxT* cols, T* vals, cudaStr * @param in: COO to sort by row * @param stream: the cuda stream to use */ -template -void coo_sort(COO* const in, cudaStream_t stream) +template +void coo_sort(COO* const in, cudaStream_t stream) { - coo_sort(in->n_rows, in->n_cols, in->nnz, in->rows(), in->cols(), in->vals(), stream); + coo_sort( + in->n_rows, in->n_cols, in->nnz, in->rows(), in->cols(), in->vals(), stream); } /** @@ -65,9 +66,9 @@ void coo_sort(COO* const in, cudaStream_t stream) * @param[in] nnz number of edges in edge list * @param[in] stream cuda stream for which to order cuda operations */ -template +template void coo_sort_by_weight( - value_idx* rows, value_idx* cols, value_t* data, value_idx nnz, cudaStream_t stream) + value_idx* rows, value_idx* cols, value_t* data, nnz_t nnz, cudaStream_t stream) { detail::coo_sort_by_weight(rows, cols, data, nnz, stream); } diff --git a/cpp/include/raft/sparse/solver/detail/lanczos.cuh b/cpp/include/raft/sparse/solver/detail/lanczos.cuh index ddfa01731a..71274333e8 100644 --- a/cpp/include/raft/sparse/solver/detail/lanczos.cuh +++ b/cpp/include/raft/sparse/solver/detail/lanczos.cuh @@ -123,18 +123,19 @@ 
inline curandStatus_t curandGenerateNormalX( * Workspace. Not needed if full reorthogonalization is disabled. * @return Zero if successful. Otherwise non-zero. */ -template -int performLanczosIteration(raft::resources const& handle, - spectral::matrix::sparse_matrix_t const* A, - index_type_t* iter, - index_type_t maxIter, - value_type_t shift, - value_type_t tol, - bool reorthogonalize, - value_type_t* __restrict__ alpha_host, - value_type_t* __restrict__ beta_host, - value_type_t* __restrict__ lanczosVecs_dev, - value_type_t* __restrict__ work_dev) +template +int performLanczosIteration( + raft::resources const& handle, + spectral::matrix::sparse_matrix_t const* A, + index_type_t* iter, + index_type_t maxIter, + value_type_t shift, + value_type_t tol, + bool reorthogonalize, + value_type_t* __restrict__ alpha_host, + value_type_t* __restrict__ beta_host, + value_type_t* __restrict__ lanczosVecs_dev, + value_type_t* __restrict__ work_dev) { // ------------------------------------------------------- // Variable declaration @@ -151,7 +152,7 @@ int performLanczosIteration(raft::resources const& handle, RAFT_EXPECTS(A != nullptr, "Null matrix pointer."); - index_type_t n = A->nrows_; + nnz_type_t n = A->nrows_; // ------------------------------------------------------- // Compute second Lanczos vector @@ -789,10 +790,10 @@ static int lanczosRestart(raft::resources const& handle, * @param seed random seed. * @return error flag. 
*/ -template +template int computeSmallestEigenvectors( raft::resources const& handle, - spectral::matrix::sparse_matrix_t const* A, + spectral::matrix::sparse_matrix_t const* A, index_type_t nEigVecs, index_type_t maxIter, index_type_t restartIter, @@ -814,7 +815,7 @@ int computeSmallestEigenvectors( constexpr value_type_t zero = 0; // Matrix dimension - index_type_t n = A->nrows_; + nnz_type_t n = A->nrows_; // Shift for implicit restart value_type_t shiftUpper; @@ -836,7 +837,8 @@ int computeSmallestEigenvectors( // ------------------------------------------------------- // Check that parameters are valid // ------------------------------------------------------- - RAFT_EXPECTS(nEigVecs > 0 && nEigVecs <= n, "Invalid number of eigenvectors."); + RAFT_EXPECTS(nEigVecs > 0 && static_cast(nEigVecs) <= n, + "Invalid number of eigenvectors."); RAFT_EXPECTS(restartIter > 0, "Invalid restartIter."); RAFT_EXPECTS(tol > 0, "Invalid tolerance."); RAFT_EXPECTS(maxIter >= nEigVecs, "Invalid maxIter."); @@ -887,17 +889,17 @@ int computeSmallestEigenvectors( // Obtain tridiagonal matrix with Lanczos *effIter = 0; *shift = 0; - status = performLanczosIteration(handle, - A, - effIter, - maxIter_curr, - *shift, - 0.0, - reorthogonalize, - alpha_host, - beta_host, - lanczosVecs_dev, - work_dev); + status = performLanczosIteration(handle, + A, + effIter, + maxIter_curr, + *shift, + 0.0, + reorthogonalize, + alpha_host, + beta_host, + lanczosVecs_dev, + work_dev); if (status) WARNING("error in Lanczos iteration"); // Determine largest eigenvalue @@ -912,17 +914,17 @@ int computeSmallestEigenvectors( // Obtain tridiagonal matrix with Lanczos *effIter = 0; - status = performLanczosIteration(handle, - A, - effIter, - maxIter_curr, - *shift, - 0, - reorthogonalize, - alpha_host, - beta_host, - lanczosVecs_dev, - work_dev); + status = performLanczosIteration(handle, + A, + effIter, + maxIter_curr, + *shift, + 0, + reorthogonalize, + alpha_host, + beta_host, + lanczosVecs_dev, + 
work_dev); if (status) WARNING("error in Lanczos iteration"); *totalIter += *effIter; @@ -960,17 +962,17 @@ int computeSmallestEigenvectors( // Proceed with Lanczos method - status = performLanczosIteration(handle, - A, - effIter, - maxIter_curr, - *shift, - tol * fabs(shiftLower), - reorthogonalize, - alpha_host, - beta_host, - lanczosVecs_dev, - work_dev); + status = performLanczosIteration(handle, + A, + effIter, + maxIter_curr, + *shift, + tol * fabs(shiftLower), + reorthogonalize, + alpha_host, + beta_host, + lanczosVecs_dev, + work_dev); if (status) WARNING("error in Lanczos iteration"); *totalIter += *effIter - iter_new; } @@ -1033,10 +1035,10 @@ int computeSmallestEigenvectors( return 0; } -template +template int computeSmallestEigenvectors( raft::resources const& handle, - spectral::matrix::sparse_matrix_t const& A, + spectral::matrix::sparse_matrix_t const& A, index_type_t nEigVecs, index_type_t maxIter, index_type_t restartIter, @@ -1136,10 +1138,10 @@ int computeSmallestEigenvectors( * @param seed random seed. * @return error flag. 
*/ -template +template int computeLargestEigenvectors( raft::resources const& handle, - spectral::matrix::sparse_matrix_t const* A, + spectral::matrix::sparse_matrix_t const* A, index_type_t nEigVecs, index_type_t maxIter, index_type_t restartIter, @@ -1160,7 +1162,7 @@ int computeLargestEigenvectors( constexpr value_type_t zero = 0; // Matrix dimension - index_type_t n = A->nrows_; + nnz_type_t n = A->nrows_; // Lanczos iteration counters index_type_t maxIter_curr = restartIter; // Maximum size of Lanczos system @@ -1183,7 +1185,8 @@ int computeLargestEigenvectors( // ------------------------------------------------------- // Check that parameters are valid // ------------------------------------------------------- - RAFT_EXPECTS(nEigVecs > 0 && nEigVecs <= n, "Invalid number of eigenvectors."); + RAFT_EXPECTS(nEigVecs > 0 && static_cast(nEigVecs) <= n, + "Invalid number of eigenvectors."); RAFT_EXPECTS(restartIter > 0, "Invalid restartIter."); RAFT_EXPECTS(tol > 0, "Invalid tolerance."); RAFT_EXPECTS(maxIter >= nEigVecs, "Invalid maxIter."); @@ -1234,17 +1237,17 @@ int computeLargestEigenvectors( value_type_t shift_val = 0.0; value_type_t* shift = &shift_val; - status = performLanczosIteration(handle, - A, - effIter, - maxIter_curr, - *shift, - 0, - reorthogonalize, - alpha_host, - beta_host, - lanczosVecs_dev, - work_dev); + status = performLanczosIteration(handle, + A, + effIter, + maxIter_curr, + *shift, + 0, + reorthogonalize, + alpha_host, + beta_host, + lanczosVecs_dev, + work_dev); if (status) WARNING("error in Lanczos iteration"); *totalIter += *effIter; @@ -1282,17 +1285,17 @@ int computeLargestEigenvectors( // Proceed with Lanczos method - status = performLanczosIteration(handle, - A, - effIter, - maxIter_curr, - *shift, - tol * fabs(shiftLower), - reorthogonalize, - alpha_host, - beta_host, - lanczosVecs_dev, - work_dev); + status = performLanczosIteration(handle, + A, + effIter, + maxIter_curr, + *shift, + tol * fabs(shiftLower), + reorthogonalize, + 
alpha_host, + beta_host, + lanczosVecs_dev, + work_dev); if (status) WARNING("error in Lanczos iteration"); *totalIter += *effIter - iter_new; } @@ -1383,10 +1386,10 @@ int computeLargestEigenvectors( return 0; } -template +template int computeLargestEigenvectors( raft::resources const& handle, - spectral::matrix::sparse_matrix_t const& A, + spectral::matrix::sparse_matrix_t const& A, index_type_t nEigVecs, index_type_t maxIter, index_type_t restartIter, diff --git a/cpp/include/raft/sparse/solver/lanczos.cuh b/cpp/include/raft/sparse/solver/lanczos.cuh index 4c45a28cc6..0617dc71da 100644 --- a/cpp/include/raft/sparse/solver/lanczos.cuh +++ b/cpp/include/raft/sparse/solver/lanczos.cuh @@ -137,10 +137,10 @@ auto lanczos_compute_smallest_eigenvectors( * @param seed random seed. * @return error flag. */ -template +template int computeSmallestEigenvectors( raft::resources const& handle, - raft::spectral::matrix::sparse_matrix_t const& A, + raft::spectral::matrix::sparse_matrix_t const& A, index_type_t nEigVecs, index_type_t maxIter, index_type_t restartIter, @@ -201,10 +201,10 @@ int computeSmallestEigenvectors( * @param seed random seed. * @return error flag. 
*/ -template +template int computeLargestEigenvectors( raft::resources const& handle, - raft::spectral::matrix::sparse_matrix_t const& A, + raft::spectral::matrix::sparse_matrix_t const& A, index_type_t nEigVecs, index_type_t maxIter, index_type_t restartIter, diff --git a/cpp/include/raft/spectral/detail/matrix_wrappers.hpp b/cpp/include/raft/spectral/detail/matrix_wrappers.hpp index db8a5dc9ef..af4a838bd5 100644 --- a/cpp/include/raft/spectral/detail/matrix_wrappers.hpp +++ b/cpp/include/raft/spectral/detail/matrix_wrappers.hpp @@ -134,7 +134,7 @@ class vector_t { const thrust_exec_policy_t thrust_policy; }; -template +template struct sparse_matrix_t { sparse_matrix_t(resources const& raft_handle, index_type const* row_offsets, @@ -142,7 +142,7 @@ struct sparse_matrix_t { value_type const* values, index_type const nrows, index_type const ncols, - index_type const nnz) + nnz_type const nnz) : handle_(raft_handle), row_offsets_(row_offsets), col_indices_(col_indices), @@ -158,7 +158,7 @@ struct sparse_matrix_t { index_type const* col_indices, value_type const* values, index_type const nrows, - index_type const nnz) + nnz_type const nnz) : handle_(raft_handle), row_offsets_(row_offsets), col_indices_(col_indices), @@ -311,18 +311,18 @@ struct sparse_matrix_t { value_type const* values_; index_type const nrows_; index_type const ncols_; - index_type const nnz_; + nnz_type const nnz_; }; -template -struct laplacian_matrix_t : sparse_matrix_t { +template +struct laplacian_matrix_t : sparse_matrix_t { laplacian_matrix_t(resources const& raft_handle, index_type const* row_offsets, index_type const* col_indices, value_type const* values, index_type const nrows, - index_type const nnz) - : sparse_matrix_t( + nnz_type const nnz) + : sparse_matrix_t( raft_handle, row_offsets, col_indices, values, nrows, nnz), diagonal_(raft_handle, nrows) { @@ -332,18 +332,18 @@ struct laplacian_matrix_t : sparse_matrix_t { } laplacian_matrix_t(resources const& raft_handle, - sparse_matrix_t 
const& csr_m) - : sparse_matrix_t(raft_handle, - csr_m.row_offsets_, - csr_m.col_indices_, - csr_m.values_, - csr_m.nrows_, - csr_m.nnz_), + sparse_matrix_t const& csr_m) + : sparse_matrix_t(raft_handle, + csr_m.row_offsets_, + csr_m.col_indices_, + csr_m.values_, + csr_m.nrows_, + csr_m.nnz_), diagonal_(raft_handle, csr_m.nrows_) { vector_t ones{raft_handle, (size_t)csr_m.nrows_}; ones.fill(1.0); - sparse_matrix_t::mv(1, ones.raw(), 0, diagonal_.raw()); + sparse_matrix_t::mv(1, ones.raw(), 0, diagonal_.raw()); } // y = alpha*A*x + beta*y @@ -357,9 +357,9 @@ struct laplacian_matrix_t : sparse_matrix_t { bool symmetric = false) const override { constexpr int BLOCK_SIZE = 1024; - auto n = sparse_matrix_t::nrows_; + auto n = sparse_matrix_t::nrows_; - auto handle = sparse_matrix_t::get_handle(); + auto handle = sparse_matrix_t::get_handle(); auto cublas_h = resource::get_cublas_handle(handle); auto stream = resource::get_cuda_stream(handle); @@ -382,31 +382,32 @@ struct laplacian_matrix_t : sparse_matrix_t { // Apply adjacency matrix // - sparse_matrix_t::mv(-alpha, x, 1, y, alg, transpose, symmetric); + sparse_matrix_t::mv( + -alpha, x, 1, y, alg, transpose, symmetric); } vector_t diagonal_; }; -template -struct modularity_matrix_t : laplacian_matrix_t { +template +struct modularity_matrix_t : laplacian_matrix_t { modularity_matrix_t(resources const& raft_handle, index_type const* row_offsets, index_type const* col_indices, value_type const* values, index_type const nrows, - index_type const nnz) - : laplacian_matrix_t( + nnz_type const nnz) + : laplacian_matrix_t( raft_handle, row_offsets, col_indices, values, nrows, nnz) { - edge_sum_ = laplacian_matrix_t::diagonal_.nrm1(); + edge_sum_ = laplacian_matrix_t::diagonal_.nrm1(); } modularity_matrix_t(resources const& raft_handle, - sparse_matrix_t const& csr_m) - : laplacian_matrix_t(raft_handle, csr_m) + sparse_matrix_t const& csr_m) + : laplacian_matrix_t(raft_handle, csr_m) { - edge_sum_ = 
laplacian_matrix_t::diagonal_.nrm1(); + edge_sum_ = laplacian_matrix_t::diagonal_.nrm1(); } // y = alpha*A*x + beta*y @@ -419,44 +420,45 @@ struct modularity_matrix_t : laplacian_matrix_t { bool transpose = false, bool symmetric = false) const override { - auto n = sparse_matrix_t::nrows_; + auto n = sparse_matrix_t::nrows_; - auto handle = sparse_matrix_t::get_handle(); + auto handle = sparse_matrix_t::get_handle(); auto cublas_h = resource::get_cublas_handle(handle); auto stream = resource::get_cuda_stream(handle); // y = A*x // - sparse_matrix_t::mv(alpha, x, 0, y, alg, transpose, symmetric); + sparse_matrix_t::mv( + alpha, x, 0, y, alg, transpose, symmetric); value_type dot_res; // gamma = d'*x // // Cublas::dot(this->n, D.raw(), 1, x, 1, &dot_res); // TODO: Call from public API when ready - RAFT_CUBLAS_TRY( - raft::linalg::detail::cublasdot(cublas_h, - n, - laplacian_matrix_t::diagonal_.raw(), - 1, - x, - 1, - &dot_res, - stream)); + RAFT_CUBLAS_TRY(raft::linalg::detail::cublasdot( + cublas_h, + n, + laplacian_matrix_t::diagonal_.raw(), + 1, + x, + 1, + &dot_res, + stream)); // y = y -(gamma/edge_sum)*d // value_type gamma_ = -dot_res / edge_sum_; // TODO: Call from public API when ready - RAFT_CUBLAS_TRY( - raft::linalg::detail::cublasaxpy(cublas_h, - n, - &gamma_, - laplacian_matrix_t::diagonal_.raw(), - 1, - y, - 1, - stream)); + RAFT_CUBLAS_TRY(raft::linalg::detail::cublasaxpy( + cublas_h, + n, + &gamma_, + laplacian_matrix_t::diagonal_.raw(), + 1, + y, + 1, + stream)); } value_type edge_sum_; diff --git a/cpp/include/raft/spectral/detail/partition.hpp b/cpp/include/raft/spectral/detail/partition.hpp index f5fc40aad6..26e4a73f9d 100644 --- a/cpp/include/raft/spectral/detail/partition.hpp +++ b/cpp/include/raft/spectral/detail/partition.hpp @@ -63,10 +63,14 @@ namespace detail { * performed. * @return statistics: number of eigensolver iterations, . 
*/ -template +template std::tuple partition( raft::resources const& handle, - spectral::matrix::sparse_matrix_t const& csr_m, + spectral::matrix::sparse_matrix_t const& csr_m, EigenSolver const& eigen_solver, ClusterSolver const& cluster_solver, vertex_t* __restrict__ clusters, @@ -94,7 +98,7 @@ std::tuple partition( // Initialize Laplacian /// sparse_matrix_t A{handle, graph}; - spectral::matrix::laplacian_matrix_t L{handle, csr_m}; + spectral::matrix::laplacian_matrix_t L{handle, csr_m}; auto eigen_config = eigen_solver.get_config(); auto nEigVecs = eigen_config.n_eigVecs; @@ -132,9 +136,9 @@ std::tuple partition( * @param cost On exit, partition cost function. * @return error flag. */ -template +template void analyzePartition(raft::resources const& handle, - spectral::matrix::sparse_matrix_t const& csr_m, + spectral::matrix::sparse_matrix_t const& csr_m, vertex_t nClusters, const vertex_t* __restrict__ clusters, weight_t& edgeCut, @@ -160,7 +164,7 @@ void analyzePartition(raft::resources const& handle, // Initialize Laplacian /// sparse_matrix_t A{handle, graph}; - spectral::matrix::laplacian_matrix_t L{handle, csr_m}; + spectral::matrix::laplacian_matrix_t L{handle, csr_m}; // Initialize output cost = 0; diff --git a/cpp/include/raft/spectral/detail/spectral_util.cuh b/cpp/include/raft/spectral/detail/spectral_util.cuh index 002fad9680..9bbc8878fe 100644 --- a/cpp/include/raft/spectral/detail/spectral_util.cuh +++ b/cpp/include/raft/spectral/detail/spectral_util.cuh @@ -133,16 +133,17 @@ struct equal_to_i_op { // Construct indicator vector for ith partition // -template -bool construct_indicator(raft::resources const& handle, - edge_t index, - edge_t n, - weight_t& clustersize, - weight_t& partStats, - vertex_t const* __restrict__ clusters, - raft::spectral::matrix::vector_t& part_i, - raft::spectral::matrix::vector_t& Bx, - raft::spectral::matrix::laplacian_matrix_t const& B) +template +bool construct_indicator( + raft::resources const& handle, + edge_t index, 
+ edge_t n, + weight_t& clustersize, + weight_t& partStats, + vertex_t const* __restrict__ clusters, + raft::spectral::matrix::vector_t& part_i, + raft::spectral::matrix::vector_t& Bx, + raft::spectral::matrix::laplacian_matrix_t const& B) { auto stream = resource::get_cuda_stream(handle); auto cublas_h = resource::get_cublas_handle(handle); diff --git a/cpp/include/raft/spectral/eigen_solvers.cuh b/cpp/include/raft/spectral/eigen_solvers.cuh index 324f16ac7b..03448e2b5e 100644 --- a/cpp/include/raft/spectral/eigen_solvers.cuh +++ b/cpp/include/raft/spectral/eigen_solvers.cuh @@ -51,7 +51,7 @@ struct lanczos_solver_t { index_type_t solve_smallest_eigenvectors( raft::resources const& handle, - matrix::sparse_matrix_t const& A, + matrix::sparse_matrix_t const& A, value_type_t* __restrict__ eigVals, value_type_t* __restrict__ eigVecs) const { @@ -75,7 +75,7 @@ struct lanczos_solver_t { index_type_t solve_largest_eigenvectors( raft::resources const& handle, - matrix::sparse_matrix_t const& A, + matrix::sparse_matrix_t const& A, value_type_t* __restrict__ eigVals, value_type_t* __restrict__ eigVecs) const { diff --git a/cpp/include/raft/spectral/partition.cuh b/cpp/include/raft/spectral/partition.cuh index a2ac328aa1..319ef0ccd1 100644 --- a/cpp/include/raft/spectral/partition.cuh +++ b/cpp/include/raft/spectral/partition.cuh @@ -45,17 +45,21 @@ namespace spectral { * @param eigVecs Output eigenvector array pointer on device * @return statistics: number of eigensolver iterations, . 
*/ -template +template std::tuple partition( raft::resources const& handle, - matrix::sparse_matrix_t const& csr_m, + matrix::sparse_matrix_t const& csr_m, EigenSolver const& eigen_solver, ClusterSolver const& cluster_solver, vertex_t* __restrict__ clusters, weight_t* eigVals, weight_t* eigVecs) { - return raft::spectral::detail::partition( + return raft::spectral::detail::partition( handle, csr_m, eigen_solver, cluster_solver, clusters, eigVals, eigVecs); } @@ -77,15 +81,15 @@ std::tuple partition( * @param edgeCut On exit, weight of edges cut by partition. * @param cost On exit, partition cost function. */ -template +template void analyzePartition(raft::resources const& handle, - matrix::sparse_matrix_t const& csr_m, + matrix::sparse_matrix_t const& csr_m, vertex_t nClusters, const vertex_t* __restrict__ clusters, weight_t& edgeCut, weight_t& cost) { - raft::spectral::detail::analyzePartition( + raft::spectral::detail::analyzePartition( handle, csr_m, nClusters, clusters, edgeCut, cost); } diff --git a/cpp/tests/linalg/eigen_solvers.cu b/cpp/tests/linalg/eigen_solvers.cu index cf75ff89bf..250deb6f33 100644 --- a/cpp/tests/linalg/eigen_solvers.cu +++ b/cpp/tests/linalg/eigen_solvers.cu @@ -36,6 +36,7 @@ TEST(Raft, EigenSolvers) using namespace matrix; using index_type = int; using value_type = double; + using nnz_type = int; raft::resources h; ASSERT_EQ(0, resource::get_device_id(h)); @@ -46,7 +47,7 @@ TEST(Raft, EigenSolvers) index_type nnz = 0; index_type nrows = 0; - sparse_matrix_t sm1{h, ro, ci, vs, nrows, nnz}; + sparse_matrix_t sm1{h, ro, ci, vs, nrows, nnz}; ASSERT_EQ(nullptr, sm1.row_offsets_); index_type neigvs{10}; @@ -64,7 +65,7 @@ TEST(Raft, EigenSolvers) eigen_solver_config_t cfg{ neigvs, maxiter, restart_iter, tol, reorthog, seed}; - lanczos_solver_t eig_solver{cfg}; + lanczos_solver_t eig_solver{cfg}; EXPECT_ANY_THROW(eig_solver.solve_smallest_eigenvectors(h, sm1, eigvals, eigvecs)); @@ -77,6 +78,7 @@ TEST(Raft, SpectralSolvers) using namespace 
matrix;
   using index_type = int;
   using value_type = double;
+  using nnz_type = int;

   raft::resources h;
   ASSERT_EQ(0, resource::get_device_id(h)
@@ -99,14 +101,14 @@ TEST(Raft, SpectralSolvers)
   eigen_solver_config_t eig_cfg{neigvs, maxiter, restart_iter, tol, reorthog, seed};
-  lanczos_solver_t eig_solver{eig_cfg};
+  lanczos_solver_t eig_solver{eig_cfg};

   index_type k{5};

   cluster_solver_config_t clust_cfg{k, maxiter, tol, seed};
   kmeans_solver_t cluster_solver{clust_cfg};

-  sparse_matrix_t sm{h, nullptr, nullptr, nullptr, 0, 0};
+  sparse_matrix_t sm{h, nullptr, nullptr, nullptr, 0, 0};

   EXPECT_ANY_THROW(
     spectral::partition(h, sm, eig_solver, cluster_solver, clusters, eigvals, eigvecs));

diff --git a/cpp/tests/sparse/reduce.cu b/cpp/tests/sparse/reduce.cu
index f777f4781d..eef54f1ebe 100644
--- a/cpp/tests/sparse/reduce.cu
+++ b/cpp/tests/sparse/reduce.cu
@@ -41,8 +41,8 @@ struct SparseReduceInputs {
   std::vector out_cols;
   std::vector out_vals;
-  size_t m;
-  size_t n;
+  value_idx m;
+  value_idx n;
 };

 template
@@ -73,15 +73,15 @@ class SparseReduceTest : public ::testing::TestWithParam
-    raft::sparse::COO out(stream);
+    raft::sparse::COO out(stream);
     raft::sparse::op::max_duplicates(handle,
                                      out,
                                      in_rows.data(),
                                      in_cols.data(),
                                      in_vals.data(),
-                                     params.in_rows.size(),
-                                     params.m,
-                                     params.n);
+                                     (value_idx)params.in_rows.size(),
+                                     (value_idx)params.m,
+                                     (value_idx)params.n);
     RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
     ASSERT_TRUE(raft::devArrMatch(
       out_rows.data(), out.rows(), out.nnz, raft::Compare()));

diff --git a/cpp/tests/sparse/solver/lanczos.cu b/cpp/tests/sparse/solver/lanczos.cu
index 128ab73747..3652b811e9 100644
--- a/cpp/tests/sparse/solver/lanczos.cu
+++ b/cpp/tests/sparse/solver/lanczos.cu
@@ -147,7 +147,7 @@ class rmat_lanczos_tests
   raft::device_vector out_data = raft::make_device_vector(handle, n_edges);
   raft::matrix::fill(handle, out_data.view(), 1.0);
-  raft::sparse::COO coo(stream);
+  raft::sparse::COO coo(stream);
   raft::sparse::op::coo_sort(n_nodes,
                              n_nodes,
@@ -161,11 +161,11 @@ class rmat_lanczos_tests
                              out_src.data_handle(),
                              out_dst.data_handle(),
                              out_data.data_handle(),
-                             n_edges,
-                             n_nodes,
-                             n_nodes);
+                             (IndexType)n_edges,
+                             (IndexType)n_nodes,
+                             (IndexType)n_nodes);

-  raft::sparse::COO symmetric_coo(stream);
+  raft::sparse::COO symmetric_coo(stream);
   raft::sparse::linalg::symmetrize(
     handle, coo.rows(), coo.cols(), coo.vals(), coo.n_rows, coo.n_cols, coo.nnz, symmetric_coo);
@@ -198,7 +198,7 @@ class rmat_lanczos_tests
     symmetric_coo.cols(),
     symmetric_coo.vals(),
     symmetric_coo.n_rows,
-    symmetric_coo.nnz};
+    (uint64_t)symmetric_coo.nnz};
   raft::sparse::solver::lanczos_solver_config config{
     n_components, params.maxiter, params.restartiter, params.tol, rng.seed};

diff --git a/cpp/tests/sparse/spectral_matrix.cu b/cpp/tests/sparse/spectral_matrix.cu
index 52f7eff10e..6d52dfb1bb 100644
--- a/cpp/tests/sparse/spectral_matrix.cu
+++ b/cpp/tests/sparse/spectral_matrix.cu
@@ -41,6 +41,7 @@ TEST(Raft, SpectralMatrices)
 {
   using index_type = int;
   using value_type = double;
+  using nnz_type = uint64_t;

   raft::resources h;
   ASSERT_EQ(0, raft::resource::get_device_id(h));
@@ -53,29 +54,33 @@ TEST(Raft, SpectralMatrices)
   index_type* ro{nullptr};
   index_type* ci{nullptr};
   value_type* vs{nullptr};
-  index_type nnz = 0;
+  nnz_type nnz = 0;
   index_type nrows = 0;
-  sparse_matrix_t sm1{h, ro, ci, vs, nrows, nnz};
-  sparse_matrix_t sm2{h, csr_v};
+  sparse_matrix_t sm1{h, ro, ci, vs, nrows, nnz};
+  sparse_matrix_t sm2{h, csr_v};
   ASSERT_EQ(nullptr, sm1.row_offsets_);
   ASSERT_EQ(nullptr, sm2.row_offsets_);

   auto stream = resource::get_cuda_stream(h);

   auto cnstr_lm1 = [&h, ro, ci, vs, nrows, nnz](void) {
-    laplacian_matrix_t lm1{h, ro, ci, vs, nrows, nnz};
+    laplacian_matrix_t lm1{h, ro, ci, vs, nrows, nnz};
   };
   EXPECT_ANY_THROW(cnstr_lm1());  // because of nullptr ptr args

-  auto cnstr_lm2 = [&h, &sm2](void) { laplacian_matrix_t lm2{h, sm2}; };
+  auto cnstr_lm2 = [&h, &sm2](void) {
+    laplacian_matrix_t lm2{h, sm2};
+  };
   EXPECT_ANY_THROW(cnstr_lm2());  // because of nullptr ptr args

   auto cnstr_mm1 = [&h, ro, ci, vs, nrows, nnz](void) {
-    modularity_matrix_t mm1{h, ro, ci, vs, nrows, nnz};
+    modularity_matrix_t mm1{h, ro, ci, vs, nrows, nnz};
   };
   EXPECT_ANY_THROW(cnstr_mm1());  // because of nullptr ptr args

-  auto cnstr_mm2 = [&h, &sm2](void) { modularity_matrix_t mm2{h, sm2}; };
+  auto cnstr_mm2 = [&h, &sm2](void) {
+    modularity_matrix_t mm2{h, sm2};
+  };
   EXPECT_ANY_THROW(cnstr_mm2());  // because of nullptr ptr args
 }

diff --git a/cpp/tests/sparse/symmetrize.cu b/cpp/tests/sparse/symmetrize.cu
index e1a74dc40b..358bbfaa83 100644
--- a/cpp/tests/sparse/symmetrize.cu
+++ b/cpp/tests/sparse/symmetrize.cu
@@ -32,9 +32,9 @@ namespace raft {
 namespace sparse {

-template
+template
 RAFT_KERNEL assert_symmetry(
-  value_idx* rows, value_idx* cols, value_t* vals, value_idx nnz, value_idx* sum)
+  value_idx* rows, value_idx* cols, value_t* vals, nnz_t nnz, value_idx* sum)
 {
   int tid = blockDim.x * blockIdx.x + threadIdx.x;
@@ -60,7 +60,7 @@ template
   return os;
 }

-template
+template
 class SparseSymmetrizeTest : public ::testing::TestWithParam> {
  public:
@@ -93,15 +93,15 @@ class SparseSymmetrizeTest
   {
     make_data();

-    value_idx m = params.indptr_h.size() - 1;
-    value_idx n = params.n_cols;
-    value_idx nnz = params.indices_h.size();
+    value_idx m = params.indptr_h.size() - 1;
+    value_idx n = params.n_cols;
+    nnz_t nnz = params.indices_h.size();

     rmm::device_uvector coo_rows(nnz, stream);
     raft::sparse::convert::csr_to_coo(indptr.data(), m, coo_rows.data(), nnz, stream);

-    raft::sparse::COO out(stream);
+    raft::sparse::COO out(stream);
     raft::sparse::linalg::symmetrize(
       handle, coo_rows.data(), indices.data(), data.data(), m, n, coo_rows.size(), out);

@@ -109,8 +109,8 @@ class SparseSymmetrizeTest
     rmm::device_scalar sum(stream);
     sum.set_value_to_zero_async(stream);

-    assert_symmetry<<>>(
-      out.rows(), out.cols(), out.vals(), out.nnz, sum.data());
+    assert_symmetry<<>>(
+      out.rows(), out.cols(), out.vals(), (nnz_t)out.nnz, sum.data());
     sum_h = sum.value(stream);
     resource::sync_stream(handle, stream);
@@ -211,7 +211,7 @@ const std::vector> symm_inputs_fint = {
 };

-typedef SparseSymmetrizeTest SparseSymmetrizeTestF_int;
+typedef SparseSymmetrizeTest SparseSymmetrizeTestF_int;
 TEST_P(SparseSymmetrizeTestF_int, Result) { ASSERT_TRUE(sum_h == 0); }

 INSTANTIATE_TEST_CASE_P(SparseSymmetrizeTest,

From 21e943da00b53063afccc010f856c14eedf63ee6 Mon Sep 17 00:00:00 2001
From: Kyle Edwards
Date: Wed, 12 Feb 2025 13:53:06 -0500
Subject: [PATCH 10/11] Create Conda CI test env in one step (#2580)

Issue: https://github.com/rapidsai/build-planning/issues/22

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Mike Sarahan (https://github.com/msarahan)

URL: https://github.com/rapidsai/raft/pull/2580
---
 ci/release/update-version.sh |  3 +++
 ci/test_cpp.sh               | 18 ++++++------------
 ci/test_python.sh            | 21 +++++++--------------
 dependencies.yaml            | 30 ++++++++++++++++++++++++++++++
 4 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/ci/release/update-version.sh b/ci/release/update-version.sh
index 244f66e99a..75a096f673 100755
--- a/ci/release/update-version.sh
+++ b/ci/release/update-version.sh
@@ -47,6 +47,9 @@ DEPENDENCIES=(
   pylibraft
   rmm
   rapids-dask-dependency
+  libraft-headers
+  raft-dask
+  libraft-tests
 )
 UCXX_DEPENDENCIES=(
   ucx-py

diff --git a/ci/test_cpp.sh b/ci/test_cpp.sh
index 64400858ec..851af74716 100755
--- a/ci/test_cpp.sh
+++ b/ci/test_cpp.sh
@@ -8,13 +8,17 @@ cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../

 . /opt/conda/etc/profile.d/conda.sh

-RAPIDS_VERSION="$(rapids-version)"
+CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
+RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${PWD}/test-results"}/
+mkdir -p "${RAPIDS_TESTS_DIR}"

 rapids-logger "Generate C++ testing dependencies"
 rapids-dependency-file-generator \
   --output conda \
   --file-key test_cpp \
-  --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch)" | tee env.yaml
+  --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch)" \
+  --prepend-channel "${CPP_CHANNEL}" \
+  | tee env.yaml

 rapids-mamba-retry env create --yes -f env.yaml -n test

@@ -23,18 +27,8 @@ set +u
 conda activate test
 set -u

-CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
-RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${PWD}/test-results"}/
-mkdir -p "${RAPIDS_TESTS_DIR}"
-
 rapids-print-env

-rapids-mamba-retry install \
-  --channel "${CPP_CHANNEL}" \
-  "libraft-headers=${RAPIDS_VERSION}" \
-  "libraft=${RAPIDS_VERSION}" \
-  "libraft-tests=${RAPIDS_VERSION}"
-
 rapids-logger "Check GPU usage"
 nvidia-smi

diff --git a/ci/test_python.sh b/ci/test_python.sh
index af93d2e04b..0b57173b03 100755
--- a/ci/test_python.sh
+++ b/ci/test_python.sh
@@ -8,13 +8,18 @@ cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../

 . /opt/conda/etc/profile.d/conda.sh

-RAPIDS_VERSION="$(rapids-version)"
+rapids-logger "Downloading artifacts from previous jobs"
+CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
+PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)

 rapids-logger "Generate Python testing dependencies"
 rapids-dependency-file-generator \
   --output conda \
   --file-key test_python \
-  --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee env.yaml
+  --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" \
+  --prepend-channel "${CPP_CHANNEL}" \
+  --prepend-channel "${PYTHON_CHANNEL}" \
+  | tee env.yaml

 rapids-mamba-retry env create --yes -f env.yaml -n test

@@ -23,24 +28,12 @@ set +u
 conda activate test
 set -u

-rapids-logger "Downloading artifacts from previous jobs"
-CPP_CHANNEL=$(rapids-download-conda-from-s3 cpp)
-PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)
-
 RAPIDS_TESTS_DIR=${RAPIDS_TESTS_DIR:-"${PWD}/test-results"}
 RAPIDS_COVERAGE_DIR=${RAPIDS_COVERAGE_DIR:-"${PWD}/coverage-results"}
 mkdir -p "${RAPIDS_TESTS_DIR}" "${RAPIDS_COVERAGE_DIR}"

 rapids-print-env

-rapids-mamba-retry install \
-  --channel "${CPP_CHANNEL}" \
-  --channel "${PYTHON_CHANNEL}" \
-  "libraft=${RAPIDS_VERSION}" \
-  "libraft-headers=${RAPIDS_VERSION}" \
-  "pylibraft=${RAPIDS_VERSION}" \
-  "raft-dask=${RAPIDS_VERSION}"
-
 rapids-logger "Check GPU usage"
 nvidia-smi

diff --git a/dependencies.yaml b/dependencies.yaml
index 225103391f..ab2431a1c4 100644
--- a/dependencies.yaml
+++ b/dependencies.yaml
@@ -29,6 +29,9 @@ files:
     includes:
       - cuda_version
       - test_libraft
+      - depends_on_libraft_headers
+      - depends_on_libraft
+      - depends_on_libraft_tests
   test_python:
     output: none
     includes:
@@ -37,6 +40,10 @@ files:
       - py_version
       - test_pylibraft
      - test_python_common
+      - depends_on_libraft
+      - depends_on_libraft_headers
+      - depends_on_pylibraft
+      - depends_on_raft_dask
   checks:
     output: none
     includes:
@@ -545,6 +552,9 @@ dependencies:
       # pip recognizes the index as a global option for the requirements.txt file
       - --extra-index-url=https://pypi.nvidia.com
      - --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple
+    - output_types: conda
+      packages:
+        - libraft==25.4.*,>=0.0.0a0
   specific:
     - output_types: [requirements, pyproject]
       matrices:
@@ -561,6 +571,26 @@ dependencies:
       - matrix:
         packages:
           - libraft==25.4.*,>=0.0.0a0
+  depends_on_libraft_headers:
+    common:
+      - output_types: conda
+        packages:
+          - libraft-headers==25.4.*,>=0.0.0a0
+  depends_on_pylibraft:
+    common:
+      - output_types: conda
+        packages:
+          - pylibraft==25.4.*,>=0.0.0a0
+  depends_on_raft_dask:
+    common:
+      - output_types: conda
+        packages:
+          - raft-dask==25.4.*,>=0.0.0a0
+  depends_on_libraft_tests:
+    common:
+      - output_types: conda
+        packages:
+          - libraft-tests==25.4.*,>=0.0.0a0
   depends_on_librmm:
     common:
       - output_types: conda

From 7af57c3936313ecb5fab8dc0d758a26eb8f533ca Mon Sep 17 00:00:00 2001
From: Jake Awe
Date: Thu, 13 Feb 2025 09:44:59 -0600
Subject: [PATCH 11/11] Update Changelog [skip ci]

---
 CHANGELOG.md | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1d7c641b21..a7f1d04beb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,59 @@
+# raft 25.02.00 (13 Feb 2025)
+
+## 🚨 Breaking Changes
+
+- Update pip devcontainers to UCX 1.18 ([#2550](https://github.com/rapidsai/raft/pull/2550)) [@jameslamb](https://github.com/jameslamb)
+- Switch over to rapids-logger ([#2530](https://github.com/rapidsai/raft/pull/2530)) [@vyasr](https://github.com/vyasr)
+- Adapt to rmm logger changes ([#2513](https://github.com/rapidsai/raft/pull/2513)) [@vyasr](https://github.com/vyasr)
+
+## 🐛 Bug Fixes
+
+- Rename test to tests. ([#2546](https://github.com/rapidsai/raft/pull/2546)) [@bdice](https://github.com/bdice)
+- Fix bit order of RMAT Rectangular Generator to match expectation ([#2542](https://github.com/rapidsai/raft/pull/2542)) [@mfoerste4](https://github.com/mfoerste4)
+- Fix broken link to python doc ([#2537](https://github.com/rapidsai/raft/pull/2537)) [@lowener](https://github.com/lowener)
+- Fix lanczos solver integer overflow ([#2536](https://github.com/rapidsai/raft/pull/2536)) [@viclafargue](https://github.com/viclafargue)
+- Fix rnd bit generation in rmat_rectangular_kernel ([#2524](https://github.com/rapidsai/raft/pull/2524)) [@tfeher](https://github.com/tfeher)
+
+## 📖 Documentation
+
+- Fix docs builds ([#2562](https://github.com/rapidsai/raft/pull/2562)) [@bdice](https://github.com/bdice)
+- [DOC] Fix sample codes ([#2518](https://github.com/rapidsai/raft/pull/2518)) [@enp1s0](https://github.com/enp1s0)
+
+## 🚀 New Features
+
+- Add cuda 12.8 support ([#2551](https://github.com/rapidsai/raft/pull/2551)) [@robertmaynard](https://github.com/robertmaynard)
+- Add support for different data type of bitset ([#2535](https://github.com/rapidsai/raft/pull/2535)) [@lowener](https://github.com/lowener)
+- [Feat] Support `bitset_to_csr` ([#2523](https://github.com/rapidsai/raft/pull/2523)) [@rhdong](https://github.com/rhdong)
+- Remove upper bounds on cuda-python to allow 12.6.2 and 11.8.5 ([#2517](https://github.com/rapidsai/raft/pull/2517)) [@bdice](https://github.com/bdice)
+
+## 🛠️ Improvements
+
+- Revert CUDA 12.8 shared workflow branch changes ([#2560](https://github.com/rapidsai/raft/pull/2560)) [@vyasr](https://github.com/vyasr)
+- Build and test with CUDA 12.8.0 ([#2555](https://github.com/rapidsai/raft/pull/2555)) [@bdice](https://github.com/bdice)
+- Update pip devcontainers to UCX 1.18 ([#2550](https://github.com/rapidsai/raft/pull/2550)) [@jameslamb](https://github.com/jameslamb)
+- use dynamic CUDA wheels on CUDA 11 ([#2548](https://github.com/rapidsai/raft/pull/2548)) [@jameslamb](https://github.com/jameslamb)
+- Normalize whitespace ([#2547](https://github.com/rapidsai/raft/pull/2547)) [@bdice](https://github.com/bdice)
+- Use cuda.bindings layout. ([#2545](https://github.com/rapidsai/raft/pull/2545)) [@bdice](https://github.com/bdice)
+- Revert "Introduction of the `raft::device_resources_snmg` type (#2487)" ([#2543](https://github.com/rapidsai/raft/pull/2543)) [@cjnolet](https://github.com/cjnolet)
+- Add missing `#include <cstdint>` ([#2540](https://github.com/rapidsai/raft/pull/2540)) [@jakirkham](https://github.com/jakirkham)
+- Use GCC 13 in CUDA 12 conda builds. ([#2539](https://github.com/rapidsai/raft/pull/2539)) [@bdice](https://github.com/bdice)
+- Use rapids-cmake for the logger ([#2534](https://github.com/rapidsai/raft/pull/2534)) [@vyasr](https://github.com/vyasr)
+- Check if nightlies have succeeded recently enough ([#2533](https://github.com/rapidsai/raft/pull/2533)) [@vyasr](https://github.com/vyasr)
+- remove unused 'joblib' and 'numba' dependencies, other packaging cleanup ([#2532](https://github.com/rapidsai/raft/pull/2532)) [@jameslamb](https://github.com/jameslamb)
+- introduce libraft wheels ([#2531](https://github.com/rapidsai/raft/pull/2531)) [@jameslamb](https://github.com/jameslamb)
+- Switch over to rapids-logger ([#2530](https://github.com/rapidsai/raft/pull/2530)) [@vyasr](https://github.com/vyasr)
+- reduce duplication, removed unused things in dependencies.yaml ([#2529](https://github.com/rapidsai/raft/pull/2529)) [@jameslamb](https://github.com/jameslamb)
+- Update cuda-python lower bounds to 12.6.2 / 11.8.5 ([#2522](https://github.com/rapidsai/raft/pull/2522)) [@bdice](https://github.com/bdice)
+- [Opt] Optimizing the performance of `bitmap_to_csr` ([#2516](https://github.com/rapidsai/raft/pull/2516)) [@rhdong](https://github.com/rhdong)
+- prefer system install of UCX in devcontainers, update outdated RAPIDS references ([#2514](https://github.com/rapidsai/raft/pull/2514)) [@jameslamb](https://github.com/jameslamb)
+- Adapt to rmm logger changes ([#2513](https://github.com/rapidsai/raft/pull/2513)) [@vyasr](https://github.com/vyasr)
+- Require approval to run CI on draft PRs ([#2512](https://github.com/rapidsai/raft/pull/2512)) [@bdice](https://github.com/bdice)
+- Shrink wheel size limit following removal of vector search APIs. ([#2509](https://github.com/rapidsai/raft/pull/2509)) [@bdice](https://github.com/bdice)
+- Forward-merge branch-24.12 to branch-25.02 ([#2508](https://github.com/rapidsai/raft/pull/2508)) [@bdice](https://github.com/bdice)
+- Introduction of the `raft::device_resources_snmg` type ([#2487](https://github.com/rapidsai/raft/pull/2487)) [@viclafargue](https://github.com/viclafargue)
+- Add breaking change workflow trigger ([#2482](https://github.com/rapidsai/raft/pull/2482)) [@AyodeAwe](https://github.com/AyodeAwe)
+- Remove 'sample' parameter from stats::mean API ([#2389](https://github.com/rapidsai/raft/pull/2389)) [@mfoerste4](https://github.com/mfoerste4)
+
 # raft 24.12.00 (11 Dec 2024)

 ## 🚨 Breaking Changes