
Matlab Mexcuda support for cufinufft #634

Draft
wants to merge 14 commits into base: master
Conversation

@lu1and10 (Member) commented Feb 18, 2025

Add gpuArray support to the Matlab interface, using cufinufft as the backend.

  • Matlab CUDA guru interface and simple interface
  • Test that CPU examples/tests don't break, with GPU checks included in the simple interface
  • Check the Matlab cufinufft_plan.m guru interface
  • Add gpuArray versions of check_finufft.m and check_finufft_single.m, plus math tests and examples
  • Check that the simple interface works for gpuArray
  • Add cufinufft_plan.docsrc
  • Adjust the simple interface docsrc to incorporate cufinufft support; add GPU opts docs (CPU opts differ)
  • Add instructions for building with mexcuda to docs/install_gpu.rst
  • Update docs/matlab.rst
  • Fix clang-format to exclude the mwrap-generated finufft.cpp and cufinufft.cu


.. code-block:: bash

   cmake -S . -B build -D FINUFFT_USE_CUDA=ON -D FINUFFT_STATIC_LINKING=OFF -D CMAKE_VERBOSE_MAKEFILE:BOOL=ON -D FINUFFT_CUDA_ARCHITECTURES="60;70;80;90"
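After configuring as above, the library is compiled with the standard CMake build driver; a minimal sketch (the `build` directory name is taken from the configure command above):

```shell
# Compile cuFINUFFT for all architectures requested at configure time
cmake --build build --parallel
```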
Collaborator


We don't really need to specify the architectures, as it should use native by default.

Member Author


Yes, we don't have to. But I find it a bit annoying: I compiled on my FI cluster workstation with a P4000 card, and later testing on V100 and A100 cards meant recompiling every time. At least for cluster users, compiling for several architectures up front saves some time.

@ahbarnett (Collaborator) commented:

Nice progress. Are you making a PR for mwrap on GitHub too (so we can see the new formats for gpuArrays)? It will be exciting to have that as a general tool as well (I could add examples to the mwrapdemo repo...). Let me know if you need any help.

@DiamonDinoia DiamonDinoia added this to the 2.4 milestone Feb 18, 2025
@lu1and10 (Member, Author) commented:

> Are you making a PR for mwrap github too? (so we can see the new formats for gpuarrays). Will be exciting to have that as a general tool too (I could add examples to mwrapdemo repo...).

Yes, I should make a PR for mwrap. I tested mwrap with a simple demo (https://github.com/lu1and10/mwrap/blob/gpu/testing/test_gpu.mw) and with our cufinufft code wrap. I should do more cleanup and add more tests: C/Fortran, and different combinations of GPU arrays of integer, complex, and floating-point types with the in/out/inout tokens. Maybe I should make a draft PR @zgimbutas and add more tests for mwrap later.

> Let me know if need any help.

Yes, I think it will be useful to go through the mexcuda install/compile process on different machines and Matlab versions to see what kinds of problems we might encounter. For now, I have tested on the FI cluster and @haiszhu has tested on his Ubuntu machine.

@DiamonDinoia (Collaborator) commented Feb 19, 2025 via email

@lu1and10 (Member, Author) commented:

> Then I suggest using all-major as the supported architectures depends on the cuda version installed. This fails with cuda 11.3 for example.

That is why there is a line saying "You may adjust FINUFFT_CUDA_ARCHITECTURES to generate the code for different compute capabilities."
I think it's a use case of FINUFFT_CUDA_ARCHITECTURES, which you also use here: https://github.com/DiamonDinoia/finufft/blob/d10220f23ae472e660006d62fe5d4638e1a7d4e6/.github/workflows/build_cufinufft_wheels.yml#L61
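The all-major suggestion above would correspond to a configure line like the following (a sketch, not from the PR itself; `all-major` asks CMake to build for every major compute capability the installed CUDA toolkit supports, rather than pinning a fixed list):

```shell
cmake -S . -B build -D FINUFFT_USE_CUDA=ON -D FINUFFT_CUDA_ARCHITECTURES=all-major
```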

@zgimbutas commented Feb 19, 2025 via email

@lu1and10 (Member, Author) commented Feb 19, 2025

> Then I suggest using all-major as the supported architectures depends on the cuda version installed. This fails with cuda 11.3 for example.

@DiamonDinoia btw, a side finding is that the current https://finufft.readthedocs.io/en/latest/install_gpu.html#cmake-installation does not give correct install instructions; i.e., `cmake -D FINUFFT_USE_CUDA=ON -D CMAKE_CUDA_ARCHITECTURES=80 ..` will not generate code for compute capability 8.0. CMAKE_CUDA_ARCHITECTURES is never assigned to FINUFFT_CUDA_ARCHITECTURES in the current CMakeLists.txt: the check

if(NOT DEFINED FINUFFT_CUDA_ARCHITECTURES)

is always false, since FINUFFT_CUDA_ARCHITECTURES defaults to native and is therefore always defined.
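A minimal sketch of one possible fix for the logic described above (hypothetical, not the actual CMakeLists.txt: the point is to decide the fallback before any default is applied, so a user-supplied CMAKE_CUDA_ARCHITECTURES is honored too):

```cmake
# Hypothetical: only fall back to "native" when the user set neither variable,
# so -D CMAKE_CUDA_ARCHITECTURES=80 also takes effect.
if(NOT DEFINED FINUFFT_CUDA_ARCHITECTURES)
  if(DEFINED CMAKE_CUDA_ARCHITECTURES)
    set(FINUFFT_CUDA_ARCHITECTURES ${CMAKE_CUDA_ARCHITECTURES})
  else()
    set(FINUFFT_CUDA_ARCHITECTURES native)
  endif()
endif()
```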

4 participants