Add Support for ROCM (AMD GPUs) #99
Labels
enhancement
New feature or request
Comments
This will require the purchase of some AMD hardware, since I only have NVIDIA-based systems.
stephanecharette added a commit that referenced this issue on Dec 31, 2024
stephanecharette added a commit that referenced this issue on Dec 31, 2024
stephanecharette added a commit that referenced this issue on Dec 31, 2024
stephanecharette added a commit that referenced this issue on Jan 2, 2025
stephanecharette added a commit that referenced this issue on Jan 2, 2025: … variables that seem to be unused (issue #99)
stephanecharette added a commit that referenced this issue on Jan 2, 2025
stephanecharette added a commit that referenced this issue on Jan 6, 2025
stephanecharette added a commit that referenced this issue on Jan 7, 2025
stephanecharette added a commit that referenced this issue on Jan 7, 2025
stephanecharette added a commit that referenced this issue on Jan 13, 2025
stephanecharette added a commit that referenced this issue on Jan 14, 2025
stephanecharette added a commit that referenced this issue on Jan 14, 2025
stephanecharette added a commit that referenced this issue on Jan 14, 2025: …its_reversed_diagonale() link problems (issue #99)
Summary
NVIDIA GPUs are ubiquitous and darknet supports them well. However, it would be nice to have the option of using AMD GPUs, which tend to be less expensive and often have more VRAM available for darknet training and usage than many of their NVIDIA counterparts.
Though there are two options for non-NVIDIA support (OpenCL and HIP/ROCm), the mapping between HIP and CUDA calls should be close to one-to-one, which should make HIP the easier of the two to support. Various papers and research surveys have also noted that OpenCL is generally slower (often by a factor of 2-8x, depending on the operation).
During a hackathon, a group submitted a pull request to the old darknet repo that added HIP support. The integration was crude: #defines were used to replace CUDA calls with direct HIP calls. It would be appreciated if that pull request, and/or the existing tools for CUDA->HIP porting, could be reviewed with a view to supporting modern AMD GPU optimizations, so that .weights models can be loaded directly onto the GPU without conversion to ONNX.
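To illustrate what a near one-to-one mapping means in practice, here is a minimal sketch of a source-to-source rename in the spirit of the CUDA->HIP porting tools. It is a toy, not the hackathon pull request and not a real porting tool: the function name `cuda_to_hip` and the table `CUDA_TO_HIP` are made up for this example, and the table covers only a small hand-picked set of runtime names. A real port would also have to deal with kernel launches, cuBLAS/cuDNN calls and build system changes.

```python
import re

# Hand-picked subset of CUDA runtime identifiers and their HIP counterparts.
# This table is illustrative only; it is nowhere near complete.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaFree": "hipFree",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaGetLastError": "hipGetLastError",
    "cudaError_t": "hipError_t",
    "cudaSuccess": "hipSuccess",
}

# Match whole identifiers only, so names that are not in the table
# (for example cudaMallocManaged) are left untouched.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)


def cuda_to_hip(source: str) -> str:
    """Rename the known CUDA identifiers in `source` to their HIP equivalents."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(1)], source)


if __name__ == "__main__":
    snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaError_t status = cudaMalloc(&ptr, bytes);\n"
        "cudaMemcpy(ptr, host, bytes, cudaMemcpyHostToDevice);\n"
    )
    print(cuda_to_hip(snippet))
```

The #define approach described above does the same renaming inside the C preprocessor instead of rewriting the source files. Either way, the calls themselves keep their arguments and order, which is why HIP looks cheaper to support than a separate OpenCL code path.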
Goals