-
AMD Mobile GPU Usage: How To Tell?

I am new to llama-cpp-python, though I have worked with it quite a bit over the past few weeks. How does one determine whether llama-cpp-python is actually using the AMD GPU?

I have a laptop with an AMD Radeon 6650M discrete GPU; the CPU also has integrated graphics, and as I understand it the OS typically switches between the two. The system has 32GB of DDR memory and higher-end NVMe drives (I added an additional 2TB drive and more memory to the base unit). I run Pop!_OS 22.04 (based on Ubuntu 22.04) and have AMD ROCm 6.0 installed. I compile llama-cpp-python with something like the command sketched below.

I have researched the "how do you tell" question but found little information on AMD cards. Some sites suggest looking for a gpu_offload flag or similar text in the llama.cpp model report, but no matter what I do I see no such flags. If I run System Monitor (the Linux analog of the Windows Task Manager) while running a prompt, I see high usage across all 16 CPU threads, but nothing in the llama.cpp model report or anywhere else that indicates the GPU is being used.

Because my laptop has both discrete and integrated graphics, I also recently found the main_gpu option for llama.cpp and have tried setting it to 0 (the onboard GPU) or 1 (the 6650M discrete GPU).

Has anyone else used llama-cpp-python with an AMD configuration similar to mine? Are there any ideas, for AMD hardware, on how to tell whether the GPU is used at all? (I appreciate help from NVIDIA folks too, but the NVIDIA and AMD tools seem radically different, and I cannot find any way on Linux to detect whether the AMD GPU is being used.) Any help/discussion/suggestions for a newbie would be appreciated.
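The build command was something along these lines; the flag names are from memory and vary between llama-cpp-python versions, and the HSA_OVERRIDE_GFX_VERSION workaround for RDNA2 mobile chips is an assumption on my part, so treat this as a sketch rather than the exact command:

```bash
# Sketch of a ROCm/hipBLAS build of llama-cpp-python; the CMake flag name
# differs between versions (older releases used -DLLAMA_HIPBLAS=on,
# newer ones use -DGGML_HIPBLAS=on / -DGGML_HIP=on).
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" \
  pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir

# RDNA2 mobile parts such as the 6650M (gfx1032) are often run with this
# runtime override so ROCm treats them as the supported gfx1030 target
# (assumed workaround, not something I have confirmed on this laptop).
export HSA_OVERRIDE_GFX_VERSION=10.3.0
```

On the Python side, the main_gpu setting I mentioned is passed to the Llama constructor. A minimal sketch of how I call it is below (the model path is a placeholder); with verbose=True, a GPU-enabled build prints a load report that includes lines about the detected ROCm/HIP devices and text along the lines of "offloaded X/Y layers to GPU", which is the kind of flag I have been looking for:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/your-model.gguf",  # placeholder path
    n_gpu_layers=-1,  # -1 asks llama.cpp to offload all layers; 0 keeps everything on the CPU
    main_gpu=0,       # device index; which index maps to which GPU depends on what ROCm exposes
    verbose=True,     # print the llama.cpp load report to stderr
)
```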
Sample of llama.cpp Model Output (Parameters, OUTPUT)
Sample of clinfo Output
Replies: 2 comments
-
Check VRAM usage with radeontop.
-
Thank you @stduhpf! This worked well. For any other newbies who find this post for AMD cards: I installed radeontop, run it in a terminal while the model is generating, and it has a manpage; it can also produce color output. A hedged sketch of the commands is below.

NOTE: @stduhpf's response is very helpful. However, the underlying problem I encountered is resolved in this post: #1066. The underlying problem was building llama-cpp-python. The instructions in that post correctly built llama-cpp-python and enabled ROCm support for LLM use. Carefully note the build tags in that other post.
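A minimal sketch of the radeontop workflow on a Debian/Ubuntu-based system such as Pop!_OS; the package name and the -c color option are assumptions, so check `man radeontop` on your own system:

```bash
# Install radeontop (package name assumed for the Ubuntu/Pop!_OS repositories)
sudo apt install radeontop

# Run it in a terminal while llama-cpp-python is generating;
# watch the graphics pipe and VRAM usage bars for activity.
radeontop

# Color output (flag assumed; see man radeontop)
radeontop -c
```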