Does cuda support vgpu driver ？ #51

hw0505 · 2021-05-17T16:53:49Z

Hello, I have run into a problem. My GPU is a 2070 and the driver version is Linux KVM 7.9. When I install the vGPU driver and CUDA in the virtual machine, the vGPU driver and CUDA do not match. Does CUDA support vGPU? If so, which version of CUDA works with the vGPU driver? Thank you very much!

DualCoder · 2021-05-18T18:35:01Z

The CUDA versions supported by the different vGPU releases are listed here: https://docs.nvidia.com/cuda/vGPU/index.html

Additionally, some feature limitations exist, as documented here: https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/#cuda-open-cl-support-vgpu (in addition to the GPUs listed there, vgpu_unlock also adds support for most Pascal and Turing GPUs, so your 2070 is expected to work).

hw0505 · 2021-05-19T12:28:00Z

> The CUDA versions supported by the different vGPU releases are listed here: https://docs.nvidia.com/cuda/vGPU/index.html
>
> Additionally some feature limitations exist, as documented here: https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/#cuda-open-cl-support-vgpu (in addition to the GPUs listed vgpu_unlock also adds support for most Pascal and Turing GPUs, so your 2070 is expected to work).

Thanks again. Using the link you provided, I found the vGPU software version that corresponds to my CUDA version. I installed vGPU software release 12.2 (NVIDIA-Linux-x86_64-460.73.02-vgpu-kvm.run) and CUDA 11.2 (cuda_11.2.0_460.27.04_linux.run) in the virtual machine. When I run nvidia-smi, I can see the vGPU driver version but not the CUDA version. When I run a CUDA program (calling the "cudaGetDeviceCount" function), the error "CUDA driver version is insufficient for CUDA runtime version" appears. I have a few questions:

1. Have you successfully used the CUDA library in the virtual machine?
2. How do you verify that the vGPU generated by "vgpu_unlock" works normally in the virtual machine? Can you share your verification method? I want to reproduce it in my environment.

By the way, I did not assign a license to the vGPU in the virtual machine. Could this be the reason for the failure? In the vGPU wiki, I did not find the specific steps for assigning a license to a virtual machine.
I hope to hear from you soon!
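
As an editorial aside on the error quoted above: the CUDA runtime reports it when the CUDA version supported by the installed driver is lower than the version of the runtime library. Both versions are exposed as integers encoded as 1000 * major + 10 * minor (via cudaDriverGetVersion and cudaRuntimeGetVersion), and the driver version is reported as 0 when no CUDA driver is installed. The following is a minimal Python sketch of that comparison only. The function names are hypothetical and the input values are illustrative, not measured in this thread.

```python
from typing import Tuple


def decode_cuda_version(value: int) -> Tuple[int, int]:
    """Split CUDA's integer encoding (1000 * major + 10 * minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def driver_is_sufficient(driver_version: int, runtime_version: int) -> bool:
    """Return True when the driver supports at least the runtime's CUDA version."""
    return driver_version >= runtime_version


# 11020 encodes CUDA 11.2. A driver version of 0 is what the runtime
# reports when no CUDA-capable driver is installed (illustrative input).
runtime = 11020
for driver in (0, 11020, 11030):
    print(decode_cuda_version(driver), driver_is_sufficient(driver, runtime))
```

A driver value of 0 fails the check for any runtime version, which would be consistent with seeing this error when the installed driver offers no CUDA support at all.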

DualCoder · 2021-05-21T19:02:25Z

> I installed vGPU software release 12.2 (NVIDIA-Linux-x86_64-460.73.02-vgpu-kvm.run) and cuda 11.2 (cuda_11.2.0_460.27.04_linux.run) in the virtual machine.

That is not correct. The *-vgpu-kvm.run driver is meant to be installed on the host, not in the guest. This driver does not support CUDA, so you will not have CUDA support on the host. The *-grid.run driver is meant to be installed in the guest; it does support CUDA, so you can use CUDA in the guest (with Q and C profiles).
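
The naming rule in this reply can be turned into a quick check before running an installer. This is a minimal Python sketch that relies only on the two filename suffixes mentioned above; the function name driver_role is hypothetical, and any other naming scheme is reported as unknown rather than guessed.

```python
def driver_role(installer_name: str) -> str:
    """Classify an NVIDIA vGPU installer by its filename suffix.

    Only the two suffixes named in the reply are recognized:
    *-vgpu-kvm.run (host) and *-grid.run (guest).
    """
    if installer_name.endswith("-vgpu-kvm.run"):
        return "host"   # hypervisor driver, no CUDA support
    if installer_name.endswith("-grid.run"):
        return "guest"  # VM driver, CUDA available with Q and C profiles
    return "unknown"


for name in (
    "NVIDIA-Linux-x86_64-460.73.02-vgpu-kvm.run",
    "NVIDIA-Linux-x86_64-460.32.03-grid.run",
):
    print(name, "->", driver_role(name))
```

Applied to the filenames in this thread, the first installer is classified as a host driver, which is the mismatch described above.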

> Have you successfully used the cuda library in the virtual machine?

Yes, cuda_11.3.1_465.19.01_linux.run and NVIDIA-Linux-x86_64-460.32.03-grid.run installed in the VM.

> How do you verify that the vGPU generated by "vgpu_unlock" can work normally in the virtual machine? Can you share your verification method? I want to reproduce it in my environment.

To verify that CUDA and OpenCL run, I successfully executed the example scripts from hashcat 5.1.0 (https://hashcat.net). I did not manage to get any of the newer versions to work with CUDA, though, only with OpenCL. I have not done any performance evaluation of CUDA or OpenCL.

To verify that OpenGL works, I ran the Heaven Benchmark and checked that it runs at 60 fps (frame rate limiter enabled). I have run some other graphics benchmarks to compare performance, but my testing is incomplete and I do not have any reliable conclusions yet.

> By the way, I did not assign a license to the vGPU in the virtual machine. Could this be the reason for the failure? In the vGPU wiki, I did not find the specific steps for assigning a license to a virtual machine.

No, that is not the reason for the failure. Without a license, CUDA will work normally for 20 minutes after guest boot and then suffer severe performance penalties; see https://docs.nvidia.com/grid/latest/grid-licensing-user-guide/index.html#software-enforcement-grid-licensing for further details. That page also explains how to assign a license to the virtual machine.
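
When testing an unlicensed guest, it can help to know whether a run still falls inside that window, so that a slow result is not mistaken for a driver problem. A minimal Python sketch follows, assuming the 20 minute figure from the reply above and a Linux guest that exposes /proc/uptime; the function names are hypothetical.

```python
GRACE_MINUTES = 20  # figure quoted in the reply above, not verified here


def minutes_left_unlicensed(uptime_seconds: float,
                            grace_minutes: float = GRACE_MINUTES) -> float:
    """Minutes remaining before the unlicensed penalties are expected to start."""
    return max(0.0, grace_minutes - uptime_seconds / 60.0)


def read_uptime_seconds(path: str = "/proc/uptime") -> float:
    """Read seconds since boot from a Linux guest (first field of /proc/uptime)."""
    with open(path) as handle:
        return float(handle.read().split()[0])


# Illustrative values: 5 minutes and 1 hour after guest boot.
print(minutes_left_unlicensed(300))
print(minutes_left_unlicensed(3600))
```

Inside the guest, `minutes_left_unlicensed(read_uptime_seconds())` gives the same estimate from the live uptime.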

hw0505 · 2021-07-07T14:01:41Z

Thanks again! Have you ever tested vgpu_unlock on a GTX 1080 Ti or TITAN Xp? I tested it on these two GPUs using the samples that come with CUDA. The matrixMul sample reports "error: all devices have compute mode prohibited". The vectorAdd sample reports "error code all CUDA-capable devices are busy or unavailable".

DualCoder · 2021-07-11T15:15:40Z

Tested on TITAN X (Pascal):

Host:

$ nvidia-smi
Sun Jul 11 17:10:12 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.04    Driver Version: 460.32.04    CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    On   | 00000000:01:00.0 Off |                  N/A |
|  0%   33C    P8    18W / 250W |   8166MiB / 12287MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     18600      G   vgpu                             8126MiB |
+-----------------------------------------------------------------------------+

Guest:

$ nvidia-smi
Sun Jul 11 17:07:57 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID P40-8Q         Off  | 00000000:07:01.0  On |                  N/A |
| N/A   N/A    P8    N/A /  N/A |    666MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       738      G   /usr/lib/xorg/Xorg                 94MiB |
+-----------------------------------------------------------------------------+

$ ./matrixMul 
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Pascal" with compute capability 6.1

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 1203.68 GFlop/s, Time= 0.109 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performancemeasurements. Results may vary when GPU Boost is enabled.

$ ./vectorAdd_nvrtc 
> Using CUDA Device [0]: GRID P40-8Q
> Using CUDA Device [0]: GRID P40-8Q
> GPU Device has SM 6.1 compute capability
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

$ ./vectorAddMMAP 
Vector Addition (Driver API)
> Using CUDA Device [0]: GRID P40-8Q
Device 0 VIRTUAL ADDRESS MANAGEMENT SUPPORTED = 0.
Device 0 doesn't support VIRTUAL ADDRESS MANAGEMENT.
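
The two nvidia-smi headers above differ in one telling field: the host reports "CUDA Version: N/A" while the guest reports "CUDA Version: 11.2". A minimal Python sketch for extracting that field from the header line is given below. It assumes the header layout shown in this thread; the function name parse_smi_header is hypothetical, and other nvidia-smi versions may format the line differently.

```python
import re
from typing import Dict, Optional

# Matches the "Driver Version: ...  CUDA Version: ..." part of the header.
HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>\S+)\s+CUDA Version:\s*(?P<cuda>\S+)"
)


def parse_smi_header(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the driver and CUDA versions, with cuda=None when shown as N/A."""
    match = HEADER_RE.search(text)
    if match is None:
        return None  # layout not recognized
    cuda = match.group("cuda")
    return {
        "driver": match.group("driver"),
        "cuda": None if cuda == "N/A" else cuda,
    }


# Header lines copied from the host and guest output in this thread.
host_line = "| NVIDIA-SMI 460.32.04    Driver Version: 460.32.04    CUDA Version: N/A      |"
guest_line = "| NVIDIA-SMI 460.73.01    Driver Version: 460.73.01    CUDA Version: 11.2     |"
print(parse_smi_header(host_line))
print(parse_smi_header(guest_line))
```

A result with cuda set to None inside the guest would suggest that the installed driver is not the CUDA-capable guest driver.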

hw0505 · 2021-07-12T08:50:47Z

Following the information you provided, I created a vGPU of type P40-8Q. When I run the CUDA samples, they start running but then hang, as shown in the screenshots below.

[Screenshots not preserved: Host; Guest (TITAN Xp) running ./matrixMul and ./vectorAdd]

