@sguada published the first profiling results in #81. Starting from there, we can analyze key performance factors such as occupancy in more depth to improve device utilization. Occupancy is the ratio of active warps to the maximum number of warps the GPU can keep resident, measured per multiprocessor. Both the CUDA Visual Profiler and the CUDA Occupancy Calculator report it.
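To make the occupancy definition concrete, here is a minimal host-side sketch of the arithmetic. It is not taken from the profiler or the occupancy calculator: the `SmLimits` struct, its field names, and the numbers used below are assumptions for illustration, and the model considers only the warp and block limits, ignoring registers and shared memory.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-multiprocessor limits. Real values depend on the
// device and would normally be read from the device properties.
struct SmLimits {
    int max_warps;   // maximum resident warps per multiprocessor
    int max_blocks;  // maximum resident blocks per multiprocessor
    int warp_size;   // threads per warp
};

// Number of warps one block occupies (block size rounded up to whole warps).
int warps_per_block(int block_size, const SmLimits& sm) {
    assert(block_size > 0);
    return (block_size + sm.warp_size - 1) / sm.warp_size;
}

// Occupancy = active warps / maximum warps on one multiprocessor.
// Simplified: only the warp and block limits are modelled here.
double occupancy(int block_size, const SmLimits& sm) {
    const int wpb = warps_per_block(block_size, sm);
    const int blocks = std::min(sm.max_blocks, sm.max_warps / wpb);
    return static_cast<double>(blocks * wpb) / sm.max_warps;
}
```

With the assumed limits `SmLimits{64, 16, 32}`, a block of 32 threads is capped by the 16-block limit and reaches an occupancy of 0.25, while a block of 128 threads fills all 64 warps and reaches 1.0.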
The CUDA best practices guide gives general principles for optimizing the execution configuration so that device resources are used effectively. Jared Hoberock, an NVIDIA researcher and co-creator of the CUDA template library Thrust, put them into practice with adaptive CUDA launch configurations. Their only essential dependency is cuda_runtime_api.h, so they would not introduce any new dependency into Caffe.
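The idea behind an adaptive launch configuration can be sketched as a search over block sizes for the one that keeps the most warps resident. The code below is a simplified host-side model, not Hoberock's implementation: `DeviceLimits`, `KernelUsage`, `active_warps`, `pick_block_size`, and `grid_size_for` are hypothetical names, the limits are illustrative rather than queried from a device, and register and shared-memory allocation granularity is ignored.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-multiprocessor resource limits (illustrative values,
// not queried from a real device).
struct DeviceLimits {
    int warp_size;
    int max_threads_per_block;
    int max_warps_per_sm;
    int max_blocks_per_sm;
    int regs_per_sm;
    int smem_per_sm;  // bytes
};

// Hypothetical per-kernel resource usage.
struct KernelUsage {
    int regs_per_thread;
    int smem_per_block;  // bytes
};

// Warps resident on one multiprocessor for a given block size.
// Simplified model: allocation granularity is not taken into account.
int active_warps(int block_size, const DeviceLimits& d, const KernelUsage& k) {
    assert(block_size > 0 && block_size <= d.max_threads_per_block);
    const int wpb = (block_size + d.warp_size - 1) / d.warp_size;
    int blocks = std::min(d.max_blocks_per_sm, d.max_warps_per_sm / wpb);
    if (k.regs_per_thread > 0) {
        const int regs_per_block = k.regs_per_thread * wpb * d.warp_size;
        blocks = std::min(blocks, d.regs_per_sm / regs_per_block);
    }
    if (k.smem_per_block > 0) {
        blocks = std::min(blocks, d.smem_per_sm / k.smem_per_block);
    }
    return blocks * wpb;
}

// Try every multiple of the warp size and keep the block size with the
// most active warps. Ties go to the smallest such block size.
int pick_block_size(const DeviceLimits& d, const KernelUsage& k) {
    int best = d.warp_size;
    int best_warps = -1;
    for (int bs = d.warp_size; bs <= d.max_threads_per_block; bs += d.warp_size) {
        const int w = active_warps(bs, d, k);
        if (w > best_warps) {
            best_warps = w;
            best = bs;
        }
    }
    return best;
}

// Number of blocks needed to cover n elements with one thread each.
int grid_size_for(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}
```

In this model, a kernel using 32 registers per thread on the assumed limits `DeviceLimits{32, 1024, 64, 16, 65536, 49152}` gets a block size of 128. A kernel that also needs 24576 bytes of shared memory per block is limited to two resident blocks, so the search moves to 1024 threads per block. A real implementation would read both structs from the CUDA runtime instead of hard-coding them.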