Commit

Respect kernel launch shared memory usage
mikex86 committed Sep 4, 2024
1 parent fdd7d53 commit 5011090
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion driverapi/src/cmdqueue.cpp
@@ -568,7 +568,7 @@ NvCommandQueue::launchFunction(LibreCUFunction function,
// check launch dimensions
NvU32 max_threads = ((65536 / roundUp(maxOf(1u, function->num_registers) * 32, 256u)) / 4) * 4 * 32;

- uint32_t shmem_usage = function->shared_mem;
+ uint32_t shmem_usage = maxOf(function->shared_mem, sharedMemBytes);

NvU32 blockProd = blockDimX * blockDimY * blockDimZ;
if ((shmem_usage > sharedMemBytes) && (blockProd > 1024 || max_threads < blockProd)) {
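The arithmetic in this hunk can be tried in isolation. The following is a minimal sketch, not the project's code: `roundUpTo`, `maxThreadsForRegisters`, and `effectiveSharedMem` are hypothetical names, and `roundUpTo` assumes that the project's `roundUp` rounds up to the next multiple of its second argument. Reading 65536 as a per-SM register budget and 32 as the warp size is likewise an assumption, not something the commit states.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the project's roundUp helper: assumed to round
// `value` up to the next multiple of `multiple`.
constexpr uint32_t roundUpTo(uint32_t value, uint32_t multiple) {
    return ((value + multiple - 1) / multiple) * multiple;
}

// Mirrors the max_threads expression from the diff. The interpretation of the
// constants (65536 registers, 32 threads per warp) is an assumption.
constexpr uint32_t maxThreadsForRegisters(uint32_t numRegisters) {
    // Registers consumed by one group of 32 threads, rounded up to 256.
    uint32_t regsPerWarp = roundUpTo(std::max(1u, numRegisters) * 32u, 256u);
    // Number of such groups that fit, rounded down to a multiple of 4.
    uint32_t warps = ((65536u / regsPerWarp) / 4u) * 4u;
    return warps * 32u;
}

// Mirrors the changed line: take the larger of the function's recorded
// shared memory size and the size requested at launch.
constexpr uint32_t effectiveSharedMem(uint32_t functionBytes, uint32_t launchBytes) {
    return std::max(functionBytes, launchBytes);
}

// Compile-time checks of the arithmetic above.
static_assert(maxThreadsForRegisters(32) == 2048, "32 regs: 64 groups of 32 threads");
static_assert(maxThreadsForRegisters(255) == 256, "255 regs: 8 groups of 32 threads");
static_assert(effectiveSharedMem(1024, 4096) == 4096, "launch request is respected");
```

With the old line, a launch requesting 4096 bytes for a function recording 1024 bytes would have used 1024; taking the maximum makes the launch-time request count whenever it is the larger value.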
