Memory Access out of bounds in mra/cuda_kernel.cu::index_max_cuda_kernel() #35507

dingfen · 2025-01-04T07:02:20Z

System Info

OS: Linux ubuntu 22.04 LTS
Device: A100-80GB
docker: nvidia/pytorch:24.04-py3
transformers: latest, 4.47.0

Who can help?

No response

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

pip install the latest transformers
prepare the unit-test environment with pip install -e .[testing]
pytest tests/models/mra/test_modeling_mra.py

Analysis

The CUDA kernel index_max_cuda_kernel() appears to access memory out of bounds:
https://github.com/huggingface/transformers/blob/main/src/transformers/kernels/mra/cuda_kernel.cu#L6C1-L58C2

Note that max_buffer in this kernel is declared extern __shared__ float, so it lives in shared memory whose size is set at launch.
According to https://github.com/huggingface/transformers/blob/main/src/transformers/kernels/mra/cuda_launch.cu#L24-L35, the kernel is launched with

grid size: batch_size
block size: 256
shared memory size: A_num_block * 32 * sizeof(float)

When A_num_block < 4, the for loop below may index past A_num_block * 32, because num_thread is 256 and threadIdx.x ranges over [0, 255].

for (int idx_start = 0; idx_start < 32 * num_block; idx_start = idx_start + num_thread) {
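
To make the index arithmetic concrete, here is a small host-side C++ sketch that emulates one thread block walking this loop and counts how many indices land at or beyond the loop bound 32 * num_block. It assumes each thread computes idx = idx_start + threadIdx.x; the function name count_out_of_range is hypothetical and is not part of transformers.

```cpp
#include <cassert>

// Hypothetical host-side emulation of the loop above; not part of transformers.
// Assumption: each thread uses idx = idx_start + thread_idx.
int count_out_of_range(int num_block, int num_thread = 256) {
  const int limit = 32 * num_block;  // number of valid elements
  int out_of_range = 0;
  for (int idx_start = 0; idx_start < limit; idx_start += num_thread) {
    // On the GPU these iterations run in parallel, one per thread.
    for (int thread_idx = 0; thread_idx < num_thread; ++thread_idx) {
      int idx = idx_start + thread_idx;
      if (idx >= limit) ++out_of_range;  // this thread would index past the end
    }
  }
  return out_of_range;
}
```

Under that assumption, count_out_of_range(3) is 160 and count_out_of_range(8) is 0: the overshooting tail only disappears when 32 * num_block is a multiple of the block size.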

Therefore, whenever the threads of a block access max_buffer, it is safer to guard the access with an if statement so that it cannot go out of bounds.

So we suggest adding if statements in two places:
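
As a rough illustration of the kind of guard meant here (not the actual patch, and not necessarily the same two places), the following host-side C++ sketch emulates one thread block doing a scatter-max with two bounds checks. The function guarded_scatter_max and its parameters are hypothetical; a std::vector stands in for the shared-memory max_buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical emulation of one thread block; not the transformers kernel.
// vals[i] is scattered into slot bins[i]; bins must be as long as vals.
std::vector<int> guarded_scatter_max(const std::vector<int>& vals,
                                     const std::vector<int>& bins,
                                     int num_slots, int num_thread = 256) {
  const int n = static_cast<int>(vals.size());
  std::vector<int> max_buffer(num_slots, INT_MIN);  // stands in for shared memory
  for (int idx_start = 0; idx_start < n; idx_start += num_thread) {
    for (int thread_idx = 0; thread_idx < num_thread; ++thread_idx) {
      int idx = idx_start + thread_idx;
      if (idx >= n) continue;                       // guard 1: tail threads do nothing
      int slot = bins[idx];
      if (slot < 0 || slot >= num_slots) continue;  // guard 2: stay inside the buffer
      // On the GPU this update would be an atomicMax on shared memory.
      max_buffer[slot] = std::max(max_buffer[slot], vals[idx]);
    }
  }
  return max_buffer;
}
```

In a real kernel the guard should wrap the loop body in an if rather than return early, so that every thread in the block still reaches __syncthreads().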

Expected behavior

All unit tests should pass.

dingfen added the bug label Jan 4, 2025