[BUG] cutlass conv2d kernel failed when N*H*W*C*2 larger than 0x100000000 [4G] #1262

Closed
leiwen83 opened this issue Dec 11, 2023 · 4 comments
Labels
feature request New feature or request

Comments

@leiwen83

Describe the bug

I tried to tune a conv2d kernel whose input feature map is larger than 4G, and it fails.
It seems some changes are needed to support this?

Steps/Code to reproduce bug

./tools/profiler/cutlass_profiler --n=1 --h=5120 --w=8192 --c=32 --k=8 --r=3 --s=3 --pad_h=1 --pad_w=1 --stride_h=1 --stride_w=1 --dilation_h=1 --dilation_w=1
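
For reference, the size of the input tensor implied by these flags can be worked out with a few lines of C++. This is only a sketch of the arithmetic, not CUTLASS code: `ActivationShape`, `element_count`, and `byte_size` are hypothetical names, and the 2-byte element width is an assumption taken from the "* 2" in the issue title.

```cpp
#include <cstdint>

// Extents of the NHWC activation tensor, copied from the profiler flags above.
struct ActivationShape {
  std::int64_t n, h, w, c;
};

// Largest value a signed 32-bit index can hold (2^31 - 1).
constexpr std::int64_t kInt32Max = 2147483647LL;
// The 4G threshold named in the issue title (2^32).
constexpr std::int64_t kFourGiB = 0x100000000LL;

// Element count, accumulated in 64 bits so the product itself cannot wrap.
constexpr std::int64_t element_count(const ActivationShape& s) {
  return s.n * s.h * s.w * s.c;
}

// Size in bytes. bytes_per_element is an assumption supplied by the caller
// (2 matches the "* 2" in the issue title, e.g. a 16-bit element type).
constexpr std::int64_t byte_size(const ActivationShape& s,
                                 std::int64_t bytes_per_element) {
  return element_count(s) * bytes_per_element;
}
```

For n=1, h=5120, w=8192, c=32, `element_count` gives 1,342,177,280 and `byte_size(shape, 2)` gives 2,684,354,560, which is above `kInt32Max`. With 4-byte elements the size is 5,368,709,120, which is above `kFourGiB`. Which of these quantities a particular kernel keeps in a 32-bit variable is not stated in this thread.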
@leiwen83 leiwen83 added ? - Needs Triage bug Something isn't working labels Dec 11, 2023
@hwu36
Collaborator

hwu36 commented Dec 11, 2023

We don't support that. To support it, you need to change all the strides to 64 bits.
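
To make the limitation concrete: a stride or offset stored in a signed 32-bit integer cannot represent anything past 2,147,483,647. The sketch below shows a range-checked narrowing from 64 to 32 bits. `try_narrow_to_int32` is a hypothetical helper written for illustration; it is not part of CUTLASS.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not a CUTLASS function): copy a 64-bit offset into a
// 32-bit index only when the value is representable, and report whether it was.
inline bool try_narrow_to_int32(std::int64_t value, std::int32_t& out) {
  if (value < std::numeric_limits<std::int32_t>::min() ||
      value > std::numeric_limits<std::int32_t>::max()) {
    return false;  // would not fit; out is left untouched
  }
  out = static_cast<std::int32_t>(value);
  return true;
}
```

Called with 2147483647 it succeeds; called with 2147483648 or 0x100000000 it returns false, which is the situation a 64-bit stride type avoids.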

@mnicely mnicely added feature request New feature or request and removed bug Something isn't working ? - Needs Triage labels Dec 11, 2023
@leiwen83
Author

We don't support that. To support it, you need to change all the strides to 64 bits.

I see...
To change all strides to 64 bits, which files/structures would you suggest modifying?
Currently Stride uses Index, which is int32_t in include/cutlass/tensor_ref.h, so if I make Index int64_t, would that fix this issue?

@hwu36
Collaborator

hwu36 commented Dec 12, 2023

You need to start from https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/layout/tensor.h#L67. Change that line to using Stride = Coord<kStrideRank, LongIndex>; and delete the one at https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/layout/tensor.h#L116. Then fix any warnings/errors complaining about int64->int32 conversion.
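
The idea of a layout whose stride type is built on LongIndex can be sketched with a small self-contained toy. Everything below (`ToyCoord`, `ToyTensorNHWC`, `make_packed_nhwc`) is a hypothetical stand-in written for illustration, not the CUTLASS source; only the names `Index`, `LongIndex`, `Stride`, and `kStrideRank` are borrowed from the thread.

```cpp
#include <array>
#include <cstdint>

// Toy stand-ins for illustration only; these are NOT the CUTLASS definitions.
using Index = std::int32_t;      // type of a single coordinate
using LongIndex = std::int64_t;  // type of a linear offset

// Minimal fixed-rank coordinate whose element type is a template parameter.
template <int Rank, typename T = Index>
struct ToyCoord {
  std::array<T, Rank> values;
  T operator[](int i) const { return values[i]; }
};

// Toy NHWC layout that stores its strides as LongIndex instead of Index.
struct ToyTensorNHWC {
  static constexpr int kStrideRank = 3;
  using Stride = ToyCoord<kStrideRank, LongIndex>;

  Stride stride;  // (stride_w, stride_h, stride_n), in elements

  // Linear element offset of (n, h, w, c); every product is done in 64 bits.
  LongIndex offset(Index n, Index h, Index w, Index c) const {
    return LongIndex(c) + stride[0] * w + stride[1] * h + stride[2] * n;
  }
};

// Strides of a densely packed NHWC tensor with extents h, w, c.
inline ToyTensorNHWC make_packed_nhwc(Index h, Index w, Index c) {
  return ToyTensorNHWC{ToyTensorNHWC::Stride{
      {LongIndex(c), LongIndex(w) * c, LongIndex(h) * w * c}}};
}
```

With `make_packed_nhwc(5120, 8192, 32)` the strides are (32, 262144, 1342177280). The last element of the first image sits at offset 1,342,177,279, and the same position in a hypothetical second image (n = 1) sits at 2,684,354,559, which only a 64-bit offset can hold.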

@leiwen83
Author

Got it, thx
