[QST] Questions about correctness test and layout #1756

Open
haeunlee99 opened this issue Aug 29, 2024 · 2 comments

haeunlee99 commented Aug 29, 2024

Hello, I have several questions about using CUTLASS. I would very much appreciate any answers.

  1. How can I tell whether I am calling CUTLASS correctly?
    I have matrix A (M x K), matrix B (N x K), and matrix C (M x N), and I call CUTLASS as shown below. I transposed B and passed A, B^T, and C as arguments, which matches the layout in the picture. A, B, and C are all half precision, and the accumulator is fp32.

Image

I am also calling a cuBLAS kernel, but this time I pass A and B directly and let the kernel perform the transpose itself.

Image
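For reference, the layout described above can be checked on the host without either library. The following is a minimal sketch, not the code from the pictures: the function names are hypothetical, `float` stands in for half-precision inputs to keep it self-contained, and all buffers are assumed to be row-major. It relies on the fact that a row-major N x K buffer occupies memory exactly like a column-major K x N matrix with leading dimension K, so B^T can be read from B's buffer without physically transposing it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host reference for C = A * B^T with every buffer stored row-major:
//   A is M x K (leading dimension K)
//   B is N x K (leading dimension K)
//   C is M x N (leading dimension N)
// Names are hypothetical; float replaces half for simplicity.
std::vector<float> gemm_a_bt_reference(const std::vector<float>& A,
                                       const std::vector<float>& B,
                                       std::size_t M, std::size_t N,
                                       std::size_t K) {
  assert(A.size() == M * K && B.size() == N * K);
  std::vector<float> C(M * N, 0.0f);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      float acc = 0.0f;  // fp32 accumulator, as in the question
      for (std::size_t k = 0; k < K; ++k) {
        acc += A[m * K + k] * B[n * K + k];
      }
      C[m * N + n] = acc;
    }
  }
  return C;
}

// Element (k, n) of B^T, read through a column-major K x N view
// (leading dimension K) of the same untouched B buffer.
inline float bt_column_major(const std::vector<float>& B, std::size_t K,
                             std::size_t k, std::size_t n) {
  return B[k + n * K];
}
```

Comparing both the CUTLASS and cuBLAS outputs against a reference like this on a small problem (for example M = N = K = 16) separates layout mistakes, which produce large errors, from rounding differences, which stay small.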

When I compare the results for M = N = K = 4096, I see a maximum error of 0.25, which is above the 0.05 default value used by the CUTLASS profiler. I wonder whether changing the instruction shape, warp size, or thread block size affects the error. I am using the shapes shown in this picture.

Image

Should I expect exactly 0 error if I am calling the same kernel with the same layout settings?
What are epsilon and the nonzero floor in this function used by the CUTLASS profiler?

Image
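To make the epsilon and nonzero-floor question concrete, here is a minimal sketch of a relative-equality check of that kind. It is an assumption based on how such checks are commonly written, not a copy of the function in the picture, and `relatively_equal_sketch` and `half_ulp_near` are hypothetical names. The idea is that epsilon bounds the error relative to the magnitude of the two values, while the nonzero floor switches to an absolute bound when the values are at or near zero. The second helper gives the spacing between adjacent normal fp16 values, which helps put an absolute error such as 0.25 in context.

```cpp
#include <cassert>
#include <cmath>

// Relative-equality sketch with an epsilon and a nonzero floor.
// Assumption: illustrative only, not taken from the CUTLASS sources.
inline bool relatively_equal_sketch(float a, float b, float epsilon,
                                    float nonzero_floor) {
  assert(epsilon > 0.0f && nonzero_floor > 0.0f);
  if (a == b) {
    return true;
  }
  const float diff = std::fabs(a - b);
  // At or near zero a relative measure is not meaningful, so use an
  // absolute bound scaled by the floor instead.
  if (a == 0.0f || b == 0.0f || diff < nonzero_floor) {
    return diff < epsilon * nonzero_floor;
  }
  // Otherwise compare the difference against the combined magnitude.
  return diff < epsilon * (std::fabs(a) + std::fabs(b));
}

// Spacing between adjacent normal fp16 values near x.
// fp16 stores 10 explicit mantissa bits, so the spacing is 2^(e - 10)
// for |x| in [2^e, 2^(e+1)).
inline float half_ulp_near(float x) {
  assert(x != 0.0f);
  int exp = 0;
  std::frexp(std::fabs(x), &exp);  // |x| = f * 2^exp with f in [0.5, 1)
  return std::ldexp(1.0f, exp - 1 - 10);
}
```

Under this sketch, an absolute difference of 0.25 between outputs of magnitude around 300 is a single fp16 rounding step and passes a relative check with epsilon 0.05, so a maximum absolute error alone does not show whether the two calls disagree. Bit-exact agreement between two different kernels is also not guaranteed in general, because a different accumulation order changes floating-point rounding.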

  2. What is the relationship between thread block shape, warp shape, and instruction shape?
    If I use any of the warp shapes listed in the documentation (first picture below), I get the error shown in the second picture.
    As a result, I cannot use any of those warp shapes directly when calling GemmUniversal.

Image

Image
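As background for this question, the three shapes form a tiling hierarchy in CUTLASS 2.x-style kernels, as far as I understand it: the thread block tile is divided into warp tiles, and each warp tile is divided into tensor-core instruction tiles. The sketch below only encodes that divisibility relationship. `TileShape` and the helper functions are hypothetical and not CUTLASS types, the example values are illustrative rather than taken from the pictures, and a real kernel may impose further constraints (shared memory capacity, alignment, supported instruction shapes for the data type and architecture).

```cpp
// Hypothetical stand-in for an M x N x K tile shape; not a CUTLASS type.
struct TileShape {
  int m, n, k;
};

// True when `outer` is an exact multiple of `inner` in every dimension.
constexpr bool divides(TileShape outer, TileShape inner) {
  return inner.m > 0 && inner.n > 0 && inner.k > 0 &&
         outer.m % inner.m == 0 && outer.n % inner.n == 0 &&
         outer.k % inner.k == 0;
}

// Block tile must be tiled by warp tiles, warp tile by instruction tiles.
constexpr bool tiling_is_consistent(TileShape block, TileShape warp,
                                    TileShape inst) {
  return divides(block, warp) && divides(warp, inst);
}

// Number of warps implied by covering the block tile with warp tiles.
constexpr int warps_per_block(TileShape block, TileShape warp) {
  return (block.m / warp.m) * (block.n / warp.n) * (block.k / warp.k);
}

// Illustrative values only (assumed, not from the issue's pictures).
constexpr TileShape kBlock{128, 128, 32};
constexpr TileShape kWarp{64, 64, 32};
constexpr TileShape kInst{16, 8, 16};

static_assert(tiling_is_consistent(kBlock, kWarp, kInst),
              "block tile must be a multiple of the warp tile, and the "
              "warp tile a multiple of the instruction tile");
static_assert(warps_per_block(kBlock, kWarp) == 4,
              "2 x 2 x 1 warps, i.e. 128 threads per block");
```

A warp shape copied from a table can therefore still fail to compile if it does not divide the chosen thread block shape, or if the instruction shape does not divide it, which is one possible cause of the error described above.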

Thanks in advance!

@haeunlee99 haeunlee99 changed the title [QST] Questions about correctness, layout and cutlass usage on H100 [QST] Questions about correctness test and layout Aug 29, 2024
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
