[FEA] Can I use copy to store register value into shared memory? #1233

Closed
ziyuhuang123 opened this issue Dec 5, 2023 · 11 comments
Labels: CuTe (CuTe Functionality), feature request (New feature or request), inactive-30d

Comments

@ziyuhuang123

I have tCrC and want to store its values into shared memory. Can the copy function do that? Thanks!

@ziyuhuang123
Author

I mean, using CuTe.

@ccecka

ccecka commented Dec 5, 2023

If you've created your tCrC using some partitioner, then that same partitioner should be applied to an smem tensor. For example, we usually see this pattern:

Tensor gC = ...                                          // (BLK_M,BLK_N)
Tensor sC = make_tensor(make_smem_ptr(ptr), shape(gC));  // (BLK_M,BLK_N)

auto thr_mma = tiled_mma.get_slice(thread_idx);

Tensor tCgC = thr_mma.partition_C(gC);                   // (MMA,MMA_M,MMA_N)
Tensor tCsC = thr_mma.partition_C(sC);                   // (MMA,MMA_M,MMA_N)
Tensor tCrC = thr_mma.make_fragment_C(tCgC);             // (MMA,MMA_M,MMA_N)

...

copy(tCrC, tCgC);
// or
copy(tCrC, tCsC);
copy(tCsC, tCgC);
// or
copy(tCrC, tCsC);
__syncthreads();  // every thread must finish writing sC before thread 0 reads the tile
if (threadIdx.x == 0) {
  copy(sC, gC);   // Copy the whole tile
}
// or re-partition sC and gC, etc

This should, of course, also work with any shared memory layout for sC, so long as its shape is still (compatible with) BLK_M x BLK_N.
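To make the "same partitioner on both sides" idea concrete, here is a minimal host-side sketch in plain C++. It is a model, not CuTe code: `partition_c` and `round_trip_tile` are hypothetical names, and the ownership pattern (32 threads, 4 values each, over a 16x8 tile) is borrowed from the accumulator layout of a 16x8 tensor-core MMA purely for illustration. Element i of a thread's register fragment corresponds to the i-th coordinate that the partitioner assigns to that thread, so writing the fragment back through the same partitioner reconstructs the tile.

```cpp
#include <cassert>
#include <vector>

constexpr int BLK_M = 16, BLK_N = 8;   // tile shape, as in (BLK_M,BLK_N)
constexpr int THREADS = 32, FRAG = 4;  // 32 threads x 4 values = 128 elements

struct Coord { int m, n; };

// Hypothetical stand-in for thr_mma.partition_C: maps (thread, fragment index)
// to a tile coordinate. The pattern is an assumption chosen for illustration.
Coord partition_c(int thread, int i) {
    assert(thread >= 0 && thread < THREADS && i >= 0 && i < FRAG);
    return {thread / 4 + 8 * (i / 2), 2 * (thread % 4) + (i % 2)};
}

// Models copy(tCrC, tCsC) followed by copy(sC, gC) for the whole tile.
std::vector<int> round_trip_tile() {
    std::vector<int> sC(BLK_M * BLK_N, -1);  // "shared memory" tile, row-major
    for (int t = 0; t < THREADS; ++t) {
        int rC[FRAG];  // this thread's "register fragment"
        for (int i = 0; i < FRAG; ++i) {
            // Pretend MMA result: element (m,n) holds the value m*BLK_N + n.
            Coord c = partition_c(t, i);
            rC[i] = c.m * BLK_N + c.n;
        }
        for (int i = 0; i < FRAG; ++i) {
            // copy(tCrC, tCsC): the SAME partitioner picks the smem destination.
            Coord c = partition_c(t, i);
            sC[c.m * BLK_N + c.n] = rC[i];
        }
    }
    // On the device, a __syncthreads() is needed here before one thread
    // reads the parts of sC that other threads wrote.
    std::vector<int> gC(sC);  // copy(sC, gC): whole-tile copy
    return gC;
}
```

Calling `round_trip_tile()` yields a tile in which the element at row-major index `idx` holds `idx`, i.e. every fragment value landed at the coordinate it was computed for. The row-major indexing of `sC` is one arbitrary layout choice; any layout covering the same 16x8 shape would work equally well in this model.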

@ziyuhuang123
Author

ziyuhuang123 commented Dec 5, 2023

Thank you very much for your reply!!!!

I noticed you are using "auto thr_mma = tiled_mma.get_slice(thread_idx);"
How does that differ from "auto tAgA = local_partition(gA, tA, threadIdx.x); // (THR_M,THR_K,k)"?

@ccecka

ccecka commented Dec 5, 2023

The first is constructing an MMA partitioner from a TiledMMA (which is usually used to create tCxY partitioned tensors) and the second is partitioning with respect to the thread layout tA.

@ziyuhuang123
Author

Hmm, so the output is the same, right? I mean, the outputs are all of "tensor" type, so no matter how I obtain a tensor, I can call "copy(tCrC, tCsC);" and CuTe will find a way to do the copy?

Thank you!!!

@ccecka

ccecka commented Dec 5, 2023

No, they are not related as they apply distinct partitioning patterns.

I suggest you review the existing documentation and wait for our updated documentation+examples coming soon.
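As an illustration of why tensors produced by different partitioners are not interchangeable in a copy, the following host-side sketch in plain C++ compares two ownership patterns over the same 16x8 tile. It is a model, not CuTe code, and all names are hypothetical: `mma_style` mimics the accumulator layout of a 16x8 tensor-core MMA, while `thread_layout_style` roughly mimics what a `local_partition` over an assumed 8x4 column-major thread layout would hand to each thread (elements strided by the thread layout's shape). Both give every thread 4 elements, so the fragment shapes match, yet element i means a different tile coordinate in each.

```cpp
#include <cassert>
#include <vector>

constexpr int BLK_M = 16, BLK_N = 8;   // tile shape
constexpr int THREADS = 32, FRAG = 4;  // 32 threads x 4 values = 128 elements

struct Coord { int m, n; };
using Partitioner = Coord (*)(int thread, int i);

// Assumed MMA-style ownership: two adjacent columns in rows r and r+8.
Coord mma_style(int thread, int i) {
    assert(thread >= 0 && thread < THREADS && i >= 0 && i < FRAG);
    return {thread / 4 + 8 * (i / 2), 2 * (thread % 4) + (i % 2)};
}

// Assumed thread-layout-style ownership: threads arranged 8x4 (column-major),
// each thread strides over the tile by the thread layout's shape.
Coord thread_layout_style(int thread, int i) {
    assert(thread >= 0 && thread < THREADS && i >= 0 && i < FRAG);
    return {thread % 8 + 8 * (i % 2), thread / 8 + 4 * (i / 2)};
}

// Fill each fragment according to `load`, write it to the tile according to
// `store`, and count tile elements that end up holding the wrong value.
int count_misplaced(Partitioner load, Partitioner store) {
    std::vector<int> tile(BLK_M * BLK_N, -1);
    for (int t = 0; t < THREADS; ++t) {
        for (int i = 0; i < FRAG; ++i) {
            Coord src = load(t, i);   // where fragment element i came from
            Coord dst = store(t, i);  // where the store thinks it belongs
            tile[dst.m * BLK_N + dst.n] = src.m * BLK_N + src.n;
        }
    }
    int misplaced = 0;
    for (int idx = 0; idx < BLK_M * BLK_N; ++idx) {
        if (tile[idx] != idx) ++misplaced;
    }
    return misplaced;
}
```

In this model `count_misplaced(mma_style, mma_style)` and `count_misplaced(thread_layout_style, thread_layout_style)` both return 0, whereas `count_misplaced(mma_style, thread_layout_style)` returns a positive count: the copy itself succeeds element by element, but the values land at the wrong coordinates.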

@ziyuhuang123
Author

Wow! That's cool! Actually, my supervisor is pushing me to write the relevant code, haha, so I cannot wait too long. He is a nice guy, though, and I am also interested in CuTe myself.

Eagerly waiting for your update!

@ziyuhuang123
Author

Currently I only see one code example.... Do you know of any more code written with CuTe?

@ziyuhuang123
Author

Oh, I noticed most GEMM code is written using the previous CUTLASS.... Maybe I should use an older version...? Because its documentation is more....

@mnicely added the CuTe (CuTe Functionality) label and removed the ? - Needs Triage label on Dec 6, 2023

github-actions bot commented Jan 5, 2024

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@mnicely
Collaborator

mnicely commented Feb 22, 2024

Closing due to inactivity.

@mnicely mnicely closed this as completed Feb 22, 2024