[QST] Volta shared memory layout example #1158

ssiu · 2023-10-25T11:23:46Z

Hi, I am currently studying these slides: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9593-cutensor-high-performance-tensor-operations-in-cuda-v2.pdf

I was wondering if there's any sample code for loading data from global memory to shared memory as shown on page 23?

Can I also ask why there is no `copy_sm70.hpp` in `cutlass/include/cute/arch` and no `copy_traits_sm70.hpp` in `cutlass/include/cute/atom`?

Thanks!


thakkarV · 2023-10-25T12:05:52Z

No special copy atoms are needed for Volta, because all copies on that architecture are per-thread copies that can be expressed in plain CUDA C++. The Default/Universal copy is sufficient for pre-Ampere architectures.
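To make "thread copies" concrete, here is a minimal host-side C++ sketch of the access pattern. It is an illustration under stated assumptions, not CuTe code and not the layout from the slides: `thread_copy` and `simulate_block_copy` are hypothetical names, the tile is a plain linear array, and the "threads" are run one after another on the CPU. In a real kernel, `tid` would be `threadIdx.x`, `num_threads` would be `blockDim.x`, and `smem` would be a `__shared__` array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: the work done by ONE thread of a block.
// Each element is moved with an ordinary load and store; no
// architecture-specific copy instruction is involved.
template <class T>
void thread_copy(std::size_t tid, std::size_t num_threads,
                 const T* gmem, T* smem, std::size_t count) {
  // Strided loop: thread t handles elements t, t + num_threads, ...
  // so consecutive threads touch consecutive addresses.
  for (std::size_t i = tid; i < count; i += num_threads) {
    smem[i] = gmem[i];
  }
}

// Host-side stand-in for one thread block: runs every "thread" in turn.
// On a GPU the threads run concurrently, and a __syncthreads() would be
// needed before any thread reads elements written by another thread.
template <class T>
std::vector<T> simulate_block_copy(const std::vector<T>& gmem,
                                   std::size_t num_threads) {
  assert(num_threads > 0);
  std::vector<T> smem(gmem.size());
  for (std::size_t tid = 0; tid < num_threads; ++tid) {
    thread_copy(tid, num_threads, gmem.data(), smem.data(), gmem.size());
  }
  return smem;
}
```

Calling `simulate_block_copy(data, 32)` returns a copy of `data` in which each element was written by the "thread" whose index equals the element's position modulo 32. The same loop body is what a plain CUDA C++ kernel would execute per thread, which is why no dedicated SM70 copy header is required.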

ssiu added the "? - Needs Triage" and "question" labels on Oct 25, 2023

ssiu closed this as completed Oct 26, 2023
