Hi,
I am trying to use a CuTe tensor to store the sums of different row blocks of a tensor.
Different threads may write to the same location when performing the accumulated sum.
Given CuTe tensors A and B (both of type cute::half_t), I want to accumulate the sum into A.
As the following code shows, I want to sum into A — how can I avoid the race condition?
Is there an atomicAdd supported for cute::half_t? I cannot call atomicAdd directly, since it does not accept half_t.
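One common workaround, sketched below under the assumption of compute capability 7.0 (sm_70) or newer: `cute::half_t` is layout-compatible with CUDA's `__half`, so the pointer can be reinterpreted and passed to the native `atomicAdd(__half*, __half)` overload. The helper name `atomic_add_half` and the tensor names in the usage snippet are hypothetical, not part of CuTe's API.

```cuda
#include <cuda_fp16.h>
#include <cute/tensor.hpp>  // cute::half_t

// Hypothetical helper: atomically add one half value into *dst.
// cute::half_t wraps a 16-bit half and is layout-compatible with __half,
// so we reinterpret the address and use the hardware atomicAdd overload
// for __half, available on sm_70 and newer.
__device__ inline void atomic_add_half(cute::half_t* dst, cute::half_t val) {
  atomicAdd(reinterpret_cast<__half*>(dst),
            *reinterpret_cast<__half*>(&val));
}

// Example usage inside a kernel: accumulate a thread's partition of B
// into the shared/global tensor partition of A (names are illustrative).
//
//   for (int i = 0; i < cute::size(tAgA); ++i) {
//     atomic_add_half(&tAgA(i), tBrB(i));
//   }
```

Note that half-precision atomics serialize colliding writes and round on every add, so accumulating many values this way can lose precision; reducing within a thread (or warp) first and issuing fewer atomics is usually preferable.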
Thank you so much!