[QST] Does GEMM support vectors of alpha and beta? #1398

Closed
hychiang-git opened this issue Mar 13, 2024 · 4 comments
Labels
question

Comments

@hychiang-git commented Mar 13, 2024

What is your question?
I checked the related issues #1155 and #1000, but I am still not sure if GEMM supports vectors of alpha and beta.

If I have inputs with shapes x: (1, M, K), w: (K, N), b: (N), alpha: (M), beta: (N), where alpha and beta are shared across the batch (batch size = 1 for now) for simplicity, and I want to compute alpha*matmul(x, w) + beta*b, which API or example comes closest to what I want? Thanks!
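
To make the broadcasting concrete, here is a plain C++ reference sketch of the computation I have in mind (no CUTLASS, just loops; alpha is indexed by the row m and beta/b by the column n):

#include <vector>

// Reference sketch for x: (M, K), w: (K, N), b: (N), alpha: (M), beta: (N),
// batch size 1:  D[m][n] = alpha[m] * sum_k x[m][k] * w[k][n] + beta[n] * b[n]
void reference_gemm(int M, int N, int K,
                    std::vector<float> const &x,      // M x K, row-major
                    std::vector<float> const &w,      // K x N, row-major
                    std::vector<float> const &b,      // N
                    std::vector<float> const &alpha,  // M (per-row scale)
                    std::vector<float> const &beta,   // N (per-column scale)
                    std::vector<float> &d) {          // M x N, row-major
  for (int m = 0; m < M; ++m) {
    for (int n = 0; n < N; ++n) {
      float acc = 0.f;
      for (int k = 0; k < K; ++k) {
        acc += x[m * K + k] * w[k * N + n];
      }
      d[m * N + n] = alpha[m] * acc + beta[n] * b[n];
    }
  }
}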

@hychiang-git (Author)

Hello, I think it is supported, but I am still not sure how to use it.

struct ScaleType {
  enum Kind {
    Default,                      // D = scalar_alpha x Acc + scalar_beta x C
    NoBetaScaling,                // D = scalar_alpha x Acc + C
    OnlyAlphaScaling,             // D = scalar_alpha x Acc
    PerChannelScaling,            // D = vector_alpha x Acc + vector_beta x C
    OnlyAlphaPerChannelScaling,   // D = vector_alpha x Acc
    Nothing                       // D = Acc
  };
};

if (Scale == ScaleType::OnlyAlphaPerChannelScaling)
  intermediate = mul_add_accumulator(scale, converted_accumulator, bias);   // D = scale * Accum + bias
else
  intermediate = mul_add_accumulator(alpha_, converted_accumulator, bias);  // D = alpha * Accum + bias

I found a similar issue here: #568
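
If it helps, my rough guess at how the scale kind would be selected is below. This is just an untested sketch based on the template parameters in include/cutlass/epilogue/thread/linear_combination.h; the type aliases ElementOutput/ElementAccumulator/ElementCompute are placeholders I chose, and the defaults may differ between CUTLASS versions.

#include "cutlass/numeric_types.h"
#include "cutlass/epilogue/thread/linear_combination.h"

using ElementOutput      = cutlass::half_t;
using ElementAccumulator = float;
using ElementCompute     = float;

// Select per-channel (vector) alpha/beta via the ScaleType template argument.
using EpilogueOp = cutlass::epilogue::thread::LinearCombination<
    ElementOutput,                                            // element type of C/D
    128 / cutlass::sizeof_bits<ElementOutput>::value,         // elements per vectorized access
    ElementAccumulator,                                       // accumulator element type
    ElementCompute,                                           // compute (scaling) element type
    cutlass::epilogue::thread::ScaleType::PerChannelScaling   // vector alpha and vector beta
>;

If I understand it correctly, though, the per-channel vectors run along the N (output-channel) dimension, so a per-row alpha of length M may not fit this path directly.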

@hwu36 (Collaborator) commented Mar 15, 2024

@apuaaChen, could you comment on how to support it via EVT?

@apuaaChen

Hi!

The pattern can be constructed through EVT. You can try to follow example 47 (streamk_broadcast) to construct the epilogue. Your pattern should be something like:

using OutputTileThreadMap = cutlass::epilogue::threadblock::OutputTileThreadLayout<
  ThreadblockShape, 
  WarpShape, 
  ElementC, 
  AlignmentC, 
  EVTEpilogueStages
>;
// Accumulator
using Accum = cutlass::epilogue::threadblock::VisitorAccFetch;
// alpha
using Alpha = cutlass::epilogue::threadblock::VisitorColBroadcast<
    OutputTileThreadMap, ElementC,
    cute::Stride<_1,_0,int32_t>
>;

// mul
using Mul0 = cutlass::epilogue::threadblock::VisitorCompute<
    cutlass::multiplies, ElementCompute, ElementCompute,
    cutlass::FloatRoundStyle::round_to_nearest
>;

// alpha * accumulator
using EVTMul0 = cutlass::epilogue::threadblock::Sm80EVT<
    Mul0, Alpha, Accum>;

// beta
using Beta = cutlass::epilogue::threadblock::VisitorRowBroadcast<
    OutputTileThreadMap, ElementC,
    cute::Stride<_0, _1, int32_t>  // StrideMNL
>;

// b
using B = cutlass::epilogue::threadblock::VisitorRowBroadcast<
    OutputTileThreadMap, ElementC,
    cute::Stride<_0, _1, int32_t>  // StrideMNL
>;

// mul
using Mul1 = cutlass::epilogue::threadblock::VisitorCompute<
    cutlass::multiplies, ElementCompute, ElementCompute,
    cutlass::FloatRoundStyle::round_to_nearest
>;

// beta * b
using EVTMul1 = cutlass::epilogue::threadblock::Sm80EVT<
    Mul1, Beta, B>;
    
// add
using Add = cutlass::epilogue::threadblock::VisitorCompute<
    cutlass::plus, ElementOutput, ElementCompute,
    cutlass::FloatRoundStyle::round_to_nearest
>;

// alpha * accumulator + beta * b
using EVTAdd = cutlass::epilogue::threadblock::Sm80EVT<
    Add, EVTMul0, EVTMul1>;
    
using D = cutlass::epilogue::threadblock::VisitorAuxStore<
    OutputTileThreadMap, ElementOutput, cutlass::FloatRoundStyle::round_to_nearest,
    cute::Stride<int64_t, _1, int64_t> // StrideMNL
>;

using EVTD = cutlass::epilogue::threadblock::Sm80EVT<
    D,
    EVTAdd>;
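
At run time the callback arguments mirror this tree: each Sm80EVT node takes its children's arguments in declaration order, followed by the node's own arguments. The nesting below is only a rough sketch; the per-node fields (pointer, null default, stride) are assumptions, so please check example 47 for the exact aggregate layout. alpha_ptr, beta_ptr, b_ptr and d_ptr are placeholder device pointers, and ldd is the leading dimension of D.

typename EVTD::Arguments callback_args{
  {                                                         // EVTAdd
    {                                                       // EVTMul0 = alpha * accumulator
      {alpha_ptr, ElementC(0), {_1{}, _0{}, int32_t(0)}},   // Alpha: column broadcast, length M
      {},                                                   // Accum
      {}                                                    // Mul0
    },
    {                                                       // EVTMul1 = beta * b
      {beta_ptr, ElementC(0), {_0{}, _1{}, int32_t(0)}},    // Beta: row broadcast, length N
      {b_ptr,    ElementC(0), {_0{}, _1{}, int32_t(0)}},    // B: row broadcast, length N
      {}                                                    // Mul1
    },
    {}                                                      // Add
  },
  {d_ptr, {int64_t(ldd), _1{}, int64_t(0)}}                 // D: aux store (row stride, unit column stride, batch stride)
};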

@hychiang-git (Author)

Thanks!
