[QST] Do batched gemm/gemv support multiple alpha and beta? #1155

Closed
SubjectNoi opened this issue Oct 24, 2023 · 3 comments
Labels: question

Comments

@SubjectNoi

Given a batched GEMM/GEMV with, say, a batch count of 32, does CUTLASS support assigning 32 different alpha and beta values, one {alpha, beta} pair per batch entry? So far I have only seen usage where a single {alpha, beta} is shared across all 32 batches.

@hwu36
Collaborator

hwu36 commented Oct 24, 2023

The GEMV code is here: https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/gemm/kernel/gemv.h. It is very simple and easy to hack. (@NVJiangShao)

You can use the newly released EVT (Epilogue Visitor Tree) to support vector alpha/beta. See https://github.com/NVIDIA/cutlass/blob/main/examples/47_ampere_gemm_universal_streamk/ampere_gemm_universal_streamk_broadcast.cu. You need #1120 to run it correctly (@apuaaChen). The alternative is the older approach, as in https://github.com/NVIDIA/cutlass/blob/main/test/unit/gemm/device/gemm_f16t_f16n_f16t_tensor_op_f16_broadcast_sm80.cu
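
For readers landing on this thread, the semantics being asked about (one alpha/beta pair per batch entry) can be sketched as a plain host-side C++ reference. This is not CUTLASS code and does not use the EVT or broadcast epilogue APIs linked above. The function name `batched_gemm_ref`, the row-major layout, and the packed per-batch storage are all assumptions made for illustration. It may be useful mainly as a ground truth when checking a modified kernel.

```cpp
#include <cstddef>
#include <vector>

// Host-side reference only (hypothetical helper, not part of CUTLASS):
//   D[b] = alpha[b] * (A[b] * B[b]) + beta[b] * C[b]   for each batch b.
// Assumption: all matrices are row-major and stored contiguously per batch.
std::vector<float> batched_gemm_ref(int batch, int M, int N, int K,
                                    const std::vector<float>& A,       // batch x M x K
                                    const std::vector<float>& B,       // batch x K x N
                                    const std::vector<float>& C,       // batch x M x N
                                    const std::vector<float>& alpha,   // one scalar per batch
                                    const std::vector<float>& beta) {  // one scalar per batch
  std::vector<float> D(static_cast<std::size_t>(batch) * M * N);
  for (int b = 0; b < batch; ++b) {
    // Pointers to the start of this batch entry's matrices.
    const float* Ab = A.data() + static_cast<std::size_t>(b) * M * K;
    const float* Bb = B.data() + static_cast<std::size_t>(b) * K * N;
    const float* Cb = C.data() + static_cast<std::size_t>(b) * M * N;
    float* Db = D.data() + static_cast<std::size_t>(b) * M * N;
    for (int m = 0; m < M; ++m) {
      for (int n = 0; n < N; ++n) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k) {
          acc += Ab[m * K + k] * Bb[k * N + n];
        }
        // The only difference from shared scaling: alpha and beta are indexed by b.
        Db[m * N + n] = alpha[b] * acc + beta[b] * Cb[m * N + n];
      }
    }
  }
  return D;
}
```

Calling it with `N = 1` gives the batched GEMV case. Passing vectors in which every entry is the same value reproduces the usual shared {alpha, beta} behavior.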

@mnicely changed the title from "Do batched gemm/gemv support multiple alpha and beta? [QST]" to "[QST] Do batched gemm/gemv support multiple alpha and beta?" on Nov 7, 2023
@mnicely
Collaborator

mnicely commented Dec 5, 2023

@SubjectNoi, is your question resolved?

@mnicely
Collaborator

mnicely commented Jan 2, 2024

Closing due to inactivity. Feel free to reopen if needed.
