Frontend graph-construction API #69
1bab2fb to 9b7fc3e
df05e0c to 3763a43
Quite a few files are missing a trailing newline.
key_states = self.transpose(key_states, [0, 1, 3, 2])
if self.num_kv_groups > 1:
    attn_weights = self.matmul_group_k(query_states, key_states)
In practice, using the group matmul logic for every case is enough; reshape performs no computation, so it adds no overhead.
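To illustrate the point, here is a minimal NumPy sketch, not the project's `matmul_group_k`. It assumes queries shaped `[batch, num_heads, q_len, head_dim]` and pre-transposed keys shaped `[batch, num_kv_heads, head_dim, kv_len]`; the helper name `grouped_qk` and all shapes are hypothetical. With a group size of 1 the grouped path reduces to the plain matmul, so a single code path would cover both branches.

```python
import numpy as np

def grouped_qk(q, k_t, num_kv_groups):
    # q:   [batch, num_heads, q_len, head_dim]
    # k_t: [batch, num_kv_heads, head_dim, kv_len] (already transposed)
    batch, num_heads, q_len, head_dim = q.shape
    num_kv_heads = num_heads // num_kv_groups
    kv_len = k_t.shape[-1]
    # Reshapes only reinterpret the layout; no arithmetic happens here.
    q5 = q.reshape(batch, num_kv_heads, num_kv_groups, q_len, head_dim)
    k5 = k_t.reshape(batch, num_kv_heads, 1, head_dim, kv_len)
    out = np.matmul(q5, k5)  # K is broadcast across the group axis
    return out.reshape(batch, num_heads, q_len, kv_len)

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4, 3, 8))

# Group size 1: identical to an ordinary batched matmul.
k_t = rng.standard_normal((2, 4, 8, 5))
same_as_plain = np.allclose(grouped_qk(q, k_t, 1), np.matmul(q, k_t))

# Group size 2: four query heads share two key heads.
k_t_gqa = rng.standard_normal((2, 2, 8, 5))
reference = np.matmul(q, np.repeat(k_t_gqa, 2, axis=1))
matches_repeat = np.allclose(grouped_qk(q, k_t_gqa, 2), reference)
```

The comparison against `np.repeat` is only there to check the result; the grouped path itself never materializes repeated keys.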
else:
    attn_weights = self.matmul(query_states, key_states)

attn_weights = self.div(
Why isn't this scaling factor passed as the α of the gemm?
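For context, a GEMM computes C = α·A·B + β·C, so a scalar division applied right after the matmul can be folded into α. The sketch below assumes the divisor is the usual sqrt(head_dim) of scaled dot-product attention (the `self.div(` call is cut off in the diff, so that is a guess), and `gemm` is a hypothetical NumPy stand-in rather than this project's operator.

```python
import math
import numpy as np

def gemm(a, b, alpha=1.0):
    # Hypothetical helper mimicking the GEMM contract C = alpha * (A @ B).
    return alpha * np.matmul(a, b)

head_dim = 8  # assumed value, for illustration only
rng = np.random.default_rng(0)
q = rng.standard_normal((3, head_dim))
k_t = rng.standard_normal((head_dim, 5))

# Two separate ops: matmul followed by a division.
two_step = np.matmul(q, k_t) / math.sqrt(head_dim)

# One op: the scale rides along as alpha.
fused = gemm(q, k_t, alpha=1.0 / math.sqrt(head_dim))

scale_matches = np.allclose(two_step, fused)
```

In a real GEMM kernel the α multiply is applied while the result is written out, which would save the extra elementwise pass that a separate division needs.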
attn_weights = self.add(attn_weights, attention_mask)

if self.dtype != DTYPE.F32:
    attn_weights = self.cast(attn_weights, DTYPE.F32)
You can cast unconditionally here; a cast that does not change the type is skipped and never computed.
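The same idea can be seen in NumPy, used here purely as an analogy: `astype(..., copy=False)` returns the input array itself when the dtype already matches. The `cast` helper below is a hypothetical stand-in, not this project's `self.cast`.

```python
import numpy as np

def cast(x, dtype):
    # Stand-in for a cast op that does no work when the dtype already matches.
    return x.astype(dtype, copy=False)

x32 = np.ones((2, 2), dtype=np.float32)
x16 = np.ones((2, 2), dtype=np.float16)

# Same dtype: the very same array object comes back, nothing is computed.
is_noop = cast(x32, np.float32) is x32

# Different dtype: a real conversion happens.
converted_dtype = cast(x16, np.float32).dtype
```

With such semantics the `if self.dtype != DTYPE.F32` guard becomes redundant, since the call site can cast every time.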
9acbb3c to 594e06b
18addd0 to 293e98b
af45b67 to 75a0f87