How to determine the parameters for the t5_gemm file? #96

Answered by byshiue
eycheung asked this question in Q&A
encoder_size_per_head: What is the size here referring to?

It is the encoder's size_per_head, which is called hidden_size in some cases.

encoder_inter_size: What is inter_size? I've seen references suggesting 4 * head_num * size_per_head, but I'm not sure whether this is something I need to infer from my pretrained model, or whether this rule of thumb generally works and can be tuned for further optimization.

It is the intermediate size of the feed-forward network.
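
Since the intermediate size is a dimension of the feed-forward weights, it is normally fixed by the pretrained checkpoint rather than chosen freely. Below is a minimal sketch of reading both values from a model config and comparing the stored intermediate size against the 4 * head_num * size_per_head rule of thumb. The key names `num_heads`, `d_kv`, and `d_ff` are an assumption (they follow Hugging Face T5 config naming), the helper `gemm_encoder_params` is hypothetical and not part of any library, and the numbers in the example are illustrative.

```python
def gemm_encoder_params(config: dict) -> dict:
    """Derive encoder sizes from a model config dictionary.

    Assumed keys (Hugging Face T5 naming, not confirmed by this thread):
      num_heads -> head_num
      d_kv      -> size_per_head
      d_ff      -> inter_size (feed-forward intermediate size)
    """
    head_num = config["num_heads"]
    size_per_head = config["d_kv"]

    # Rule of thumb quoted in the question; used only as a fallback.
    rule_of_thumb = 4 * head_num * size_per_head

    # Prefer the value stored with the pretrained model when it is present.
    inter_size = config.get("d_ff", rule_of_thumb)

    return {
        "head_num": head_num,
        "encoder_size_per_head": size_per_head,
        "encoder_inter_size": inter_size,
        "matches_rule_of_thumb": inter_size == rule_of_thumb,
    }


# Illustrative values only; replace with the config of your own checkpoint.
example_config = {"num_heads": 8, "d_kv": 64, "d_ff": 2048}
params = gemm_encoder_params(example_config)
print(params)
```

If `matches_rule_of_thumb` comes back false for your checkpoint, the stored value is the one that agrees with the weight shapes, so the sketch passes that through rather than the rule of thumb.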

Answer selected by byshiue