
[QST] question about layout #1183

Closed
zwshan opened this issue Nov 13, 2023 · 6 comments
Labels: question

Comments


zwshan commented Nov 13, 2023

What is your question?
I believe I may have a misunderstanding of the concepts of layout and leading dimension.

My understanding of the leading dimension comes from the responses provided by ChatGPT.

The following is ChatGPT's explanation of the leading dimension:
"Specifically, for a matrix A(m x n), its leading dimension is lda (for column-major order, i.e., column-wise storage, it is m; for row-major order, i.e., row-wise storage, it is n)."

So my question is: since one can apparently determine whether a matrix is row-major or column-major by checking whether the leading dimension equals the first or the second dimension of the matrix, why is it still necessary to explicitly specify whether a matrix is row-major or column-major?

thakkarV (Collaborator) commented:

The ChatGPT explanation is true only for compact matrices with canonical layouts. The statement no longer holds once the layout is non-compact or not a canonical row/column-major layout.
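
To make this concrete, here is a minimal sketch in plain C++ (not CUTLASS code; the 8 x 8 buffer and all names are illustrative). For a non-compact matrix, e.g. a 3 x 5 view into a larger padded buffer, the leading dimension equals neither dimension of the view, and the same lda is consistent with either storage order:

```cpp
#include <cstdio>

int main() {
  // A 3 x 5 view into an 8 x 8 buffer: lda = 8 equals neither m nor n.
  int const m = 3, n = 5, lda = 8;
  float buffer[8 * 8] = {};

  // Column-major indexing with the padded leading dimension ...
  auto col_major = [&](int i, int j) -> float & { return buffer[i + j * lda]; };
  // ... and row-major indexing with the very same lda = 8.
  auto row_major = [&](int i, int j) -> float & { return buffer[i * lda + j]; };

  col_major(2, 4) = 1.0f;  // offset 2 + 4 * 8 = 34
  row_major(2, 4) = 2.0f;  // offset 2 * 8 + 4 = 20

  printf("m=%d n=%d lda=%d: same lda, offsets %d vs %d\n",
         m, n, lda, 2 + 4 * lda, 2 * lda + 4);
  return 0;
}
```

Since lda matches neither m nor n, nothing about the number itself reveals the storage order; the layout has to be stated explicitly.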


zwshan commented Nov 13, 2023

> The ChatGPT explanation is true only for compact matrices with canonical layouts. The statement no longer holds once the layout is non-compact or not a canonical row/column-major layout.

"Thank you very much for your response. I apologize for my lack of experience; I'm not quite sure when non-compact matrices and non-canonical row major or column major would be used. Could you provide an example?"

hwu36 (Collaborator) commented Nov 13, 2023

Interleaved GEMM is an example. It is supported by both CUTLASS and cuBLAS (https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/layout/matrix.h#L255).
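
For reference, the following is a from-memory paraphrase of the kind of index mapping such interleaved layouts use (not the CUTLASS source; see the linked header for the authoritative definitions), assuming a packed matrix whose rows are grouped into panels of K:

```cpp
#include <cstdio>

constexpr int K = 4;  // interleave factor

// Offset of element (row, col) of an m x n matrix in which rows are grouped
// into panels of K: within a panel, elements are stored column by column.
long interleaved_offset(int row, int col, int n) {
  long stride = static_cast<long>(n) * K;  // "leading dimension" = n * K
  return (row / K) * stride + static_cast<long>(col) * K + (row % K);
}

int main() {
  int const m = 8, n = 3;
  // stride = n * K = 12 equals neither m = 8 nor n = 3, so the rule of
  // thumb quoted above cannot classify this layout from the stride alone.
  printf("offset of (5, 2) in an %d x %d matrix: %ld\n",
         m, n, interleaved_offset(5, 2, n));  // (5/4)*12 + 2*4 + 5%4 = 21
  return 0;
}
```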

In deep learning, we can also set the leading dimension to be different from the problem size so that we implicitly split one GEMM into several smaller GEMMs and call batched GEMM directly. For example, suppose the original big GEMM is M x N and the real problem size of each small GEMM is M x (N/3). We can set the leading dimension to N while keeping the problem size at N/3 and then use batched GEMM.
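
A CPU sketch of that trick (illustrative only, not a cuBLAS/CUTLASS call; all names are made up): a row-major M x N output is treated as three batched M x (N/3) GEMMs, where each small GEMM keeps the full row pitch N as its leading dimension while consecutive batches start only N/3 columns apart:

```cpp
#include <cstdio>
#include <vector>

// Row-major GEMM with explicit leading dimensions:
// C(m x n) += A(m x k) * B(k x n).
void gemm(int m, int n, int k,
          float const *A, int lda, float const *B, int ldb,
          float *C, int ldc) {
  for (int i = 0; i < m; ++i)
    for (int j = 0; j < n; ++j) {
      float acc = 0.f;
      for (int p = 0; p < k; ++p) acc += A[i * lda + p] * B[p * ldb + j];
      C[i * ldc + j] += acc;
    }
}

int main() {
  int const M = 4, N = 6, K = 2, batch = 3, n_small = N / batch;
  std::vector<float> A(M * K, 1.f), B(K * N, 1.f), C(M * N, 0.f);

  // Batch b multiplies A by columns [b*n_small, (b+1)*n_small) of B and
  // writes the matching column block of C. Note ldb = ldc = N != n_small:
  // the leading dimension is the full matrix width, not the problem size.
  for (int b = 0; b < batch; ++b)
    gemm(M, n_small, K,
         A.data(), K,                // same A for every batch
         B.data() + b * n_small, N,  // batch stride n_small, ldb = N
         C.data() + b * n_small, N); // batch stride n_small, ldc = N

  // Every entry of C should be K = 2; print the first row.
  for (int j = 0; j < N; ++j) printf("%g ", C[j]);
  printf("\n");
  return 0;
}
```

A strided batched GEMM API (e.g. cublasSgemmStridedBatched) expresses exactly this: the per-matrix leading dimension and the between-batch stride are independent parameters, so the data never has to be copied or repacked.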


zwshan commented Nov 14, 2023

> Interleaved GEMM is an example. It is supported by both CUTLASS and cuBLAS (https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/layout/matrix.h#L255).
>
> In deep learning, we can also set the leading dimension to be different from the problem size so that we implicitly split one GEMM into several smaller GEMMs and call batched GEMM directly. For example, suppose the original big GEMM is M x N and the real problem size of each small GEMM is M x (N/3). We can set the leading dimension to N while keeping the problem size at N/3 and then use batched GEMM.

I'm very sorry, please forgive my ignorance. I don't understand why splitting an M x N GEMM into three M x (N/3) GEMMs as a batch would improve performance.

hwu36 (Collaborator) commented Nov 14, 2023

> I don't understand why splitting an M x N GEMM into three M x (N/3) GEMMs as a batch would improve performance.

Otherwise, you would need to explicitly split the matrix into three smaller ones before calling GEMM; with the leading-dimension trick, one batched GEMM works on the original storage directly.

> How should I use the code from the URL I provided earlier for the multiplication of int8 matrices?

That URL uses an interleaved layout, not a canonical row-major or column-major layout.

zwshan closed this as completed Nov 15, 2023

zwshan commented Nov 15, 2023

I have solved my problem, thanks!
