How to build llamafile.exe with Intel OneAPI (MKL) compilers? #430
-
I am trying to run inference on my Windows machine using llamafile.exe. How can I build llamafile.exe using Intel's oneAPI compilers? Intel MKL gave us higher speed than OpenBLAS when we compiled llama.cpp standalone.
Replies: 1 comment
-
llamafile should already give you very good performance, for the reasons explained in https://justine.lol/matmul/. Basically, you get BLAS-like performance out of the box. If you still want to use MKL, then it's recommended that you use https://github.com/ggerganov/llama.cpp/ and follow the instructions at https://github.com/ggerganov/llama.cpp/?tab=readme-ov-file#intel-onemkl
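
If you do go the llama.cpp route on Windows, the build looks roughly like the sketch below. This assumes the Intel oneAPI Base Toolkit is installed and you are running from an Intel oneAPI command prompt (so `icx`/`icpx` and MKL are on the path). The exact CMake option names have changed across llama.cpp versions (e.g. `LLAMA_BLAS` was later renamed `GGML_BLAS`), so defer to the linked README for the current flags.

```sh
REM Sketch only: flag names follow the llama.cpp README linked above and may
REM differ in newer releases. Run from an "Intel oneAPI command prompt" so the
REM icx/icpx compilers and MKL are discoverable by CMake.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

REM Intel10_64lp is CMake's FindBLAS vendor string for MKL (LP64 interface).
REM Depending on your setup you may also need a generator flag such as -G Ninja.
cmake -B build ^
  -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx ^
  -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp

cmake --build build --config Release
```

Note that this produces ordinary llama.cpp binaries rather than a single-file llamafile: linking against MKL ties the build to Intel's runtime, which is at odds with llamafile's portable, run-anywhere design.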