How to build llamafile.exe with Intel OneAPI (MKL) compilers? #430
-
I am trying to run inference on my Windows machine using llamafile.exe. How can I build llamafile.exe using Intel's oneAPI compilers? Intel MKL gave us higher speed than OpenBLAS when we compiled llama.cpp standalone.
Replies: 1 comment
-
llamafile should already give you very good performance, for the reasons explained in https://justine.lol/matmul/. Basically, you get BLAS-like performance out of the box. If you still want to use MKL, then it's recommended that you use https://github.com/ggerganov/llama.cpp/ and follow the instructions at https://github.com/ggerganov/llama.cpp/?tab=readme-ov-file#intel-onemkl
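
If you do go the llama.cpp route on Windows, the build looks roughly like the sketch below. This assumes the Intel oneAPI Base Toolkit is installed and you are running from an Intel oneAPI command prompt (so `icx`/`icpx` and MKL are on the path). The exact CMake option names have changed across llama.cpp versions (e.g. `LLAMA_BLAS` was later renamed `GGML_BLAS`), so defer to the linked README for the current flags.

```sh
REM Sketch only: flag names follow the llama.cpp README linked above and may
REM differ in newer releases. Run from an "Intel oneAPI command prompt" so the
REM icx/icpx compilers and MKL are discoverable by CMake.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

REM Intel10_64lp is CMake's FindBLAS vendor string for MKL (LP64 interface).
REM Depending on your setup you may also need a generator flag such as -G Ninja.
cmake -B build ^
  -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx ^
  -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=Intel10_64lp

cmake --build build --config Release
```

Note that this produces ordinary llama.cpp binaries rather than a single-file llamafile: linking against MKL ties the build to Intel's runtime, which is at odds with llamafile's portable, run-anywhere design.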