This sample application performs general matrix multiplication (GEMM) using OpenMP* on a CPU or GPU, so it can serve as a target for OpenMP* profiling and tracing tools. A run on the GPU produces output like the following:
OpenMP Matrix Multiplication (matrix size: 2048 x 2048, repeats 4 times)
Target device: GPU
Matrix multiplication time: 1.55599 sec
Results are CORRECT with accuracy: 5.86231e-06
Matrix multiplication time: 1.27369 sec
Results are CORRECT with accuracy: 5.86231e-06
Matrix multiplication time: 1.27024 sec
Results are CORRECT with accuracy: 5.86231e-06
Matrix multiplication time: 1.27694 sec
Results are CORRECT with accuracy: 5.86231e-06
Total execution time: 5.37699 sec
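The computation behind the timing and accuracy lines above can be sketched roughly as follows. This is a minimal illustration, not the sample's actual source: the names `GemmOffload` and `MeanRelativeError` are hypothetical, and the accuracy metric (mean relative error against a known expected value for constant-filled inputs) is an assumption about how such a check might work.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: C = A * B for square, row-major n x n matrices.
// With an offload-capable OpenMP compiler the loop nest runs on the default
// target device; without OpenMP support the pragma is ignored and the loops
// run serially on the host.
void GemmOffload(const std::vector<float>& a, const std::vector<float>& b,
                 std::vector<float>& c, int n) {
  const std::size_t count = static_cast<std::size_t>(n) * n;
  assert(a.size() == count && b.size() == count && c.size() == count);

  // Raw pointers are needed for the OpenMP array-section map clauses.
  const float* pa = a.data();
  const float* pb = b.data();
  float* pc = c.data();

  #pragma omp target teams distribute parallel for collapse(2) \
      map(to: pa[0:count], pb[0:count]) map(from: pc[0:count])
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n; ++j) {
      float sum = 0.0f;
      for (int k = 0; k < n; ++k) {
        sum += pa[i * n + k] * pb[k * n + j];
      }
      pc[i * n + j] = sum;
    }
  }
}

// Hypothetical accuracy metric (an assumption): mean relative error of every
// element of C against a single expected value. This only makes sense when
// A and B are filled with constants, so that each element of C equals
// a_value * b_value * n.
float MeanRelativeError(const std::vector<float>& c, float expected) {
  assert(!c.empty() && expected != 0.0f);
  double eps = 0.0;
  for (float v : c) {
    eps += std::fabs((v - expected) / expected);
  }
  return static_cast<float>(eps / c.size());
}
```

With inputs filled with two constants, a caller would run `GemmOffload` once per repeat, time each call, and compare `MeanRelativeError` against a small tolerance to decide whether to report the results as correct.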
Supported operating systems:
- Linux
- Windows (under development)
Prerequisites:
- CMake (version 3.12 and above)
- Git (version 1.8 and above)
- Python (version 2.7 and above)
- Intel(R) oneAPI Base Toolkit
On Linux, run the following commands to build the sample (make sure the Intel(R) C++ Compiler is in PATH):
source <inteloneapi>/setvars.sh
cd <pti>/samples/omp_gemm
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
Use this command line to run the application:
./omp_gemm [cpu|gpu] [matrix_size] [repeat_count]
On Windows, use the Microsoft* Visual Studio x64 command prompt to run the following commands and build the sample (make sure the Intel(R) C++ Compiler is in PATH):
<inteloneapi>\setvars.bat
cd <pti>\samples\omp_gemm
mkdir build
cd build
icx.exe ../main.cc /Qopenmp /Qopenmp-targets=spir64 -I../../../utils -o omp_gemm.exe
Use this command line to run the application:
omp_gemm.exe [cpu|gpu] [matrix_size] [repeat_count]
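All three command-line arguments are optional and positional. One way such arguments could be handled is sketched below. The `Options` struct, the `ParseArgs` function, and the default values (GPU, 1024, 4) are assumptions made for illustration and are not taken from the sample's source.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical container for the three positional arguments.
// The defaults are assumptions, not the sample's documented values.
struct Options {
  bool use_gpu = true;
  unsigned size = 1024;
  unsigned repeat_count = 4;
};

// Parses: <program> [cpu|gpu] [matrix_size] [repeat_count]
// Missing or non-positive numeric arguments leave the default in place.
Options ParseArgs(int argc, const char* const* argv) {
  assert(argc >= 1 && argv != nullptr);
  Options opts;
  if (argc > 1) {
    // In this sketch, anything other than "cpu" selects the GPU.
    opts.use_gpu = std::strcmp(argv[1], "cpu") != 0;
  }
  if (argc > 2) {
    unsigned long v = std::strtoul(argv[2], nullptr, 10);
    if (v > 0) opts.size = static_cast<unsigned>(v);
  }
  if (argc > 3) {
    unsigned long v = std::strtoul(argv[3], nullptr, 10);
    if (v > 0) opts.repeat_count = static_cast<unsigned>(v);
  }
  return opts;
}
```

Under these assumptions, an invocation such as `omp_gemm gpu 2048 4` would select the GPU with a 2048 x 2048 matrix and four repeats, which matches the configuration shown in the sample output.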