Home
HPL-GPU - 2.0 - 2015
HPL-GPU is a largely rewritten version of the traditional High Performance Linpack as published on netlib.org, tuned for heterogeneous systems with GPUs. It has been modified to make use of modern multi-core CPUs, enhanced lookahead, and a high-performance DGEMM for AMD GPUs. It can use AMD CAL, OpenCL, and CUDA as GPU backends. This version of Linpack differs from HPL in multiple aspects, including build requirements and configuration, run configuration, and license. All of these are covered in this wiki. Many technical details regarding the modifications can be found in the references section.
See https://github.com/davidrohr/hpl-gpu/wiki for detailed and up-to-date information.
The How-to provides a detailed tutorial on how to configure, build, tune, and run HPL-GPU and CALDGEMM.
Join us in ##caldgemm on IRC or use the CALDGEMM mailing list if you have questions.
This software requires the CALDGEMM library, available from http://code.compeng.uni-frankfurt.de/projects/caldgemm. HPL-GPU assumes a link to caldgemm inside its top directory, and this link must be called caldgemm. CALDGEMM provides backends for CAL, OpenCL, CUDA, and CPU. The default is OpenCL, which is assumed in the following. To compile CALDGEMM, in principle you only have to select the desired backends in config.options.mak, which ships with CALDGEMM, and compile it.
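The link described above could be created along these lines. This is a minimal sketch: the helper name link_caldgemm and both directory arguments are hypothetical, and only the link name caldgemm comes from the text.

```shell
# Create the "caldgemm" link that HPL-GPU expects inside its top directory.
# link_caldgemm is a hypothetical helper, not part of HPL-GPU or CALDGEMM.
#   $1: path to the CALDGEMM source tree (placeholder)
#   $2: path to the HPL-GPU top directory (placeholder)
link_caldgemm() {
    if [ ! -d "$1" ] || [ ! -d "$2" ]; then
        echo "usage: link_caldgemm CALDGEMM_DIR HPL_GPU_DIR" >&2
        return 1
    fi
    # Resolve to an absolute path so the link works from any directory.
    caldgemm_dir=$(cd "$1" && pwd)
    # -s: symbolic link, -f: replace an existing link, -n: do not follow it
    ln -sfn "$caldgemm_dir" "$2/caldgemm"
}
```

For example, `link_caldgemm ../caldgemm .` run from the HPL-GPU top directory would create the link, assuming CALDGEMM was unpacked next to it.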
Both HPL-GPU and CALDGEMM require a BLAS library. Supported BLAS libraries include Intel MKL, GotoBLAS2, and AMD ACML. The default library is MKL, and MKL is assumed in the following.
HPL-GPU requires Intel TBB. It will usually download and compile TBB automatically during HPL-GPU compilation.
For running HPL-GPU on multiple nodes, you need an MPI library. We assume OpenMPI by default.
List of Requirements:
- An MPI library (for multiple nodes); tested are:
  - OpenMPI
  - MVPICH
  - MVAPICH2
- A BLAS library
  - Intel MKL
  - AMD ACML
  - GotoBLAS2
  - Only the above libraries are tested. Some patches must be applied to GotoBLAS2 and ACML before they can be used, in order to reserve CPU cores for the GPU. Other BLAS libraries might work as well, but similar patches are likely required.
- The CALDGEMM library
  - CALDGEMM requires a BLAS library itself.
  - CALDGEMM provides several backends for the DGEMM operation:
    - CPU Backend: No additional requirements
    - OpenCL Backend: Requires an OpenCL 1.2 capable SDK like the AMD APP SDK
    - CUDA Backend: Requires the NVIDIA CUDA SDK
    - CAL Backend: Requires the deprecated AMD Stream SDK (the newer SDKs no longer contain the CAL headers).
- C++ compiler
- pthreads support
- Intel(R) Threading Building Blocks
  - Optional (The build process will try to download and install TBB if it is not available in the source tree. As long as the computer that builds HPL has internet access, you do not have to worry about installing TBB.)
  - Required for improved swaps, for which you ironically currently need an AMD CPU.
You need to set the following environment variables (please refer to Environment Variables for details):
- export AMDAPPSDKROOT=[Root of AMD APP SDK]
- export OPENMPIPATH=[Install Path of OpenMPI]
- export MKL_PATH=[Install Path of Intel MKL]
- export ICC_PATH=[Install Path of Intel Compiler (for libs)]
- export DISPLAY=:0 (for headless systems with X-Server)
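Before building, it can help to confirm that these variables are actually set in the current shell. The following is a small sketch; check_hpl_env is a hypothetical helper name, and the check only tests that each variable is non-empty, not that the path it names is valid.

```shell
# Report which of the variables listed above are unset or empty.
# check_hpl_env is a hypothetical helper, not part of HPL-GPU.
# DISPLAY is included because the list above names it; it matters for
# headless systems that run an X server.
check_hpl_env() {
    missing=""
    for name in AMDAPPSDKROOT OPENMPIPATH MKL_PATH ICC_PATH DISPLAY; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "unset:$missing"
        return 1
    fi
    echo "all set"
}
```

Calling `check_hpl_env` prints either "all set" or the names that still need an export line.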
In order to allow allocation of large amounts of pinned memory, you need to set the following limits in Linux:
ulimit -v unlimited
ulimit -m unlimited
ulimit -l unlimited
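To see whether the three limits are already in effect for the shell that will launch HPL-GPU, a check like the following may be useful. It is only a sketch, and report_limits is a hypothetical helper name. Note that raising a soft limit beyond the hard limit requires sufficient privileges.

```shell
# Print the current value of each limit and flag those that are not
# "unlimited". report_limits is a hypothetical helper, not part of HPL-GPU.
report_limits() {
    status=0
    for flag in v m l; do
        current=$(ulimit -"$flag")
        if [ "$current" = "unlimited" ]; then
            echo "ulimit -$flag: unlimited"
        else
            echo "ulimit -$flag: $current (run 'ulimit -$flag unlimited')"
            status=1
        fi
    done
    # Non-zero exit status if at least one limit is still finite.
    return $status
}
```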
Older versions of HPL-GPU required HyperThreading to be deactivated. With the current version, if you use either the CUDA or OpenCL runtime, Intel MKL from 2015, and a Haswell CPU, activating HyperThreading is suggested.
You can either download the source from the files section (/projects/hpl/files) or pull it from the git repository.
- You need to install all the prerequisites and set certain environment variables correctly.
- You will have to build the CALDGEMM library first, after setting up its build configuration file config_options.mak.
- Like the original HPL, HPL-GPU requires a build configuration file called Make.ARCHNAME. The suggested method is to adapt the Make.Generic file, which is preconfigured properly for most cases.
To compile HPL-GPU once the above prerequisites are met, copy Make.Generic and Make.Generic.Options from the setup directory into the top directory. In principle, all relevant options can be controlled in Make.Generic.Options. If you need to change paths for includes or libraries, check Make.Generic.
Run ./build.sh to start compilation.
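The copy step can be sketched as a small helper. The name prepare_hpl_build is hypothetical; the file names, the setup directory, and build.sh are taken from the text above. The sketch deliberately leaves an already edited copy untouched, which is a design choice of this example rather than documented behavior.

```shell
# Copy the two template files from setup/ into the HPL-GPU top directory.
# prepare_hpl_build is a hypothetical helper, not part of HPL-GPU.
#   $1: HPL-GPU top directory
prepare_hpl_build() {
    top=$1
    for f in Make.Generic Make.Generic.Options; do
        if [ ! -f "$top/setup/$f" ]; then
            echo "missing $top/setup/$f" >&2
            return 1
        fi
        # Keep an existing (possibly edited) copy instead of overwriting it.
        [ -f "$top/$f" ] || cp "$top/setup/$f" "$top/$f"
    done
}

# Usage, from inside the HPL-GPU top directory:
#   prepare_hpl_build . && ./build.sh
```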
In order to tune HPL-GPU, have a look at the comments in Make.Generic.Options. A detailed description and some tuning comments for CALDGEMM are available in the CALDGEMM README file, and HPL compile time options are listed in setup/readme. A detailed HPL-GPU tuning guide is available in the TUNING file that ships with this software.
See the How-to for details.
As with the original HPL, the build will put a binary and a sample configuration into bin/ARCHNAME. The configuration file must be called HPL.dat. Note, however, that the options in the configuration file have changed. Therefore you cannot copy a configuration file from the original HPL; you have to create a new one. The sample file will give you a working configuration for a single node. Note that the HPL-GPU process will by default use all cores of a node, so you should run only one process per node. Details can be found in the How-to.
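With OpenMPI, running one process per node might look like the launch line printed by the sketch below. Several things here are assumptions and not taken from this page: the binary name xhpl (as in the original HPL), the helper name hpl_launch_cmd, and the availability of the --map-by ppr:1:node option, which exists in recent OpenMPI releases. The function only prints the command so it can be inspected first.

```shell
# Print (not run) an OpenMPI launch line that places one process on each node.
# hpl_launch_cmd is a hypothetical helper; the binary name xhpl is an
# assumption carried over from the original HPL.
#   $1: number of nodes, equal to the number of MPI processes
#   $2: ARCHNAME used for the build, e.g. Generic
hpl_launch_cmd() {
    nodes=$1
    arch=$2
    # --map-by ppr:1:node asks OpenMPI for one process per node.
    echo "cd bin/$arch && mpirun -np $nodes --map-by ppr:1:node ./xhpl"
}

hpl_launch_cmd 4 Generic
```

A hostfile or scheduler integration would normally tell mpirun which nodes to use; that part depends on the cluster and is omitted here.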
The newest version of HPL-GPU is available at https://github.com/davidrohr/hpl-gpu. The newest version of CALDGEMM is available at https://github.com/davidrohr/caldgemm.
See the wiki at https://github.com/davidrohr/hpl-gpu/wiki
Issues are tracked at https://github.com/davidrohr/hpl-gpu/issues
HPL-GPU is made up of parts that are licensed under the GNU General Public License Version 3 and parts that are licensed under the 4-clause BSD license (see https://github.com/davidrohr/hpl-gpu/blob/master/COPYING). The license of each source file is noted in the header of the file. The parts licensed under the GNU General Public License Version 3 grant the following special exception:
"Use with the Original BSD License."
Notwithstanding any other provision of the GNU General Public License Version 3, you have permission to link or combine any covered work with a work licensed under the 4-clause BSD license into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the 4-clause BSD license, clause 3, concerning the requirement of acknowledgement in advertising materials will apply to the combination as such.
HPL-GPU is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.