Running the Benchmark on One Node

To run the Intel Optimized MP LINPACK Benchmark binary on a cluster node, follow the steps below.

Note

While these instructions assume the Intel® 64 architecture, they are more widely applicable. The instructions directly apply to Previous Generation Intel® Core™ or higher Intel® processors. For IA-32 architecture processors and for earlier Intel® 64 architecture processors, omit the version parameter of the make command. For IA-32 architecture processors, also adjust directory names and the value of the arch parameter.

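Load the necessary environment variables for Intel MKL, Intel MPI, and the Intel® compiler, and then build the binary. Source the scripts rather than executing them, so that the variables are set in the current shell:

source <parent directory>/bin/compilervars.sh intel64

source <mpi directory>/bin64/mpivars.sh

source <mkl directory>/bin/mklvars.sh intel64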

make arch=intel64 version=offload

Change directory to bin/intel64:

cd <mkl directory>/benchmarks/mp_linpack/bin/intel64

This directory contains the following files:
- xhpl - the Intel® 64 architecture binary.
- HPL.dat - the HPL input data set.

Execute the binary for a small test run:

./xhpl

Modify the HPL.dat file to match the memory on the host processor by increasing the value on line 6, which precedes the Ns label:
- For 16 GB: 40000 Ns
- For 32 GB: 56000 Ns
- For 64 GB: 83000 Ns
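The edit to line 6 can also be scripted. The following is a minimal sketch, assuming HPL.dat uses the standard HPL layout in which line 6 holds the problem size followed by the Ns label and line 5 declares a single problem size. The helper name set_hpl_ns is hypothetical and is not part of the benchmark package.

```shell
# Replace the problem-size value on line 6 of an HPL input file.
# set_hpl_ns is a hypothetical helper, not part of the benchmark package.
# Assumption: line 5 of the file declares exactly one problem size.
set_hpl_ns() {
    file=$1
    n=$2
    # Rewrite line 6 only: everything up to the "Ns" label becomes the new value.
    sed "6s/^.*Ns/$n         Ns/" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example (run in the directory that holds HPL.dat):
# cp HPL.dat HPL.dat.orig
# set_hpl_ns HPL.dat 56000
```

Keeping a copy of the original HPL.dat makes it easy to return to the small test run.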
In general, you can compute the memory required to store the matrix (not counting the numerous buffers) as 8 * N * N / (P * Q) bytes per MPI process, where N is the problem size and P and Q are the dimensions of the process grid in HPL.dat. HPL documentation generally recommends choosing a problem size that fills 80% of memory, but you can sometimes use more.
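The formula can be inverted to estimate a problem size from the memory on the node. The sketch below assumes that all P * Q processes share the memory of one node, so the whole matrix occupies 8 * N * N bytes, and that memory is given in GiB. The function name hpl_problem_size, the optional fraction argument, and the optional rounding to a multiple of the block size are illustrative choices, not part of the benchmark.

```shell
# Estimate an HPL problem size N for a single node.
#   $1  node memory in GiB
#   $2  fraction of memory given to the matrix (default 0.8, per the HPL guidance)
#   $3  block size to round down to (default 1, i.e. no rounding; hypothetical option)
hpl_problem_size() {
    mem_gib=$1
    fraction=${2:-0.8}
    nb=${3:-1}
    awk -v m="$mem_gib" -v f="$fraction" -v nb="$nb" 'BEGIN {
        bytes = m * 1024 * 1024 * 1024   # GiB to bytes
        n = int(sqrt(f * bytes / 8))     # 8 bytes per double-precision element
        n = n - (n % nb)                 # round down to a multiple of nb
        printf "%d\n", n
    }'
}

hpl_problem_size 16            # estimate for a 16 GiB node
hpl_problem_size 32 0.8 192    # estimate for 32 GiB, rounded to a multiple of 192
```

Treat the result as a starting point: the estimate ignores the buffers mentioned above and any memory used by the operating system.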
Execute the binary again and take note of the new result:

./xhpl
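To compare runs, it can help to pull the problem size and performance figure out of the output. This is a minimal sketch, assuming the binary prints result lines in the standard HPL layout (a variant code beginning with WR or WC, then N, NB, P, Q, Time, and Gflops). The helper name hpl_results and the log file name are hypothetical.

```shell
# Print "N Gflops" for each result line of HPL output read from standard input.
# hpl_results is a hypothetical helper name.
hpl_results() {
    # Assumed format: result lines start with a variant code such as WR00L2L2;
    # field 2 is N and the last field is the Gflops figure.
    awk '/^W[RC]/ { print $2, $NF }'
}

# Example:
# ./xhpl | tee xhpl.log | hpl_results
```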

For specifics of running hybrid offload binaries, see Running Hybrid Offload Binaries.
