Intel® Math Kernel Library 11.3 Update 4 Developer Guide

Running the Benchmark on One Node

To run the Intel Optimized MP LINPACK Benchmark binary on a cluster node, follow the steps below.

Note

While these instructions assume the Intel® 64 architecture, they are more widely applicable. The instructions directly apply to Previous Generation Intel® Core™ or higher Intel® processors. For IA-32 architecture processors and for earlier Intel® 64 architecture processors, omit the version parameter of the make command. For IA-32 architecture processors, also adjust directory names and the value of the arch parameter.

  1. Load the necessary environment variables for Intel MKL, Intel MPI, and the Intel® compiler by sourcing the scripts below in the current shell, and then build the binary:

    source <parent directory>/bin/compilervars.sh intel64

    source <mpi directory>/bin64/mpivars.sh

    source <mkl directory>/bin/mklvars.sh intel64

    make arch=intel64 version=offload

  2. Change directory to bin/intel64:

    cd <mkl directory>/benchmarks/mp_linpack/bin/intel64

    This directory contains the following files:

    • xhpl - the Intel® 64 architecture binary.

    • HPL.dat - the HPL input data set.

  3. Execute the binary for a small test run:

    ./xhpl

  4. Modify the HPL.dat file to match the memory available on the host by increasing the problem size, which is the value that precedes Ns in line 6:

    • For 16 GB: 40000 Ns

    • For 32 GB: 56000 Ns

    • For 64 GB: 83000 Ns

    In general, you can compute the memory that each process requires to store its part of the matrix (not counting the numerous buffers) as 8 * N * N / (P * Q) bytes, where N is the problem size and P and Q are the dimensions of the process grid in HPL.dat. HPL documentation generally recommends choosing a problem size that fills about 80% of memory, but you can sometimes use more.

  5. Execute the binary again and note the new result:

    ./xhpl
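
The sizing rule in step 4 can be sketched as a pair of shell helpers. This is only an illustration under stated assumptions, not part of the benchmark package. The function names hpl_problem_size and hpl_with_n are hypothetical. So are the default block size of 192 and the rounding of N down to a multiple of that block size. The sketch treats 1 GB as 2^30 bytes. It also assumes the standard HPL.dat layout, in which line 5 holds the number of problem sizes listed in line 6.

```shell
#!/bin/sh
# Sketch only: helper names and the NB default are illustrative assumptions.

# hpl_problem_size MEM_GB [FRACTION] [NB]
# Print the largest N, rounded down to a multiple of NB, such that the
# 8 * N * N bytes of matrix data fit in FRACTION of MEM_GB (1 GB = 2^30 bytes).
hpl_problem_size() {
    awk -v mem="$1" -v frac="${2:-0.80}" -v nb="${3:-192}" 'BEGIN {
        bytes = mem * 1024 * 1024 * 1024 * frac   # memory budget for the matrix
        n = int(sqrt(bytes / 8))                  # 8 bytes per double-precision element
        n = int(n / nb) * nb                      # round N down to a multiple of NB
        print n
    }'
}

# hpl_with_n FILE N
# Print FILE with line 6 replaced by the single problem size N.
# Assumption: line 5 holds the count of problem sizes, so it is set to 1.
hpl_with_n() {
    awk -v n="$2" '
        NR == 5 { print "1 # of problems sizes (N)"; next }
        NR == 6 { print n " Ns"; next }
        { print }
    ' "$1"
}
```

For example, after defining the helpers you might set N=$(hpl_problem_size 32) and then run hpl_with_n HPL.dat "$N" > HPL.dat.new, review the new file, and move it over HPL.dat. Writing to a separate file keeps the original input data set intact until you have checked the result.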

For specifics of running hybrid offload binaries, see Running Hybrid Offload Binaries.

See Also