Intel® Math Kernel Library 11.3 Update 4 Developer Guide
The table below lists the environment variables for Automatic Offload and the functions that cause similar results. See the Intel MKL Developer Reference for detailed descriptions of the functions. To control the division of work between the host CPU and Intel Xeon Phi coprocessors, the environment variables use a fractional measure ranging from zero to one.
Environment Variable | Support Function | Description | Value |
---|---|---|---|
MKL_MIC_ENABLE | mkl_mic_enable | Enables Automatic Offload (AO). | 1 |
OFFLOAD_DEVICES | None | OFFLOAD_DEVICES is a common setting for Intel MKL and Intel® Compilers. It specifies a list of coprocessors to be used for any offload, including Intel MKL AO. In particular, this setting may help you to configure the environment for an MPI application to run Intel MKL in the AO mode. If this variable is not set, all the coprocessors available on the system are used for AO. You can set this environment variable if AO is enabled by the environment setting or function call. Setting this variable to an empty value is equivalent to completely disabling AO regardless of the value of MKL_MIC_ENABLE. After setting this environment variable, Intel MKL support functions and environment variables refer to the specified coprocessors by their indexes in the list, starting with zero. For more information, refer to the Intel® Compiler User and Reference Guides. | A comma-separated list of integers, each ranging from 0 to the largest number of an Intel Xeon Phi coprocessor on the system, with the maximum of 31. Values out of this range are ignored. Moreover, if the list contains any non-integer data, the list is ignored completely as if the environment variable were not set at all. For example, if your system has 4 Intel Xeon Phi coprocessors and the value of the list is 1,3, Intel MKL uses only coprocessors 1 and 3 for AO, and Intel MKL support functions and environment variables refer to these coprocessors as coprocessors 0 and 1. |
OFFLOAD_ENABLE_ORSL | None | Enables the mode in which Intel MKL and Intel Compilers synchronize their accesses to coprocessors. Set this variable if your application uses both Compiler Assisted Offload and AO but does not implement its own synchronization. | 1 |
MKL_HOST_WORKDIVISION | mkl_mic_set_workdivision | Specifies the fraction of work for the host CPU to do. | A floating-point number ranging from 0.0 to 1.0. For example, the value could be 0.2 or 0.33. Intel MKL ignores negative values and treats values greater than 1 as 1.0. |
MKL_MIC_WORKDIVISION | mkl_mic_set_workdivision | Specifies the fraction of work to do on all the Intel Xeon Phi coprocessors on the system. | Same as for MKL_HOST_WORKDIVISION. |
MKL_MIC_<number>_WORKDIVISION | mkl_mic_set_workdivision | Specifies the fraction of work to do on a specific Intel Xeon Phi coprocessor. Here <number> is an integer ranging from 0 to the largest number of an Intel Xeon Phi coprocessor on the system, with the maximum of 31. For example, if the system has two Intel Xeon Phi coprocessors, <number> can be 0 or 1. | Same as for MKL_HOST_WORKDIVISION. |
MKL_MIC_MAX_MEMORY | mkl_mic_set_max_memory | Specifies the maximum coprocessor memory reserved for AO computations on all of the Intel Xeon Phi coprocessors on the system. Each process that performs AO computations uses additional coprocessor memory specified by the environment variable. | Memory size in kilobytes (K), megabytes (M), gigabytes (G), or terabytes (T). For example, MKL_MIC_MAX_MEMORY = 4096M limits the coprocessor memory reserved for AO computations to 4096 megabytes or 4 gigabytes. Setting MKL_MIC_MAX_MEMORY = 4G specifies the same amount of memory in gigabytes. |
MKL_MIC_<number>_MAX_MEMORY | mkl_mic_set_max_memory | Specifies the maximum coprocessor memory reserved for AO computations on a specific Intel Xeon Phi coprocessor on the system. Here <number> is an integer ranging from 0 to the largest number of an Intel Xeon Phi coprocessor on the system, with the maximum of 31. For example, if the system has two Intel Xeon Phi coprocessors, <number> can be 0 or 1. | Memory size in kilobytes (K), megabytes (M), gigabytes (G), or terabytes (T). For example, MKL_MIC_MAX_MEMORY = 4096M limits the coprocessor memory reserved for AO computations to 4096 megabytes or 4 gigabytes. Setting MKL_MIC_MAX_MEMORY = 4G specifies the same amount of memory in gigabytes. |
MKL_MIC_REGISTER_MEMORY | mkl_mic_register_memory | Enables or disables registration of allocated memory by the mkl_malloc function running in AO mode. If AO is disabled, this setting has no effect. Setting this environment variable to 1 may improve performance if the same memory region allocated by mkl_malloc is passed multiple times to Intel MKL functions enabled for AO (for a list of AO-enabled functions, see Intel MKL Release Notes). | Desired behavior of mkl_malloc: 0 - do not register allocated memory; 1 - register allocated memory. |
MKL_MIC_RESOURCE_LIMIT | mkl_mic_set_resource_limit | Specifies how much of the computational resources of Intel Xeon Phi coprocessors can be used by the calling process. Use this environment variable if you need to share Intel Xeon Phi coprocessor cores automatically across multiple processes that call Intel MKL in the AO mode. For example, this might be useful in MPI applications. Actual reservation is made during a call to an Intel MKL AO function. | A floating-point number ranging from 0.0 to 1.0. Special values: Examples: |
MIC_OMP_NUM_THREADS | mkl_mic_set_device_num_threads | Specifies the maximum number of OpenMP* threads to use for AO computations on all the Intel Xeon Phi coprocessors on the system. | An integer greater than 0. |
MIC_<number>_OMP_NUM_THREADS | mkl_mic_set_device_num_threads | Specifies the maximum number of OpenMP threads to use for AO computations on a specific Intel Xeon Phi coprocessor on the system. Here <number> is an integer ranging from 0 to the largest number of an Intel Xeon Phi coprocessor on the system, with the maximum of 31. For example, if the system has two Intel Xeon Phi coprocessors, <number> can be 0 or 1. | An integer greater than 0. |
OFFLOAD_REPORT | mkl_mic_set_offload_report | OFFLOAD_REPORT is a common setting for Intel MKL and Intel® Compilers. It specifies the profiling report level for any offload, including Intel MKL AO. For more information, refer to the Intel® Compiler User and Reference Guides. Note that the mkl_mic_set_offload_report function enables you to turn profile reporting on/off at run time but does not change the reporting level. | An integer ranging from 0 to 2: 0 - No reporting, default. 1 - The report includes: 2 - In addition to the above information, the report includes: |
LD_LIBRARY_PATH | None | Specifies the search path for host-side dynamic libraries. | Must contain the path to host-side Intel MIC Platform Software Stack libraries used by Intel MKL. The default path is /opt/intel/mic/coi/host-linux-release/lib. |
MIC_LD_LIBRARY_PATH | None | Specifies the search path for coprocessor-side dynamic libraries. | Must contain: |
MKL_MIC_THRESHOLDS_?GEMM | None | Specifies matrix size thresholds for ?GEMM computations in the AO mode. | Three comma-separated integers: M, N, K. If this environment variable is set, any call to a ?GEMM function with problem sizes M1, N1, and K1 tries to offload computations only if M1>M, N1>N, and K1>K. This setting is only a hint, and Intel MKL may decide to not offload computations depending on the problem size and environment. Example: To set the thresholds to M=2000, N=1000, K=500 for DGEMM, set MKL_MIC_THRESHOLDS_DGEMM=2000,1000,500. |
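
The value rules for OFFLOAD_DEVICES and MKL_MIC_THRESHOLDS_?GEMM in the table can be modeled in a few lines. The Python sketch below only illustrates the documented rules; it is not Intel MKL's own parser. The function names `resolve_offload_devices` and `passes_gemm_thresholds` are hypothetical, and the treatment of surrounding whitespace, empty tokens, and repeated numbers is an assumption, because the table does not specify it.

```python
import re

_INTEGER = re.compile(r"[+-]?[0-9]+")


def resolve_offload_devices(value, num_coprocessors):
    """Return physical coprocessor numbers; list position = logical number.

    value is the OFFLOAD_DEVICES string, or None if the variable is not set.
    """
    # Coprocessor numbers never exceed 31, so at most 32 devices are visible.
    all_devices = list(range(min(num_coprocessors, 32)))
    if value is None:
        return all_devices              # not set: use every coprocessor
    if value.strip() == "":
        return []                       # empty value: AO is disabled
    devices = []
    for token in value.split(","):
        token = token.strip()           # assumption: whitespace is tolerated
        if _INTEGER.fullmatch(token) is None:
            return all_devices          # non-integer data: whole list ignored
        number = int(token)
        if 0 <= number < len(all_devices):
            devices.append(number)      # out-of-range values are skipped
    return devices                      # assumption: repeats are kept as given


def passes_gemm_thresholds(m1, n1, k1, thresholds):
    """True if a ?GEMM call of size (m1, n1, k1) exceeds thresholds 'M,N,K'.

    Passing is only a precondition: the library may still choose not to offload.
    """
    m, n, k = (int(part) for part in thresholds.split(","))
    return m1 > m and n1 > n and k1 > k
```

With four coprocessors, `resolve_offload_devices("1,3", 4)` gives a list whose positions 0 and 1 hold the physical numbers 1 and 3, which matches the renumbering described for OFFLOAD_DEVICES.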
Settings specified by the functions take precedence over the settings specified by the respective environment variables.
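
To make the precedence rule concrete, the following Python sketch resolves an effective memory limit from two sources. It is a model under stated assumptions, not library code: `parse_memory_size_kb` and `effective_max_memory_kb` are hypothetical names, the `function_value_kb` argument merely stands in for a value previously set through mkl_mic_set_max_memory, and returning `None` for a malformed string is an assumption. The binary unit factors follow from the table's statement that 4096M and 4G denote the same amount.

```python
# Unit suffixes from the table, expressed in kilobytes (1G = 1024M).
_UNITS_IN_KB = {"K": 1, "M": 1024, "G": 1024 ** 2, "T": 1024 ** 3}


def parse_memory_size_kb(text):
    """Convert a size such as '4096M' or '4G' to kilobytes.

    Returns None for a malformed value (assumed behavior, not documented).
    """
    text = text.strip().upper()  # assumption: case and whitespace are tolerated
    if len(text) < 2 or text[-1] not in _UNITS_IN_KB:
        return None
    digits = text[:-1]
    if not digits.isdecimal():
        return None
    return int(digits) * _UNITS_IN_KB[text[-1]]


def effective_max_memory_kb(env, function_value_kb=None):
    """Pick the limit in effect: a function-set value wins over the environment.

    env is a mapping such as os.environ; function_value_kb models an earlier
    support-function call and is None when no such call was made.
    """
    if function_value_kb is not None:
        return function_value_kb
    raw = env.get("MKL_MIC_MAX_MEMORY")
    if raw is None:
        return None              # neither source specifies a limit
    return parse_memory_size_kb(raw)
```

Passing `os.environ` as `env` applies the same logic to the real process environment.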
Intel MKL interprets the values of MKL_HOST_WORKDIVISION, MKL_MIC_WORKDIVISION, and MKL_MIC_<number>_WORKDIVISION as guidance for dividing work between the host CPU and the coprocessors, but the library may choose a different work division if necessary.
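
The documented value handling for the work-division variables (negative values ignored, values above 1 treated as 1.0) can be sketched as follows. This Python model only collects the requested fractions; it does not attempt to reproduce how the library finally splits the work, which the text above says may differ. The function names and the target labels `host`, `mic:all`, and `mic:<number>` are hypothetical, and skipping unparsable strings is an assumption.

```python
import re

_PER_DEVICE = re.compile(r"MKL_MIC_([0-9]+)_WORKDIVISION")


def normalize_workdivision(value):
    """Return the requested fraction, or None if the value is ignored."""
    if not value >= 0.0:         # negative (or NaN) values are ignored
        return None
    return min(value, 1.0)       # values greater than 1 are treated as 1.0


def requested_workdivision(env):
    """Collect work-division requests from a mapping of environment variables."""
    requests = {}
    for name, raw in env.items():
        if name == "MKL_HOST_WORKDIVISION":
            target = "host"
        elif name == "MKL_MIC_WORKDIVISION":
            target = "mic:all"
        else:
            match = _PER_DEVICE.fullmatch(name)
            if match is None:
                continue         # unrelated variable
            number = int(match.group(1))
            if number > 31:
                continue         # coprocessor numbers stop at 31
            target = f"mic:{number}"
        try:
            fraction = normalize_workdivision(float(raw))
        except ValueError:
            continue             # assumption: unparsable values are skipped
        if fraction is not None:
            requests[target] = fraction
    return requests
```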
For LAPACK routines, setting the fraction of work to any value other than 0.0 enables the specified processor for AO mode. However, Intel MKL LAPACK does not use the specified value to divide the workload. For example, setting the fraction to 0.5 has the same effect as setting the fraction to 1.0.
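
The LAPACK rule reduces to a zero versus nonzero test on each fraction. A minimal Python sketch, in which the helper name `lapack_enabled_targets` and the target labels are hypothetical and chosen for illustration only:

```python
def lapack_enabled_targets(fractions):
    """Return the targets enabled for LAPACK AO.

    fractions maps a target label to its requested work fraction. Only the
    distinction between 0.0 and any other value matters for LAPACK.
    """
    return {target for target, fraction in fractions.items() if fraction != 0.0}


# A fraction of 0.5 enables a target exactly as 1.0 does.
half = lapack_enabled_targets({"host": 0.5, "mic:0": 0.5, "mic:1": 0.0})
full = lapack_enabled_targets({"host": 1.0, "mic:0": 1.0, "mic:1": 0.0})
```

Both calls select the same targets, `host` and `mic:0`, which mirrors the 0.5 versus 1.0 example above.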