Intel® Fortran Compiler 16.0 User and Reference Guide
This topic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
The OpenMP* runtime has the following default on Intel® MIC Architecture:
The OpenMP* runtime creates four threads per core by default under the coprocessor OS.
If you do not specify the number of threads, the OpenMP* runtime uses the number of logical processors available in the affinity mask of the thread that first enters a parallel region. In the offload environment, this mask is normally set to exclude the last physical core in the machine, which is where logical processors 0, N-3, N-2, and N-1 run (N being the total number of logical processors). Therefore, on a coprocessor with 32 cores, the runtime defaults to creating 124 OpenMP* threads for offloaded code: 31 cores times four threads per core.
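The thread counts quoted in this topic follow from simple arithmetic. The sketch below is only an illustration of that arithmetic; the function names total_hw_threads and offload_default_threads are hypothetical helpers, not part of any Intel tool, and the value of four threads per core is taken from the default stated above.

```sh
#!/bin/sh
# Illustration only: the helper names are hypothetical, not Intel tools.
threads_per_core=4   # hardware threads per core, as stated in this topic

# All hardware threads on a coprocessor with the given number of cores.
total_hw_threads() {
    echo $(( $1 * threads_per_core ))
}

# Default for offloaded code: the last physical core is excluded
# from the affinity mask, so one core's worth of threads is left out.
offload_default_threads() {
    echo $(( ($1 - 1) * threads_per_core ))
}

echo "all hardware threads: $(total_hw_threads 32)"
echo "offload default:      $(offload_default_threads 32)"
```

For a 32-core coprocessor, the two helpers yield 128 and 124, matching the figures given in this topic.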
To override this behavior and explicitly use all 128 threads to run offloaded code, specify an explicit affinity with the norespect modifier in the affinity specification. This forces the runtime to ignore the affinity mask it was given.
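One possible way to supply such an affinity specification from the host is through the environment, as in the minimal sketch below. It makes several assumptions that this topic does not state: that "explicit affinity" refers to the explicit affinity type with a proclist, that the coprocessor has 32 cores with logical processors numbered 0 through 127, and that the MIC_ENV_PREFIX mechanism is used to forward the setting to the coprocessor as KMP_AFFINITY. The granularity=fine modifier is likewise an illustrative choice.

```sh
#!/bin/sh
# Sketch under assumptions: a 32-core coprocessor (logical processors 0-127)
# and environment forwarding through MIC_ENV_PREFIX.

# Forward MIC_-prefixed host variables to the coprocessor with the
# prefix stripped, so MIC_KMP_AFFINITY arrives there as KMP_AFFINITY.
export MIC_ENV_PREFIX=MIC

# norespect: ignore the affinity mask the runtime was given.
# proclist + explicit: bind threads to the listed logical processors.
export MIC_KMP_AFFINITY="norespect,granularity=fine,proclist=[0-127],explicit"

echo "MIC_KMP_AFFINITY=$MIC_KMP_AFFINITY"
```

Adjust the proclist range to the number of logical processors on the actual coprocessor, and check the affinity settings against the thread affinity documentation for the compiler version in use.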
When code executes natively on the coprocessor, no initial affinity mask is set, and the OpenMP* runtime defaults to using all available hardware threads. For example, it uses 128 threads on a 32-core coprocessor.