Intel® Math Kernel Library 11.3 Update 4 Developer Guide
Use the following techniques to specify the number of OpenMP threads to use in Intel MKL:
A call to the mkl_set_num_threads or mkl_domain_set_num_threads function changes the number of OpenMP threads available to all in-progress calls (in concurrent threads) and to all future calls to Intel MKL. Such a call may slow Intel MKL performance and/or cause run-time tools, such as Intel® Inspector, to report race conditions.
To avoid such situations, use the mkl_set_num_threads_local function (see the "Support Functions" section in the Intel MKL Developer Reference for the function description).
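The thread-local approach can be sketched in C as follows. This is a minimal illustration, not code from the guide: run_with_local_threads and sample_work are hypothetical names introduced here. The sketch relies on mkl_set_num_threads_local returning the previous thread-local setting, with 0 meaning that the global setting applies. The fallback definition in the #ifndef branch is only an assumed stand-in so that the fragment compiles where mkl.h is not installed; it is not part of Intel MKL.

```c
#include <assert.h>

#if defined(__has_include)
#  if __has_include(<mkl.h>)
#    include <mkl.h>          /* real Intel MKL declarations */
#    define HAVE_MKL 1
#  endif
#endif

#ifndef HAVE_MKL
/* ASSUMPTION: stand-in used only when mkl.h is not found. It mimics the
   contract relied on here: store a per-thread value and return the
   previous one (0 = no thread-local setting, use the global one). */
static _Thread_local int local_nt = 0;
static int mkl_set_num_threads_local(int nt) {
    int prev = local_nt;
    local_nt = nt;
    return prev;
}
#endif

/* Hypothetical callback type for the work that calls Intel MKL. */
typedef void (*work_fn)(void *arg);

/* Hypothetical helper: run `work` with a thread-local thread count, then
   restore the setting the calling thread had before. Other threads and
   their in-progress Intel MKL calls are not affected.
   Returns the thread-local setting that was in effect on entry. */
int run_with_local_threads(int nt, work_fn work, void *arg) {
    assert(nt >= 0);                           /* 0 resets to the global setting */
    int prev = mkl_set_num_threads_local(nt);  /* affects this thread only */
    work(arg);                                 /* Intel MKL calls go here */
    mkl_set_num_threads_local(prev);           /* restore the previous setting */
    return prev;
}

/* Hypothetical placeholder for real work: counts how often it was called. */
void sample_work(void *arg) {
    int *calls = (int *)arg;
    ++*calls;
}
```

A caller might write run_with_local_threads(2, sample_work, &calls) from one application thread while other threads keep their own settings. Saving and restoring the returned value keeps nested uses of the helper from overwriting each other.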
When choosing the appropriate technique, take into account the following rules:
If you use the Intel TBB threading technology, read the documentation for the tbb::task_scheduler_init class at https://www.threadingbuildingblocks.org/documentation to find out how to specify the number of threads.