Intel® C++ Compiler 16.0 User and Reference Guide
Compared with serial programs, parallel programs raise additional performance considerations and offer further opportunities for tuning.
In general, the Intel® Cilk™ Plus runtime uses processor resources efficiently through a scheduling algorithm called work stealing. The work-stealing algorithm is designed to minimize the number of times that work is moved from one processor to another.
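The idea behind work stealing can be illustrated with a small, deterministic simulation in standard C++. This is only a sketch under stated assumptions, not the Intel® Cilk™ Plus runtime's implementation: the names Task, SimResult, and simulate_work_stealing are hypothetical, the workers are simulated in a single thread in round-robin order, and an idle worker picks the first non-empty victim instead of a random one. Each worker takes the newest task from its own deque, and only a worker with nothing to do moves a task (the oldest one) from another worker.

```cpp
#include <deque>
#include <vector>

// Hypothetical task: a half-open range [lo, hi) of unit work items.
struct Task { int lo; int hi; };

// Hypothetical result: items executed and tasks moved between workers.
struct SimResult { int executed; int steals; };

// Simulate `workers` workers processing `n` unit work items.
// All work starts on worker 0; other workers obtain work only by stealing.
SimResult simulate_work_stealing(int workers, int n) {
    SimResult r{0, 0};
    if (workers <= 0 || n <= 0) return r;

    std::vector<std::deque<Task>> deques(workers);
    deques[0].push_back({0, n});

    while (r.executed < n) {
        for (int w = 0; w < workers; ++w) {
            std::deque<Task>& own = deques[w];
            if (own.empty()) {
                // Idle worker: steal the OLDEST task of the first
                // non-empty victim (a real scheduler would typically
                // choose the victim at random).
                for (int k = 1; k < workers; ++k) {
                    std::deque<Task>& victim = deques[(w + k) % workers];
                    if (!victim.empty()) {
                        own.push_back(victim.front());
                        victim.pop_front();
                        ++r.steals;
                        break;
                    }
                }
                if (own.empty()) continue;  // nothing to steal this round
            }
            // Busy worker: take the NEWEST task from its own deque.
            Task t = own.back();
            own.pop_back();
            if (t.hi - t.lo == 1) {
                ++r.executed;  // a single work item: "run" it
            } else {
                int mid = t.lo + (t.hi - t.lo) / 2;
                own.push_back({mid, t.hi});  // older half, exposed to thieves
                own.push_back({t.lo, mid});  // newer half, runs next locally
            }
        }
    }
    return r;
}
```

Calling simulate_work_stealing(1, 16) performs no steals, because a single worker never runs out of local work. With more workers, a task moves only when a worker becomes idle, and the thief takes the oldest task, which in this sketch is the largest remaining range, so one steal transfers a large amount of work.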
The algorithm also ensures that space use grows at most linearly with the number of workers. In other words, an Intel® Cilk™ Plus program running on N workers uses no more than N times the memory it uses when running on one worker.
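The space bound can be written as a single inequality. The symbols are assumptions introduced here for illustration: S_1 is the memory the program uses when running on one worker, and S_N is the memory it uses when running on N workers.

```latex
S_N \le N \cdot S_1
```

For example, a hypothetical program that needs 2 MB on one worker would need at most 8 × 2 MB = 16 MB on eight workers.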