Intel® Advisor Help
You can enable multiple functions to run in parallel as two or more tasks. The statements to run in parallel are not limited to function calls (see the help topic Data and Task Parallelism).
When the outermost statements in the annotation site have been placed into tasks, as shown in the following serial example, it is easy to execute them in parallel.
    ANNOTATE_SITE_BEGIN(sitename);
        ANNOTATE_TASK_BEGIN(task1);
            statement-1
        ANNOTATE_TASK_END();
        ANNOTATE_TASK_BEGIN(task2);
            statement-2
        ANNOTATE_TASK_END();
        ANNOTATE_TASK_BEGIN(task3);
            statement-3
        ANNOTATE_TASK_END();
    ANNOTATE_SITE_END();
With Intel Cilk Plus, the unit of parallel work is either a function call or a loop iteration.
If statement-1, statement-2, and statement-3 are function calls (function-1, function-2 and function-3), start by prefacing all but the last function call with the _Cilk_spawn keyword, and add a _Cilk_sync afterwards:
    _Cilk_spawn function-1();
    _Cilk_spawn function-2();
    function-3();
    _Cilk_sync;
The _Cilk_sync forces execution to wait until all spawned calls complete. When the last call is immediately followed by a _Cilk_sync, spawning it gains nothing because the caller would wait for it right away, so omit the _Cilk_spawn keyword on that call (as shown above).
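The spawn-and-sync pattern above can be sketched as compilable code. This is a minimal illustration under stated assumptions, not the definitive form: the names function_1, function_2, function_3, and run_tasks are placeholders (the hyphenated names in the text are not valid identifiers), and SPAWN and SYNC are hypothetical helper macros, not part of Intel Cilk Plus. The sketch assumes the compiler predefines `__cilk` when Intel Cilk Plus is enabled; on any other compiler the macros expand to nothing and the code runs serially with the same result.

```cpp
// Hypothetical helper macros (not part of Intel Cilk Plus). They map to the
// keywords when the compiler predefines __cilk (assumed to signal Cilk Plus
// support) and to nothing otherwise, so the same source also runs serially.
#ifdef __cilk
#  define SPAWN _Cilk_spawn
#  define SYNC  _Cilk_sync
#else
#  define SPAWN
#  define SYNC
#endif

// Three independent tasks. Each writes only to its own output variable,
// so running them concurrently does not introduce a data race.
static void function_1(int& out) { out = 1; }
static void function_2(int& out) { out = 20; }
static void function_3(int& out) { out = 300; }

int run_tasks() {
    int a = 0, b = 0, c = 0;
    SPAWN function_1(a);  // may run in parallel with the code that follows
    SPAWN function_2(b);  // may run in parallel with the code that follows
    function_3(c);        // last call: no spawn, the sync comes right after
    SYNC;                 // wait until function_1 and function_2 complete
    return a + b + c;     // a and b are safe to read only after the sync
}
```

Calling run_tasks() returns 321 whether the calls run in parallel or serially, because each task touches only its own variable.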
If any of the statements are not already function calls, wrap those non-function-call statements (except the last one) in a C++11 lambda expression to create the finished parallel code:
    _Cilk_spawn [&]{statement-1}();
    _Cilk_spawn [&]{statement-2}();
    statement-3
    _Cilk_sync;
The () after each lambda expression creates a function call. This is needed because _Cilk_spawn requires a function call.
A variable used inside a lambda expression but declared outside it is said to be captured. The [&] in the example specifies capture by reference. It is also possible to capture by value with [=], or to capture different variables in different ways. See the compiler documentation on lambda expressions for details.
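The capture modes can be compared in plain standard C++11, without any Cilk keywords. The following is a small sketch; CaptureResult and capture_demo are hypothetical names chosen for illustration. It also shows the trailing () that turns a lambda expression into a function call.

```cpp
// Values observed after running the lambdas below.
struct CaptureResult {
    int counter;  // final value of the outer variable
    int seen;     // value returned by the by-value lambda
    int total;    // value written through a by-reference capture
};

CaptureResult capture_demo() {
    int counter = 10;

    // [&]: counter is captured by reference, so the update is visible
    // outside the lambda. The trailing () calls the lambda immediately.
    [&]{ counter += 1; }();                  // counter is now 11

    // [=]: the lambda copies counter (11) at the point where it is created.
    auto snapshot = [=]{ return counter; };
    counter = 50;                            // does not affect the copy
    int seen = snapshot();                   // still 11

    // Mixed: everything by value, except total, which is by reference.
    int total = 0;
    [=, &total]{ total = counter * 2; }();   // total becomes 100

    return {counter, seen, total};
}
```

Capture by reference is what lets a spawned lambda store its result in a variable of the enclosing function, but the referenced variable must remain alive until the work completes.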
Instead of using the _Cilk_spawn and _Cilk_sync keywords directly, you can improve readability by adding #include <cilk/cilk.h> to your program and using the cilk_spawn and cilk_sync keywords instead.
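A short sketch of the cilk_spawn and cilk_sync spelling, combined with a lambda-wrapped statement, might look as follows. The functions sum_range and parallel_sum are hypothetical examples. The `#else` branch is an assumption added for portability: when the compiler does not predefine `__cilk`, the two keywords are defined as empty macros so that the code builds and runs serially.

```cpp
#ifdef __cilk
#  include <cilk/cilk.h>  // provides cilk_spawn and cilk_sync
#else
// Portability fallback (an assumption, not part of Intel Cilk Plus):
// with empty macros the program runs serially and gives the same result.
#  define cilk_spawn
#  define cilk_sync
#endif

#include <cstddef>
#include <vector>

// Serial helper: sums the elements in the half-open range [begin, end).
static long sum_range(const std::vector<int>& v,
                      std::size_t begin, std::size_t end) {
    long sum = 0;
    for (std::size_t i = begin; i < end; ++i) sum += v[i];
    return sum;
}

// Sums the two halves of v as two tasks, then combines the results.
long parallel_sum(const std::vector<int>& v) {
    const std::size_t mid = v.size() / 2;
    long left = 0, right = 0;

    // The assignment is not a function call, so it is wrapped in a lambda;
    // [&] lets the lambda write to left in the enclosing function.
    cilk_spawn [&]{ left = sum_range(v, 0, mid); }();

    right = sum_range(v, mid, v.size());  // last statement: not spawned
    cilk_sync;                            // left is valid only after this
    return left + right;
}
```

The two tasks write to different variables (left and right) and only read v, so they do not share writable data.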