Intel® Advisor Help

About Replacing Annotations with Intel® Cilk™ Plus Code

This topic explains the steps needed to implement the parallelism proposed by the Intel Advisor annotations by adding Intel® Cilk™ Plus parallel framework code.

This is the recommended order of tasks for replacing the annotations with Intel Cilk Plus code:

  1. Add appropriate synchronization of shared resources, using LOCK annotations as a guide.
  2. Test to verify that the program still behaves correctly before you introduce the possibility of non-deterministic behavior with parallel tasks.
  3. Add code to create Intel Cilk Plus tasks or loops, using the SITE/TASK annotations as a guide.
  4. Test with one thread, to verify that your program still works correctly. For example, set the environment variable CILK_NWORKERS to 1 before you run your program.
  5. Test with more than one thread to see that the multithreading works as expected.
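Steps 1 and 2 can be sketched in standard C++ as follows. This is a minimal illustration, not the only approach: it assumes a `std::mutex` stands in for the lock (this topic mentions Intel Cilk Plus reducers or Intel TBB mutexes as the usual choices), and `SharedTotal`, `sum_serial`, and `check_serial_behavior` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical shared resource: a running total that parallel tasks will later update.
struct SharedTotal {
    long value = 0;
    std::mutex guard;  // assumption: takes the place of the LOCK_ACQUIRE/LOCK_RELEASE annotations

    void add(long amount) {
        std::lock_guard<std::mutex> lock(guard);  // acquired here, released at scope exit
        value += amount;
    }
};

// The program is still serial at this point, so the result must match
// what the original, unsynchronized code produced.
long sum_serial(const std::vector<long>& data) {
    SharedTotal total;
    for (long x : data) {
        total.add(x);
    }
    return total.value;
}

// Step 2: verify nothing broke before any parallel tasks are introduced.
void check_serial_behavior() {
    assert(sum_serial({1, 2, 3, 4}) == 10);
    assert(sum_serial({}) == 0);
}
```

Adding the lock while the program is still serial keeps any test failure attributable to the synchronization change itself rather than to task scheduling.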

Intel Cilk Plus creates worker threads automatically. In general, you should concern yourself only with the tasks, and leave it to the framework to create and destroy the worker threads. If needed, you can set the CILK_NWORKERS environment variable to specify the number of workers and let Intel Cilk Plus create the threads as needed.

If you do need some control over creation and destruction of worker threads, see the Intel Cilk Plus documentation.

The listings below pair serial, annotated program code with the equivalent Intel Cilk Plus parallel code for some typical code to which parallelism can be applied. In each pair, the serial code with Intel Advisor annotations comes first and the parallel code using Intel Cilk Plus follows it.

Serial Code with Intel Advisor Annotations:

  // Synchronization
  ANNOTATE_LOCK_ACQUIRE();
    Body();
  ANNOTATE_LOCK_RELEASE();

Parallel Code using Intel Cilk Plus:

  // Locking can use Intel Cilk Plus
  // reducers or mutexes, such as Intel TBB mutexes.

Serial Code with Intel Advisor Annotations:

  // Parallelize data - one task with
  // a counted loop
  ANNOTATE_SITE_BEGIN(site);
    for (i = 0; i < N; ++i) {
      ANNOTATE_ITERATION_TASK(task);
        statement;
    }
  ANNOTATE_SITE_END();

Parallel Code using Intel Cilk Plus:

  // Parallelize data - one task, counted loops
  #include <cilk/cilk.h>
    ...
    cilk_for (i = 0; i < N; ++i) {
      statement;
    }

Serial Code with Intel Advisor Annotations:

  // Parallelize functions
  ANNOTATE_SITE_BEGIN(site);
    ANNOTATE_TASK_BEGIN(task1);
      function_1();
    ANNOTATE_TASK_END();
    ANNOTATE_TASK_BEGIN(task2);
      function_2();
    ANNOTATE_TASK_END();
  ANNOTATE_SITE_END();

Parallel Code using Intel Cilk Plus:

  // Parallelize functions
  #include <cilk/cilk.h>
   ...
   cilk_spawn function_1();
   function_2();
   cilk_sync;
   // Or, if statements are not already function calls,
   // wrap them in a lambda expression:
   cilk_spawn [&]{ statement_1; }();
   statement_2;
   cilk_sync;
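The parallel fragments above can be assembled into one compilable unit, sketched below under stated assumptions. The `__cilk` guard assumes the compiler predefines that macro when Intel Cilk Plus is enabled. When it is absent, the three keywords are defined away so the code runs as ordinary serial C++, which is also a convenient way to check that the converted program still computes the same results. The helper names (`square_all`, `run_both`, `check_results`) and the bodies of `function_1` and `function_2` are hypothetical.

```cpp
#include <cassert>
#include <vector>

#ifdef __cilk
#include <cilk/cilk.h>  // Intel Cilk Plus keywords
#else
// Assumption: no Cilk Plus support available, so fall back to plain serial code.
#define cilk_spawn
#define cilk_sync
#define cilk_for for
#endif

// Hypothetical stand-ins for function_1 and function_2; each writes only its own output.
static void function_1(int& out) { out = 1; }
static void function_2(int& out) { out = 2; }

// Counted loop: every iteration writes a distinct element, so no lock is needed.
std::vector<int> square_all(int n) {
    std::vector<int> result(n);
    cilk_for (int i = 0; i < n; ++i) {
        result[i] = i * i;
    }
    return result;
}

// Two independent calls: the spawned call may run in parallel with the second one.
int run_both() {
    int a = 0, b = 0;
    cilk_spawn function_1(a);
    function_2(b);
    cilk_sync;  // both results are ready after this point
    return a + b;
}

// Same checks apply whether the keywords are real or defined away.
void check_results() {
    std::vector<int> squares = square_all(4);
    assert(squares[0] == 0 && squares[3] == 9);
    assert(run_both() == 3);
}
```

Because each loop iteration and each spawned function touches only its own data, this sketch needs no locking; code that updates shared state would still need the synchronization added in step 1.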
