Intel® C++ Compiler 16.0 User and Reference Guide
This topic describes C++ language features that help the compiler parallelize code.
Annotating a function with one of the following declarations guides the compiler to parallelize more loops and straight-line code. Specify either the cost(cycles) clause or the profitable clause:

    // Windows* OS
    __declspec(concurrency_safe(cost(cycles) | profitable))

    // Linux* OS
    __attribute__((concurrency_safe(cost(cycles) | profitable)))
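Because the annotation is spelled differently on each operating system, one possible way to keep source files portable is to hide it behind a macro. The sketch below is an assumption, not part of the compiler documentation: the macro names CONCURRENCY_SAFE_COST and CONCURRENCY_SAFE_PROFITABLE and the functions scale and sum_scaled are hypothetical, the cost of 50 cycles is only an illustrative guess, and the Linux form is written with the double parentheses that GNU-style attribute syntax requires. On compilers that do not define __INTEL_COMPILER the macros expand to nothing, so the code still builds.

```cpp
#include <cassert>

// Hypothetical helper macros (not part of the Intel documentation).
// They emit the concurrency_safe annotation only when __INTEL_COMPILER is
// defined, and expand to nothing on other compilers.
#if defined(__INTEL_COMPILER) && defined(_WIN32)
#  define CONCURRENCY_SAFE_COST(cycles) __declspec(concurrency_safe(cost(cycles)))
#  define CONCURRENCY_SAFE_PROFITABLE   __declspec(concurrency_safe(profitable))
#elif defined(__INTEL_COMPILER)
#  define CONCURRENCY_SAFE_COST(cycles) __attribute__((concurrency_safe(cost(cycles))))
#  define CONCURRENCY_SAFE_PROFITABLE   __attribute__((concurrency_safe(profitable)))
#else
#  define CONCURRENCY_SAFE_COST(cycles)
#  define CONCURRENCY_SAFE_PROFITABLE
#endif

// A function with no side effects: it reads its argument and writes nothing
// shared, so concurrent invocations cannot interfere with each other.
// The cost value 50 is an illustrative guess, not a measurement.
CONCURRENCY_SAFE_COST(50)
int scale(int x) {
  return 3 * x + 1;
}

// A loop whose iterations each call the annotated function.
int sum_scaled(const int* in, int n) {
  assert(n >= 0);
  int total = 0;
  for (int i = 0; i < n; ++i) {
    total += scale(in[i]);
  }
  return total;
}
```

With this arrangement the choice of spelling lives in one place, and switching a function from the cost clause to the profitable clause is a one-word change at the declaration.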
Using the concurrency_safe attribute tells the compiler that there are no unacceptable side effects and no illegal (or improperly synchronized) memory access interferences among multiple invocations of the annotated function, or between an invocation of the annotated function and other statements in the program, when they execute concurrently.
For every function that is annotated with the concurrency_safe attribute, it is your responsibility to ensure that its side effects (if any) are acceptable (or expected), and the memory access interferences are properly synchronized.
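As one illustration of what an acceptable, properly synchronized side effect can look like, the sketch below updates shared state only through an atomic counter. All names here (g_negatives, note_if_negative, count_negatives) are hypothetical, and the annotation appears only in a comment so that the code compiles with any C++11 compiler. Whether a particular compiler version actually parallelizes the loop is not something this sketch demonstrates.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Shared state touched by every call. std::atomic makes concurrent updates
// well defined without an explicit lock.
static std::atomic<long> g_negatives{0};

// Candidate for the concurrency_safe annotation, for example
// __declspec(concurrency_safe(profitable)) on Windows. Its only side effect
// is an atomic increment, and the final total does not depend on the order
// in which the calls run.
bool note_if_negative(float x) {
  if (x < 0.0f) {
    g_negatives.fetch_add(1, std::memory_order_relaxed);
    return true;
  }
  return false;
}

// A loop that calls the candidate function once per element.
long count_negatives(const float* data, std::size_t n) {
  assert(data != nullptr || n == 0);
  g_negatives.store(0);
  for (std::size_t i = 0; i < n; ++i) {
    note_if_negative(data[i]);
  }
  return g_negatives.load();
}
```

Had the counter been a plain long, concurrent calls would race on it, and annotating the function would break the promise the attribute makes to the compiler.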
The cost clause specifies the execution cost of the annotated function, in cycles, which the compiler uses for parallelization profitability analysis when it compiles the enclosing loops or blocks. The profitable clause indicates that the loops or blocks that contain calls to the annotated function are profitable to parallelize.
The value of cycles is a 2-byte unsigned integer (unsigned short), so its maximum value is 2^16-1 (65535). If the cycle count is greater than 2^16-1, use the profitable clause instead.
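The range rule above can be written as a small compile-time check. This is only a sketch: the names Clause, kMaxCostCycles, and choose_clause are hypothetical, and std::uint16_t stands in for the 2-byte unsigned short described above.

```cpp
#include <cstdint>
#include <limits>

// Which clause to write in the annotation (hypothetical helper type).
enum class Clause { Cost, Profitable };

// Largest cycle count a 2-byte unsigned integer can hold: 2^16 - 1.
constexpr unsigned long kMaxCostCycles =
    std::numeric_limits<std::uint16_t>::max();
static_assert(kMaxCostCycles == 65535UL, "2^16 - 1 is 65535");

// Use the cost clause while the estimate fits; otherwise fall back to the
// profitable clause, as the text recommends.
constexpr Clause choose_clause(unsigned long estimated_cycles) {
  return estimated_cycles <= kMaxCostCycles ? Clause::Cost
                                            : Clause::Profitable;
}

static_assert(choose_clause(200) == Clause::Cost, "fits in 16 bits");
static_assert(choose_clause(65535) == Clause::Cost, "largest valid cost");
static_assert(choose_clause(70000) == Clause::Profitable, "too large");
```

The helper only decides which clause applies. The annotation itself still has to be written by hand at the function declaration.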
The following example illustrates the use of this declaration.
Example using __declspec(concurrency_safe(cost(cycles) | profitable))

    #define N 10
    #define M 40
    #define NValue N

    #if defined(COSTLOW)
    // The function cost is ~5 cycles, so the loop calling "foo" will not be parallelized.
    __declspec(concurrency_safe(cost(5)))
    #elif defined(COSTHIGH)
    // The declared cost is 200 cycles, so the loop calling "foo" will be parallelized.
    __declspec(concurrency_safe(cost(200)))
    #elif defined(PROFITABLE)
    // The function is profitable to execute in parallel, so the loop calling "foo"
    // should be parallelized.
    __declspec(concurrency_safe(profitable))
    #endif

    __declspec(noinline)
    int foo(float A[], float B[]) {
      for (int i = 0; i < N; i++) {
        B[i] = A[i];
      }
      return N;
    }

    int testp(float A[], float B[], float* In[], float* Out[]) {
      int i, j;

      for (i = 0; i < M; i++) {
        foo(A, B);
        for (j = 0; j < N; j++) {
          Out[i][j] = In[i][j] + (NValue*j);
        }
      }
      return N;
    }

Compiling the example with each of the three macros defined produces the following parallelization reports:

    [C:/temp] icl -c -DCOSTLOW -Qparallel -Qpar-report2 -Qansi-alias v.cpp
    C:\temp\v.cpp(28): (col. 3) remark: loop was not parallelized: insufficient computational work.

    [C:/temp] icl -c -DCOSTHIGH -Qparallel -Qpar-report -Qansi-alias v.cpp
    C:\temp\v.cpp(28): (col. 3) remark: LOOP WAS AUTO-PARALLELIZED.

    [C:/temp] icl -c -DPROFITABLE -Qparallel -Qpar-report -Qansi-alias v.cpp
    C:\temp\v.cpp(28): (col. 3) remark: LOOP WAS AUTO-PARALLELIZED.