Intel® C++ Compiler 16.0 User and Reference Guide
When developing your program, be aware of the following safety, correctness, and performance considerations.
The Reducer and the View
A reducer manages instances of a view; code in a parallel reduction computation updates the view instance for the current parallel strand. The current view instance is accessed by dereferencing the reducer: *r = *r OP a.
In other words, you can think of the reducer as being like a (smart) pointer to the view.
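The relationship between a reducer and its views can be sketched in standard C++. This is a toy model only: ToyAddReducer, enter_strand(), reduce(), and sum_with_views() are hypothetical names introduced for illustration, the "strands" are simulated one after another on a single thread, and none of this is the Cilk Plus implementation. It only shows the idea that each strand updates its own view through *r and that the views are later combined with the reduction operation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: a hypothetical class, not the Cilk Plus reducer API.
// Each simulated strand owns one view, which starts at the identity (0).
class ToyAddReducer {
public:
    explicit ToyAddReducer(std::size_t strands)
        : views_(strands, 0L), current_(0) {
        assert(strands > 0);
    }

    // Select which strand's view is current. A real runtime would do this
    // implicitly; here it is an explicit, assumed helper.
    void enter_strand(std::size_t s) { current_ = s; }

    // Dereferencing yields the current strand's view, like a smart pointer.
    long& operator*() { return views_[current_]; }

    // Combine all views in strand order using the reduction operation (+).
    long reduce() const {
        long total = 0;
        for (long v : views_) total += v;
        return total;
    }

private:
    std::vector<long> views_;
    std::size_t current_;
};

// Sum data by splitting it into contiguous chunks, one per simulated strand.
long sum_with_views(const std::vector<long>& data, std::size_t strands) {
    ToyAddReducer r(strands);
    const std::size_t chunk = (data.size() + strands - 1) / strands;
    for (std::size_t s = 0; s < strands; ++s) {
        r.enter_strand(s);
        const std::size_t begin = s * chunk;
        const std::size_t end = std::min(data.size(), begin + chunk);
        for (std::size_t i = begin; i < end; ++i) {
            *r = *r + data[i];  // the update form "*r = *r OP a"
        }
    }
    return r.reduce();
}
```

For the integers 1 through 100, sum_with_views returns 5050 regardless of the number of simulated strands, because integer addition is associative and each strand touches only its own view.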
Safety
If a properly written reducer is used as described above, no data races will occur between accesses to the reducer's view in different parallel strands.
Floating-Point Operations
Floating-point arithmetic operations are not strictly associative because of overflow, underflow, and loss of precision resulting from the way that floating-point numbers are represented in a computer. Any optimization that reassociates floating-point arithmetic, including the use of reducers, can therefore change the computed result.
Before using a floating-point reducer, you should carefully consider the characteristics of your data and what the possible effects on the results may be.
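The effect of reassociation can be seen with three values and no parallelism at all. The sketch below assumes IEEE 754 double precision and that value-changing floating-point optimizations (for example, a fast-math style option) are disabled; sum_left() and sum_right() are hypothetical helper names.

```cpp
#include <cassert>

// Group the sum as (a + b) + c, the order a serial left-to-right loop uses.
double sum_left(double a, double b, double c) { return (a + b) + c; }

// Group the same values as a + (b + c), one possible reassociation.
double sum_right(double a, double b, double c) { return a + (b + c); }
```

With a = 1e20, b = -1e20, and c = 1.0, sum_left returns 1.0 but sum_right returns 0.0: in the second grouping, 1.0 is too small relative to -1e20 to change it, so it is lost before the large values cancel. A floating-point reducer may group partial sums differently from run to run, so results can vary in the same way.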
Initial and Final Values
You can specify an initial value for a reducer with an argument to the reducer constructor, or you can set its initial value by calling the reducer set_value() function. If you do not give a reducer an initial value, its identity value will be used.
You can retrieve the final value of the computation by calling the reducer get_value() function after the computation is complete.
Alternatively, you can move the value of a variable into the reducer by calling the reducer move_in() function, which leaves the variable undefined, or move the value of the reducer out into a variable by calling the reducer move_out() function, which leaves the reducer value undefined.
get_value(), set_value(), move_in(), and move_out() always access the view instance of the current strand. You can call them in the middle of a parallel computation, but the results will almost always be meaningless. The best policy is to use set_value() and move_in() only to initialize a reducer, and get_value() and move_out() only to retrieve its final value.
Permitted Operations
A reducer computes the correct result in a parallel computation (in other words, the same result as the serial computation) because its reduction operation is associative. Reassociating a non-associative computation will lead to undesired results. Therefore, the only modifications that are permitted on a reducer's view are updates using its reduction operation (view = view OP value), or something that is semantically equivalent. (For example, permissible operations on an addition reducer include view = view + value, view += value (equivalent to view = view + value), view++ (equivalent to view = view + 1), and view = view - value (equivalent to view = view + (-value)).)
All library reducer class views enforce this restriction: view = view * value or view = 3 won't even compile if view is the view of an addition reducer. Custom reducer classes may or may not enforce similar restrictions. If you use a reducer class that does not syntactically restrict the operations you can perform on its view, then it is up to you to understand and follow its operator restrictions.
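As an illustration of how a view type might enforce such a restriction syntactically, here is a minimal sketch in standard C++. ToyAddView is a hypothetical class, not the library's view type, and for brevity it supports only the compound update forms rather than every form listed above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical view for an addition reducer; not the library's class.
// It exposes only updates that are equivalent to "view = view + value".
class ToyAddView {
public:
    ToyAddView() : value_(0) {}  // starts at the additive identity

    ToyAddView& operator+=(long v) { value_ += v; return *this; }
    ToyAddView& operator-=(long v) { value_ += -v; return *this; }
    ToyAddView& operator++() { value_ += 1; return *this; }

    // Assigning a plain value would discard earlier updates, so forbid it.
    // There is also no operator*=, so "view *= value" has no match either.
    ToyAddView& operator=(long) = delete;

    long get() const { return value_; }

private:
    long value_;
};

// Compile-time check that "view = 3" is rejected for this view type.
static_assert(!std::is_assignable<ToyAddView&, long>::value,
              "assignment from a plain value must not compile");
```

With this design, a misuse such as view = 3 is caught at compile time instead of silently producing a wrong reduction result. A custom view that omits such checks leaves that responsibility with the caller.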
Performance
When used judiciously, reducers can incur little or no runtime performance cost. However, there are some performance considerations to keep in mind:
The efficiency of reducers relies on the assumption that creating and merging view instances is cheap. If the identity and reduce operations do not have small constant execution times, then the performance benefits of parallel execution may be lost in the overhead of view management.
There is some overhead at a cilk_sync for every reducer in the strands being synced, regardless of whether new view instances were created for them. This can be important if you use many reducers (for example, a large array of reducers).
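To make the first consideration concrete, the sketch below contrasts two ways that a reduce operation might merge a right view into a left view when the views are sequences. It uses only the standard library; reduce_lists() and reduce_vectors() are hypothetical names, not library reducers.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Merge two list views. Splicing a whole list relinks its nodes, which
// takes constant time however many elements the right view holds.
void reduce_lists(std::list<int>& left, std::list<int>& right) {
    left.splice(left.end(), right);  // right is empty afterwards
}

// Merge two vector views. Every element of the right view is copied,
// so the cost grows linearly with the size of the right view.
void reduce_vectors(std::vector<int>& left, std::vector<int>& right) {
    left.insert(left.end(), right.begin(), right.end());
    right.clear();
}
```

Both functions produce the same concatenated sequence, but only the list version has the small constant merge time that the efficiency of reducers relies on. With the vector version, frequent merges of large views could outweigh the gains from parallel execution.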