Intel® C++ Compiler 16.0 User and Reference Guide

Using Reducers - A Simple Example

This example illustrates the use of reducers to accumulate a sum in parallel. Consider the following serial program, which repeatedly calls a compute() function and accumulates the results in the total variable.

#include <iostream>

unsigned int compute(unsigned int i)
{
   return i; // return a value computed from i
}

int main(int argc, char* argv[])
{
   unsigned long long int n = 1000000;
   unsigned long long int total = 0;

   // Compute the sum of integers 1..n
   for(unsigned int i = 1; i <= n; ++i)
   {
     total += compute(i);
   }

   // the sum of the first n integers should be n * (n+1) / 2
   unsigned long long int correct = (n * (n+1)) / 2;

   if (total == correct)
     std::cout << "Total (" << total
               << ") is correct" << std::endl;
   else
     std::cout << "Total (" << total
               << ") is WRONG, should be "
               << correct << std::endl;
   return 0;
}
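As a quick check of the closed form used in the program above, the following minimal sketch evaluates n * (n+1) / 2 at compile time. The helper name expected_sum is hypothetical and is not part of the original program. For n = 1000000 the result is 500000500000, which does not fit in a 32-bit unsigned integer; that is why total and correct are declared as unsigned long long int.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not in the original program):
// closed form for the sum of the integers 1..n.
constexpr unsigned long long int expected_sum(unsigned long long int n)
{
   return (n * (n + 1)) / 2;
}

// The value the example program compares against for n = 1000000.
static_assert(expected_sum(1000000) == 500000500000ULL,
              "closed form for n = 1000000");

// The sum is larger than any 32-bit unsigned value can hold.
static_assert(expected_sum(1000000) >
                 std::numeric_limits<std::uint32_t>::max(),
              "sum needs more than 32 bits");
```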

Converting this program to an Intel® Cilk™ Plus program by changing the for loop to a cilk_for loop causes the iterations to run in parallel, but it also creates a data race on the total variable, because several strands may update it at the same time. To resolve the race, make total a reducer; specifically, a reducer<op_add>, which is defined for types that have an associative + operator. The changes are shown below.

#include <cilk/cilk.h>
#include <cilk/reducer_opadd.h>
#include <iostream>

unsigned int compute(unsigned int i)
{
   return i; // return a value computed from i
}

int main(int argc, char* argv[])
{
   unsigned long long int n = 1000000;
   cilk::reducer< cilk::op_add<unsigned long long int> > total (0);

   // Compute the sum of integers 1..n
   cilk_for(unsigned int i = 1; i <= n; ++i)
   {
     *total += compute(i);
   }

   // the sum of the first n integers should be n * (n+1) / 2
   unsigned long long int correct = (n * (n+1)) / 2;

   if (total.get_value() == correct)
     std::cout << "Total (" << total.get_value()
               << ") is correct" << std::endl;
   else
     std::cout << "Total (" << total.get_value()
               << ") is WRONG, should be "
               << correct << std::endl;
   return 0;
}

The following changes to the serial code show how to use a reducer:

  1. Include the appropriate reducer header file (cilk/reducer_opadd.h).

  2. Declare the reduction variable as a reducer< op_kind<TYPE> > rather than as a TYPE.

  3. Introduce parallelism, in this case by changing the for loop to a cilk_for loop.

  4. In the parallel code, change references to the original variable to dereferences of the reducer variable (*total).

  5. Retrieve the reducer's final value after all parallel strands have synchronized; in this case, after the cilk_for loop is complete (total.get_value()).
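To see why the steps above remove the race, it can help to picture each parallel strand updating its own private partial sum, with the partial sums combined once all strands have finished. The following is a minimal sketch of that idea in standard C++ using std::thread. It is an analogy only, not the Intel Cilk Plus runtime's implementation, and the function sum_with_private_views and its num_workers parameter are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Same placeholder as in the example: returns a value computed from i.
unsigned int compute(unsigned int i)
{
   return i;
}

// Hypothetical helper (not part of Intel Cilk Plus): sums compute(1..n)
// using num_workers threads, each with its own private partial sum.
unsigned long long int sum_with_private_views(unsigned int n,
                                              unsigned int num_workers)
{
   assert(num_workers > 0);

   // One "view" per worker, initialized to 0, the identity value for +.
   std::vector<unsigned long long int> views(num_workers, 0);
   std::vector<std::thread> workers;

   for (unsigned int w = 0; w < num_workers; ++w)
   {
      // Worker w handles the contiguous range [begin, end) of 1..n.
      const unsigned long long int begin =
         1 + (static_cast<unsigned long long int>(w) * n) / num_workers;
      const unsigned long long int end =
         1 + (static_cast<unsigned long long int>(w + 1) * n) / num_workers;

      workers.emplace_back([&views, w, begin, end]()
      {
         unsigned long long int local = 0; // private to this worker
         for (unsigned long long int i = begin; i < end; ++i)
         {
            local += compute(static_cast<unsigned int>(i));
         }
         views[w] = local; // each worker writes only its own slot
      });
   }

   // Wait for every worker before reading any view.
   for (std::thread& t : workers)
   {
      t.join();
   }

   // Combine the views in range order. Only the grouping of the
   // additions differs from the serial loop, so an associative + suffices.
   unsigned long long int total = 0;
   for (unsigned long long int v : views)
   {
      total += v;
   }
   return total;
}
```

Under these assumptions, calling sum_with_private_views(1000000, 4) produces the same total as the serial loop. No two threads ever write the same variable, and the final value is read only after every thread has been joined, which mirrors step 5.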

Note

Reducers are C++ class objects that cannot be copied or assigned.
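One practical consequence of this note is that a function cannot take such an object by value; it has to receive a reference or a pointer instead. The sketch below shows the pattern with a stand-in class in standard C++. The class sum_holder and the function accumulate_range are hypothetical and are not the Intel Cilk Plus reducer class; they only mimic the restriction on copying and assignment.

```cpp
#include <type_traits>

// Hypothetical stand-in for a reducer (not the Cilk Plus class):
// copy construction and copy assignment are deleted.
class sum_holder
{
public:
   explicit sum_holder(unsigned long long int initial) : value_(initial) {}
   sum_holder(const sum_holder&) = delete;
   sum_holder& operator=(const sum_holder&) = delete;

   void add(unsigned long long int x) { value_ += x; }
   unsigned long long int get_value() const { return value_; }

private:
   unsigned long long int value_;
};

// Takes the object by reference. Declaring the parameter as
// "sum_holder total" would not compile, because that requires a copy.
void accumulate_range(sum_holder& total, unsigned int first, unsigned int last)
{
   // Adds the integers first..last (inclusive) to total.
   for (unsigned int i = first; i <= last; ++i)
   {
      total.add(i);
   }
}

// The compiler confirms that the type can be neither copied nor assigned.
static_assert(!std::is_copy_constructible<sum_holder>::value,
              "sum_holder must not be copy constructible");
static_assert(!std::is_copy_assignable<sum_holder>::value,
              "sum_holder must not be copy assignable");
```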