This example illustrates use of reducers in accumulating a sum in parallel. Consider the following serial program, which repeatedly calls a compute() function and accumulates the answers into the total variable.
#include <iostream>
unsigned int compute(unsigned int i)
{
return i; // return a value computed from i
}
int main(int argc, char* argv[])
{
unsigned int n = 1000000;
unsigned int total = 0;
// Compute the sum of integers 1..n
for(unsigned int i = 1; i <= n; ++i)
{
total += compute(i);
}
// the sum of the first n integers should be n * (n+1) / 2
unsigned int correct = (n * (n+1)) / 2;
if (total == correct)
std::cout << "Total (" << total
<< ") is correct" << std::endl;
else
std::cout << "Total (" << total
<< ") is WRONG, should be "
<< correct << std::endl;
return 0;
}
Converting this program to an Intel® Cilk™ Plus program and changing the for to a cilk_for causes the loop to run in parallel, but creates a data race on the total variable. To resolve the race, you can make total a reducer; specifically, a reducer_opadd, defined for types that have an associative + operator. The changes are shown below.
#include <cilk/cilk.h>
#include <cilk/reducer_opadd.h>
#include <iostream>
unsigned int compute(unsigned int i)
{
return i; // return a value computed from i
}
int main(int argc, char* argv[])
{
unsigned int n = 1000000;
cilk::reducer_opadd<unsigned int> total;
// Compute 1..n
cilk_for(unsigned int i = 1; i <= n; ++i)
{
total += compute(i);
}
// the sum of the first N integers should be n * (n+1) / 2
unsigned int correct = (n * (n+1)) / 2;
if ( total.get_value() == correct)
std::cout << "Total (" << total.get_value()
<< ") is correct" << std::endl;
else
std::cout << "Total (" << total.get_value()
<< ") is WRONG, should be "
<< correct << std::endl;
return 0;
}
The following changes in the serial code show how to use a reducer:
Include the appropriate reducer header file.
Declare the reduction variable as a reducer_kind<TYPE> rather than as a TYPE.
Introduce parallelism, in this case by changing the for loop to a cilk_for loop.
Retrieve the reducer's terminal value with the get_value() method after the cilk_for loop is complete.
NoteCopyright © 1996-2011, Intel Corporation. All rights reserved.