Implementing Guided Auto-parallelization Recommendations

The GAP Report in this example recommends using the -parallel option to enable parallelization. From the command line, execute make gap_par_report, or run the following:

ifort -c -parallel -guide scalar_dep.f90

The compiler emits the following:

GAP REPORT LOG OPENED ON Mon Aug  2 14:04:44 2010
scalar_dep.f90(44): remark #30523: (PAR) Loop at line 44 cannot be parallelized due
to conditional assignment(s) into the following variable(s): t. This loop will be
parallelized if the variable(s) become unconditionally initialized at the top of every
iteration. [VERIFY] Make sure that the value(s) of the variable(s) read in any iteration
of the loop must have been written earlier in the same iteration.
[ALTERNATIVE] Another way is to use "!dir$ parallel private(t)" to parallelize the loop.
[VERIFY] The same conditions described previously must hold.
scalar_dep.f90(44): remark #30525: (PAR) If the trip count of the loop at line 44 is
greater than 36, then use "!dir$ loop count min(36)" to parallelize this loop.
[VERIFY] Make sure that the loop has a minimum of 36 iterations.
Number of advice-messages emitted for this compilation session: 2.
END OF GAP REPORT LOG

In the GAP Report, remark #30523 indicates that the loop at line 44 cannot be parallelized because the variable t is assigned only conditionally. Remark #30525 indicates that the compiler can parallelize the loop if it is told, through the loop count directive, that the loop runs at least 36 iterations.

Apply the necessary changes after verifying that the GAP recommendations are appropriate and do not change the semantics of the program.
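
The [ALTERNATIVE] advice in remark #30523 and the advice in remark #30525 could be applied with directives instead of a source change. The following is only a sketch: the two directives are quoted from the report, but the subroutine name, argument list, and declarations are assumptions for illustration and are not taken from scalar_dep.f90. It is valid only if the [VERIFY] conditions hold, that is, t is written before it is read in every iteration and the loop runs at least 36 iterations.

```fortran
! Sketch only: scale_positive and its declarations are hypothetical.
subroutine scale_positive(a, n)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    integer :: i
    real :: t

    ! Directives quoted from the GAP report. The programmer asserts that
    ! t can be private to each iteration and that n is at least 36.
!dir$ parallel private(t)
!dir$ loop count min(36)
    do i = 1, n
        if (a(i) >= 0) then
            t = i
        end if
        ! t is read only when a(i) > 0, which implies the branch above ran,
        ! so every read of t follows a write in the same iteration.
        if (a(i) > 0) then
            a(i) = t * (1 / (a(i) * a(i)))
        end if
    end do
end subroutine scale_positive
```

The tutorial sample itself takes the other route described in remark #30523 and initializes t unconditionally.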

For this loop, conditional compilation selects a version in which t is assigned unconditionally at the top of every iteration, which enables the parallelization and vectorization that GAP recommends:

        do i = 1, n
!dir$ if defined(test_gap)
            t = i
!dir$ else
            if (a(i) >= 0) then
                t = i
            end if
!dir$ endif
            if (a(i) > 0) then
                a(i) = t * (1 / (a(i) * a(i)))
            end if
        end do
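
One way to gain confidence that the unconditional assignment does not change the program's semantics is to run both forms of the loop on the same data and compare the results. The program below is a minimal sketch of that check; the program name, array size, test data, and tolerance are assumptions chosen for illustration and are not part of the tutorial sample.

```fortran
! Sketch only: check_gap_change and its test data are hypothetical.
program check_gap_change
    implicit none
    integer, parameter :: n = 100
    real :: a_orig(n), a_new(n), t
    integer :: i

    ! Negative, zero, and positive values exercise every branch.
    do i = 1, n
        a_orig(i) = real(mod(i, 5) - 2)
    end do
    a_new = a_orig

    ! Original form: t is assigned conditionally.
    t = 0.0
    do i = 1, n
        if (a_orig(i) >= 0) then
            t = i
        end if
        if (a_orig(i) > 0) then
            a_orig(i) = t * (1 / (a_orig(i) * a_orig(i)))
        end if
    end do

    ! Modified form: t is assigned at the top of every iteration.
    do i = 1, n
        t = i
        if (a_new(i) > 0) then
            a_new(i) = t * (1 / (a_new(i) * a_new(i)))
        end if
    end do

    ! Compare with a small relative tolerance (assumed value).
    if (all(abs(a_orig - a_new) <= 1.0e-5 * abs(a_orig))) then
        print *, 'results match'
    else
        print *, 'results differ'
    end if
end program check_gap_change
```

A check like this covers only the inputs it is given; it supplements, but does not replace, reasoning about the [VERIFY] conditions in the report.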

To verify that the loop is parallelized and vectorized:

Add the compiler options -vec-report1 -par-report1.
Add the conditional definition test_gap to compile the appropriate code path.

From the command line, execute make w_changes, or run the following:

ifort -c -parallel -Dtest_gap -vec-report1 -par-report1 scalar_dep.f90

The compiler's -vec-report and -par-report options emit the following output, confirming that the loop at line 44 is vectorized and parallelized:

scalar_dep.f90(44) (col. 9): remark: LOOP WAS AUTO-PARALLELIZED.
scalar_dep.f90(44) (col. 9): remark: LOOP WAS VECTORIZED.
scalar_dep.f90(44) (col. 9): remark: LOOP WAS VECTORIZED.

For more information on using the -guide, -vec-report, and -par-report compiler options, see the Compiler Options section in the Compiler User Guide and Reference.

This completes the tutorial for Guided Auto-parallelization, where you have seen how the compiler can guide you to an optimized solution through auto-parallelization.