The GAP Report in this example recommends using the -parallel option to enable parallelization. From the command line, execute make gap_par_report, or run the following command:
ifort -c -parallel -guide scalar_dep.f90
The compiler emits the following:
GAP REPORT LOG OPENED ON Mon Aug 2 14:04:44 2010

scalar_dep.f90(44): remark #30523: (PAR) Loop at line 44 cannot be parallelized due to conditional assignment(s) into the following variable(s): t. This loop will be parallelized if the variable(s) become unconditionally initialized at the top of every iteration. [VERIFY] Make sure that the value(s) of the variable(s) read in any iteration of the loop must have been written earlier in the same iteration. [ALTERNATIVE] Another way is to use "!dir$ parallel private(t)" to parallelize the loop. [VERIFY] The same conditions described previously must hold.

scalar_dep.f90(44): remark #30525: (PAR) If the trip count of the loop at line 44 is greater than 36, then use "!dir$ loop count min(36)" to parallelize this loop. [VERIFY] Make sure that the loop has a minimum of 36 iterations.

Number of advice-messages emitted for this compilation session: 2.
END OF GAP REPORT LOG
In the GAP Report, remark #30523 indicates that the loop at line 44 cannot be parallelized because the variable t is conditionally assigned. Remark #30525 advises that, if the loop trip count is greater than 36, adding the directive !dir$ loop count min(36) allows the compiler to parallelize the loop.
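The [ALTERNATIVE] advice in the report can be sketched as follows. This is a minimal illustration, not the tutorial's scalar_dep.f90: the program name, the array size n, the declared types, and the fill values are assumptions made so the example is self-contained. Only the two directives are taken from the GAP report. Both [VERIFY] conditions still have to hold before the directives are safe to use.

```fortran
! Sketch only: program name, n, types, and fill values are illustrative
! assumptions. The two !dir$ lines are the directives quoted in the GAP report.
program gap_directives_sketch
  implicit none
  integer, parameter :: n = 1000   ! assumed size, well above the 36-iteration minimum
  real    :: a(n), t
  integer :: i

  ! Illustrative data containing negative, zero, and positive entries.
  do i = 1, n
    a(i) = real(mod(i, 7) - 3)
  end do

  ! Ask the compiler to treat t as private to each iteration and assert
  ! that the loop runs at least 36 times.
  !dir$ parallel private(t)
  !dir$ loop count min(36)
  do i = 1, n
    if (a(i) >= 0) then
      t = i
    end if
    if (a(i) > 0) then
      a(i) = t * (1 / (a(i) * a(i)))
    end if
  end do

  print *, a(1:8)
end program gap_directives_sketch
```

Because the directives begin with the comment character, a compiler that does not recognize them treats them as ordinary comments, so the sketch still builds serially.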
Apply the necessary changes after verifying that the GAP recommendations are appropriate and do not change the semantics of the program.
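In this loop, t is read only when a(i) > 0, a condition that implies a(i) >= 0, so every read of t follows an assignment in the same iteration and the [VERIFY] condition of remark #30523 holds. The following conditional compilation makes the assignment to t unconditional when test_gap is defined, which enables parallelization and vectorization of the loop as recommended by GAP: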
do i = 1, n
!dir$ if defined(test_gap)
    t = i
!dir$ else
    if (a(i) >= 0) then
        t = i
    end if
!dir$ endif
    if (a(i) > 0) then
        a(i) = t * (1 / (a(i) * a(i)))
    end if
end do
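One way to check that the change does not alter the results is to run both code paths on the same data and compare the arrays. The harness below is a hypothetical sketch: the program name, the array names a_orig and a_mod, the size n, the fill values, and the tolerance are all assumptions, and it checks only the contents of a, not any use of t after the loop.

```fortran
! Hypothetical harness: names, sizes, data, and tolerance are assumptions.
! It runs the original and the test_gap variants of the loop body on
! identical input and compares the resulting arrays.
program compare_variants_sketch
  implicit none
  integer, parameter :: n = 1000
  real,    parameter :: rel_tol = 1.0e-5   ! assumed relative tolerance
  real    :: a_orig(n), a_mod(n), t
  integer :: i

  ! Illustrative data containing negative, zero, and positive entries.
  do i = 1, n
    a_orig(i) = real(mod(i, 7) - 3)
  end do
  a_mod = a_orig

  ! Original path: t is assigned conditionally.
  t = 0.0
  do i = 1, n
    if (a_orig(i) >= 0) then
      t = i
    end if
    if (a_orig(i) > 0) then
      a_orig(i) = t * (1 / (a_orig(i) * a_orig(i)))
    end if
  end do

  ! test_gap path: t is assigned unconditionally at the top of the iteration.
  do i = 1, n
    t = i
    if (a_mod(i) > 0) then
      a_mod(i) = t * (1 / (a_mod(i) * a_mod(i)))
    end if
  end do

  ! Compare with a relative tolerance in case the two loops are optimized differently.
  if (all(abs(a_orig - a_mod) <= rel_tol * abs(a_orig))) then
    print *, 'variants agree'
  else
    print *, 'variants differ'
  end if
end program compare_variants_sketch
```

A comparison like this complements the reasoning about the [VERIFY] condition but does not replace it, since it covers only the input values that were tried.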
To verify that the loop is parallelized and vectorized:
Add the compiler options -vec-report1 -par-report1.
Add the conditional definition test_gap to compile the appropriate code path.
From the command line, execute make w_changes, or run the following command:
ifort -c -parallel -Dtest_gap -vec-report1 -par-report1 scalar_dep.f90
The compiler's -vec-report and -par-report options emit the following output, confirming that the program is vectorized and parallelized:
scalar_dep.f90(44) (col. 9): remark: LOOP WAS AUTO-PARALLELIZED.
scalar_dep.f90(44) (col. 9): remark: LOOP WAS VECTORIZED.
scalar_dep.f90(44) (col. 9): remark: LOOP WAS VECTORIZED.
For more information on using the -guide, -vec-report, and -par-report compiler options, see the Compiler Options section in the Compiler User Guide and Reference.
This completes the tutorial for Guided Auto-parallelization, where you have seen how the compiler can guide you to an optimized solution through auto-parallelization.
Copyright © 2011, Intel Corporation. All rights reserved.