High-Level Optimization (HLO) exploits the properties of source code constructs (for example, loops and arrays) in applications developed in high-level programming languages. The default optimization level, -O2 (Linux* OS and Mac OS* X) or /O2 (Windows* OS), performs some high-level optimizations, but specifying -O3 (Linux and Mac OS X) or /O3 (Windows) provides the best chance for performing loop transformations to optimize memory accesses.
Loop optimizations may result in calls to library routines that can provide a greater performance gain on Intel® microprocessors than on non-Intel microprocessors. The optimizations performed can also be affected by certain options, such as /arch or /Qx (Windows) or -m or -x (Linux and Mac OS X). Under the /Qx (Windows) or -x (Linux and Mac OS X) option, more HLO transformations may be performed for Intel® microprocessors than for non-Intel microprocessors.
Within HLO, loop transformation techniques include:
Loop Permutation or Interchange
Loop Distribution
Loop Fusion
Loop Unrolling
Data Prefetching
Scalar Replacement
Unroll and Jam
Loop Blocking or Tiling
Partial-Sum Optimization
Predicate Optimization
Loop Reversal
Profile-Guided Loop Unrolling
Loop Peeling
Data Transformation: Malloc Combining and Memset Combining
Loop Rerolling
Memset and Memcpy Recognition
Statement Sinking for Creating Perfect Loop Nests
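To make two of the listed techniques concrete, the following C sketch writes out by hand what loop interchange and loop blocking (tiling) do to a loop nest. It is an illustration only and does not show the code the compiler actually generates. The function names, the row-major flat array layout, and the BLOCK tile size are assumptions chosen for the example. Each pair of functions computes identical results; only the order of memory accesses differs.

```c
#include <stddef.h>

/* Hypothetical tile size for the blocking example; a real choice
   depends on the cache sizes of the target processor. */
#define BLOCK 16

/* Loop interchange, before: the inner loop walks down a column of a
   row-major n-by-n array, so consecutive accesses are n elements apart. */
void scale_colwise(size_t n, const double *src, double *dst, double k)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            dst[i * n + j] = k * src[i * n + j];
}

/* Loop interchange, after: the i and j loops are swapped, so the inner
   loop touches consecutive memory locations (unit stride). */
void scale_rowwise(size_t n, const double *src, double *dst, double k)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[i * n + j] = k * src[i * n + j];
}

/* Loop blocking, before: a plain transpose. Either src or dst is
   always accessed with stride n, whichever loop order is used. */
void transpose_naive(size_t n, const double *src, double *dst)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Loop blocking, after: the iteration space is split into
   BLOCK-by-BLOCK tiles so that the data touched by one tile can stay
   in cache while it is reused. */
void transpose_blocked(size_t n, const double *src, double *dst)
{
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        for (size_t jj = 0; jj < n; jj += BLOCK) {
            /* Clamp the tile bounds so n need not be a multiple of BLOCK. */
            size_t i_end = (ii + BLOCK < n) ? ii + BLOCK : n;
            size_t j_end = (jj + BLOCK < n) ? jj + BLOCK : n;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Writing transformations like these by hand is normally unnecessary when the compiler applies them itself. The sketch is meant only to show which access patterns the transformations target.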
Copyright © 1996-2011, Intel Corporation. All rights reserved.