Interprocedural Optimization (IPO) Overview

Interprocedural Optimization (IPO) is an automatic, multi-step process that allows the compiler to analyze your code and determine where you can benefit from specific optimizations. When you use IPO options together with the -x or -ax (Linux* OS) options, or the /Qx or /Qax (Windows* OS) options, you may see more optimizations applied for Intel microprocessors than for non-Intel microprocessors.

The compiler may apply the following optimizations for the listed architectures:

All supported Intel® architectures:

inlining
constant propagation
mod/ref analysis
alias analysis
forward substitution
routine key-attribute propagation
address-taken analysis
partial dead call elimination
symbol table data promotion
common block variable coalescing
dead function elimination
unreferenced variable removal
whole program analysis
array dimension padding
common block splitting
stack frame alignment
structure splitting and field reordering
formal parameter alignment analysis
C++ class hierarchy analysis
indirect call conversion
specialization

IA-32 and Intel® 64 architectures:

passing arguments in registers to optimize calls and register usage

IPO Compilation Models

IPO supports two compilation models: single-file compilation and multi-file compilation.

Single-file compilation uses the -ip (Linux* OS and Mac OS* X) or /Qip (Windows* OS) option and produces one real object file for each source file compiled. During single-file compilation, the compiler performs inline function expansion for calls to procedures defined within the current source file.

The compiler performs some single-file interprocedural optimization at the default optimization level: -O2 (Linux* and Mac OS* X) or /O2 (Windows*). The compiler may also perform some inlining at the -O1 (Linux* and Mac OS* X) or /O1 (Windows*) optimization level, such as inlining functions marked with inlining pragmas or attributes (GNU C and C++) and C++ class member functions whose bodies are included in the class declaration.
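As a minimal sketch of the kind of code single-file inlining targets, the block below keeps a helper function and a class with in-class member bodies in the same source file as their caller. The names square, Accumulator, and sum_of_squares are hypothetical, and whether any particular call is actually expanded inline depends on the compiler's own heuristics and the options in effect.

```cpp
// Hypothetical helper defined in the same source file as its caller,
// so its body is visible when the caller is compiled.
static int square(int x) {
    return x * x;
}

class Accumulator {
public:
    // Member functions with bodies inside the class declaration are
    // implicitly inline in C++.
    void add(int v) { total_ += v; }
    int total() const { return total_; }

private:
    int total_ = 0;
};

// Every call in this function targets code defined in this file,
// which makes each call a candidate for inline expansion.
int sum_of_squares(int n) {
    Accumulator acc;
    for (int i = 1; i <= n; ++i) {
        acc.add(square(i));
    }
    return acc.total();
}
```

Calling sum_of_squares(3) adds 1, 4, and 9. The result is the same whether or not the calls are inlined; only the generated code differs.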

Multi-file compilation uses the -ipo (Linux* and Mac OS* X) or /Qipo (Windows*) option and produces one or more mock object files rather than normal object files. (See Compiling with IPO below for information about mock object files.) Additionally, the compiler collects information from the individual source files that make up the program. Using this information, the compiler performs optimizations across functions and procedures in different source files. See Inline Function Expansion.
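The following sketch illustrates why information from other source files matters. To stay self-contained it shows two hypothetical files, util.cpp and main.cpp, in a single block; the file names and the functions scale and doubled_total are assumptions made for illustration. When main.cpp is compiled on its own, only the declaration of scale is visible, so the call cannot be inlined and the constant argument cannot be propagated into the callee. With multi-file IPO the compiler can see both bodies and may apply those optimizations.

```cpp
// --- would live in util.cpp (hypothetical file name) ---
int scale(int value, int factor) {
    return value * factor;
}

// --- would live in main.cpp, which sees only this declaration ---
int scale(int value, int factor);

int doubled_total(const int* data, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        // factor is the constant 2 at this call site; this is only
        // exploitable when the body of scale() is also visible.
        total += scale(data[i], 2);
    }
    return total;
}
```

For an array holding 1, 2, and 3, doubled_total returns 12 regardless of which optimizations are applied.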

Note

Inlining and other optimizations are improved by profile information. For a description of how to use IPO with profile information for further optimization, see Profile an Application.

Mac OS* X: Intel®-based systems running Mac OS X do not support a multiple object compilation model.

Compiling with IPO

As each source file is compiled with IPO, the compiler stores an intermediate representation (IR) of the source code in a mock object file that includes summary information used for optimization. The mock object files contain the IR instead of the normal object code. Mock object files can be ten or more times larger than normal object files.

During the IPO compilation phase only the mock object files are visible. The Intel compiler does not expose the real object files during IPO unless you also specify the -ipo-c (Linux and Mac OS X) or /Qipo-c (Windows) option.

Linking with IPO

When you link with the -ipo (Linux* and Mac OS* X) or /Qipo (Windows*) option, the compiler is invoked a final time. The compiler performs IPO across all object files that have an IR equivalent. The mock objects must be linked with the Intel compiler or with the Intel linking tools. The compiler calls the linkers indirectly by using aliases (or wrappers) for the native linkers, so you must modify makefiles to account for the different linking tool names. For information on using the linking tools, see Using IPO; see the Linking Tools and Options topic for detailed information.

Caution

Linking the mock object files with ld (Linux and Mac OS X) or link.exe (Windows) will cause linkage errors. You must use the Intel linking tools to link mock object files.

During the compilation process, the compiler first analyzes the summary information and then produces mock object files for source files of the application.

Whole Program Analysis

The compiler supports a large number of IPO optimizations that can be applied, or whose effectiveness is greatly increased, when the whole program condition is satisfied.

Whole program analysis, when it can be done, enables many interprocedural optimizations. During the analysis process, the compiler reads all Intermediate Representation (IR) in the mock object files, object files, and library files to determine whether all references are resolved and whether a given symbol is defined in a mock object file. Symbols, for both data and functions, that are included in the IR in a mock object file are candidates for manipulation based on the results of whole program analysis.

There are two types of whole program analysis: the object reader method and the table method. Most optimizations can be applied if either type of whole program analysis determines that the whole program condition exists; however, some optimizations require the results of the object reader method, and some require the results of the table method.

Object reader method

In the object reader method, the object reader emulates the behavior of the native linker and attempts to resolve the symbols in the application. If all symbols are resolved correctly, the whole program condition is satisfied. This type of whole program analysis is more likely to detect the whole program condition.

Often the object files and libraries accessed by the compiler do not represent the whole program; there are many dependencies on well-known libraries. During IPO linking, whole program analysis determines whether or not the whole program can be detected using the available compiler resources.
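The idea behind the object reader method can be sketched as a simple symbol resolution check. This is an illustrative model only, not the compiler's implementation: the ObjectSummary type and both function names are hypothetical, and a real linker also handles details such as weak symbols and archive search order that are ignored here.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one input (an object file or library member).
struct ObjectSummary {
    std::set<std::string> defined;     // symbols this input defines
    std::set<std::string> referenced;  // symbols this input uses
};

// Collects every referenced symbol that no input defines.
std::set<std::string> unresolved_symbols(const std::vector<ObjectSummary>& inputs) {
    std::set<std::string> defined;
    for (const auto& input : inputs) {
        defined.insert(input.defined.begin(), input.defined.end());
    }
    std::set<std::string> missing;
    for (const auto& input : inputs) {
        for (const auto& symbol : input.referenced) {
            if (defined.count(symbol) == 0) {
                missing.insert(symbol);
            }
        }
    }
    return missing;
}

// In this model, the whole program condition holds when nothing is unresolved.
bool whole_program_by_object_reader(const std::vector<ObjectSummary>& inputs) {
    return unresolved_symbols(inputs).empty();
}
```

In this model, an application that references a library function fails the check until an input defining that function is added to the set being analyzed.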

Table method

In the table method, the compiler analyzes the mock object files and generates a call-graph for the application.

The compiler contains detailed tables describing the functions in all important language-specific libraries, such as libc. The compiler compares these function tables with the application call-graph and, for each unresolved function in the call-graph, attempts to resolve the call. If the compiler can resolve every such function call, the whole program condition exists.
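The comparison between the call-graph and the library function tables can be modeled roughly as follows. This is a sketch under stated assumptions rather than the compiler's actual algorithm: the CallGraph alias, the whole_program_by_table function, and the representation of the table as a set of names are all hypothetical.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call-graph: each function defined in the application
// maps to the list of functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// A callee is resolved if the application defines it or if it appears
// in the table of known library functions. In this model, the whole
// program condition exists only when every callee is resolved.
bool whole_program_by_table(const CallGraph& graph,
                            const std::set<std::string>& known_library_functions) {
    for (const auto& entry : graph) {
        for (const auto& callee : entry.second) {
            const bool defined_in_app = graph.count(callee) > 0;
            const bool in_table = known_library_functions.count(callee) > 0;
            if (!defined_in_app && !in_table) {
                return false;  // unresolved call: condition not met
            }
        }
    }
    return true;
}
```

In this model, a call to a function that is neither defined in the application nor listed in the table, such as a hook supplied by an unknown library, is enough to make the check fail.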