The three major features of parallel programming supported by the Intel® compiler include:
- OpenMP*
- Auto-parallelization
- Auto-vectorization
Each of these features contributes to application performance depending on the number of processors, the target architecture (IA-32, Intel® 64, or IA-64 architecture), and the nature of the application. The features can also be combined to further improve application performance.
Parallelism defined with the OpenMP* API is based on thread-level and task-level parallelism. Parallelism defined with auto-parallelization techniques is based on thread-level parallelism (TLP). Parallelism defined with auto-vectorization techniques is based on instruction-level parallelism (ILP).
Parallel programming can be explicit, that is, defined by the programmer using the OpenMP* API and its associated options. It can also be implicit, that is, detected automatically by the compiler. Implicit parallelism is implemented as auto-parallelization of outermost loops, auto-vectorization of innermost loops, or both.
To improve auto-vectorization, users can also add vectorizer directives to their programs.
Software pipelining (SWP), a technique closely related to auto-vectorization, is available on systems based on IA-64 architecture.
The following table summarizes the different ways in which parallelism can be exploited with the Intel® Compiler.
Intel provides performance libraries that contain highly optimized, extensively threaded routines, including the Intel® Math Kernel Library (Intel® MKL).
In addition to these major features supported by the Intel compiler, certain operating systems support application program interface (API) function calls that provide explicit threading controls. For example, Windows* operating systems support API calls such as CreateThread, and multiple operating systems support POSIX* threading APIs.
| Parallelism Method | Supported On |
|---|---|
| **Implicit** (parallelism generated by the compiler and by user-supplied hints) | |
| Auto-parallelization | |
| Auto-vectorization | |
| **Explicit** (parallelism programmed by the user) | |
| OpenMP* (Thread-Level and Task-Level Parallelism) | |
For general information about threading an existing serial application or design considerations for creating new threaded applications, see Other Resources and the web site http://go-parallel.com.