The three major features of parallel programming supported by the Intel® compiler include:
- OpenMP*
- Auto-parallelization
- Auto-vectorization
Each of these features contributes to application performance depending on the number of processors, the target architecture (IA-32, Intel® 64, or IA-64 architecture), and the nature of the application. The features can also be combined to further improve application performance.
Parallelism defined with the OpenMP* API is based on thread-level and task-level parallelism. Parallelism defined with auto-parallelization techniques is based on thread-level parallelism (TLP). Parallelism defined with auto-vectorization techniques is based on instruction-level parallelism (ILP).
Parallel programming can be explicit, that is, defined by the programmer using the OpenMP* API and its associated options. It can also be implicit, that is, detected automatically by the compiler. Implicit parallelism is implemented as auto-parallelization of outermost loops, auto-vectorization of innermost loops, or both.
To improve auto-vectorization, users can also add vectorizer directives to their programs.
Software pipelining (SWP), a technique closely related to auto-vectorization, is available on systems based on IA-64 architecture.
The following table summarizes the different ways in which parallelism can be exploited with the Intel® Compiler.
Intel provides performance libraries that contain highly optimized, extensively threaded routines, including the Intel® Math Kernel Library (Intel® MKL).
In addition to these major features supported by the Intel compiler, certain operating systems support application program interface (API) function calls that provide explicit threading controls. For example, Windows* operating systems support API calls such as CreateThread, and multiple operating systems support POSIX* threading APIs.
| Parallelism Method | Supported On |
|---|---|
| **Implicit** (parallelism generated by the compiler and by user-supplied hints) | |
| Auto-parallelization | |
| Auto-vectorization | |
| **Explicit** (parallelism programmed by the user) | |
| OpenMP* (Thread-Level and Task-Level Parallelism) | |
For general information about threading an existing serial application or design considerations for creating new threaded applications, see Other Resources and the web site http://go-parallel.com.