Intel Extension Routines to OpenMP*

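The Intel® Compiler implements several groups of routines as extensions to the OpenMP* run-time library: execution environment routines, stack size routines, memory allocation routines, and thread sleep time routines.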

The Intel extension routines described in this section can be used for low-level tuning to verify that the library code and application are functioning as intended. These routines are generally not recognized by other OpenMP-compliant compilers, so the link stage may fail when the code is built with another compiler. These OpenMP routines require the -openmp-stubs (Linux* and Mac OS* X) or /Qopenmp-stubs (Windows*) command-line option to execute.

See OpenMP* Run-time Library Routines for details about including support for these declarations in your source, and see OpenMP* Support Libraries for detailed information about the execution environment (mode).

In most cases, environment variables can be used in place of the extension library routines. For example, the stack size of the parallel threads may be set using the OMP_STACKSIZE environment variable rather than the kmp_set_stacksize_s() library routine.

Note

A run-time call to an Intel extension routine takes precedence over the corresponding environment variable setting.

Execution Environment Routines

Function

Description

void kmp_set_defaults(char const *)

Sets the OpenMP environment variables given in the argument as a list of variables separated by "|".

void kmp_set_library_throughput()

Sets execution mode to throughput, which is the default. Allows the application to determine the runtime environment. Use in multi-user environments.

void kmp_set_library_turnaround()

Sets execution mode to turnaround. Use in dedicated parallel (single user) environments.

void kmp_set_library_serial()

Sets execution mode to serial.

void kmp_set_library(int)

Sets execution mode indicated by the value passed to the function. Valid values are:

  • 1 - serial mode

  • 2 - turnaround mode

  • 3 - throughput mode

Call this routine before the first parallel region is executed.

int kmp_get_library()

Returns a value corresponding to the current execution mode: 1 (serial), 2 (turnaround), or 3 (throughput).
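The mode values listed above can be exercised with a small helper. The following is a minimal sketch, not a definitive implementation: the macro HAVE_KMP_EXTENSIONS and the helpers select_mode() and mode_name() are hypothetical names introduced for illustration. When the macro is not defined, local stand-ins replace the extension routines so that the sketch still compiles with other compilers; the stand-ins only record the value and do not change any run-time behavior.

```c
#include <assert.h>

#ifdef HAVE_KMP_EXTENSIONS   /* hypothetical macro: define it when the kmp_* routines are available */
#include <omp.h>             /* declares the kmp_* extension routines */
#else
/* Stand-ins (assumption): they only remember the mode and are not the
   Intel implementation. Throughput (3) is the documented default. */
static int mode_standin = 3;
static void kmp_set_library(int mode) { mode_standin = mode; }
static int  kmp_get_library(void)     { return mode_standin; }
#endif

/* Mode values as listed for kmp_set_library() and kmp_get_library(). */
enum { MODE_SERIAL = 1, MODE_TURNAROUND = 2, MODE_THROUGHPUT = 3 };

/* Select an execution mode and return the mode reported afterwards.
   Intended to be called before the first parallel region is executed. */
static int select_mode(int mode)
{
    assert(mode >= MODE_SERIAL && mode <= MODE_THROUGHPUT);
    kmp_set_library(mode);
    return kmp_get_library();
}

/* Map a mode value to a readable name for logging. */
static const char *mode_name(int mode)
{
    switch (mode) {
    case MODE_SERIAL:     return "serial";
    case MODE_TURNAROUND: return "turnaround";
    case MODE_THROUGHPUT: return "throughput";
    default:              return "unknown";
    }
}
```

A program on a dedicated machine might call select_mode(MODE_TURNAROUND) at the start of main() and log mode_name() of the returned value to confirm the setting.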

Stack Size

For IA-64 architecture, it is recommended to always use kmp_set_stacksize_s() and kmp_get_stacksize_s(). The _s() variants must be used if you need to set a stack size ≥ 2**31 bytes (2 gigabytes), because that value does not fit in the int argument of the older routines.

Function

Description

size_t kmp_get_stacksize_s()

Returns the number of bytes that will be allocated for each parallel thread to use as its private stack. This value can be changed with the kmp_set_stacksize_s() routine prior to the first parallel region, or via the KMP_STACKSIZE environment variable.

int kmp_get_stacksize()

Provided for backward compatibility only. Use the kmp_get_stacksize_s() routine for compatibility across different families of Intel processors.

void kmp_set_stacksize_s(size_t size)

Sets to size the number of bytes that will be allocated for each parallel thread to use as its private stack. This value can also be set via the KMP_STACKSIZE environment variable. For kmp_set_stacksize_s() to have an effect, it must be called before the beginning of the first (dynamically executed) parallel region in the program.

void kmp_set_stacksize(int size)

Provided for backward compatibility only. Use kmp_set_stacksize_s() for compatibility across different families of Intel processors.
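The difference between the int-based routines and the _s() variants can be sketched as follows. This is an illustration under stated assumptions: HAVE_KMP_EXTENSIONS is a hypothetical macro, fits_legacy_int() and request_stack_bytes() are hypothetical helpers, and the stand-ins used when the macro is undefined only store the requested value. A real run-time library may adjust the requested size, so the returned value should not be assumed to equal the request.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#ifdef HAVE_KMP_EXTENSIONS   /* hypothetical macro: define it when the kmp_* routines are available */
#include <omp.h>
#else
/* Stand-ins (assumption): they only store the requested size. */
static size_t stack_standin = 0;
static void   kmp_set_stacksize_s(size_t size) { stack_standin = size; }
static size_t kmp_get_stacksize_s(void)        { return stack_standin; }
#endif

/* Nonzero when the size could still be passed to the int-based
   kmp_set_stacksize(); sizes above INT_MAX need the _s() variant. */
static int fits_legacy_int(size_t bytes)
{
    return bytes <= (size_t)INT_MAX;
}

/* Request a per-thread stack size and return the size reported afterwards.
   Intended to be called before the first parallel region. */
static size_t request_stack_bytes(size_t bytes)
{
    assert(bytes > 0);
    kmp_set_stacksize_s(bytes);
    return kmp_get_stacksize_s();
}
```

For example, request_stack_bytes((size_t)8 << 20) asks for 8 MB per thread, while fits_legacy_int((size_t)1 << 31) is false on platforms with a 32-bit int, which is why 2-gigabyte stacks need the _s() routines.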

Memory Allocation

The Intel® compiler implements a group of memory allocation routines as an extension to the OpenMP* run-time library to enable threads to allocate memory from a heap local to each thread. These routines are: kmp_malloc(), kmp_calloc(), and kmp_realloc().

Memory allocated by these routines must be freed with the kmp_free() routine. Memory may legally be allocated by one thread and freed by a different thread, but this mode of operation incurs a slight performance penalty.

Function

Description

void* kmp_malloc(size_t size)

Allocate memory block of size bytes from thread-local heap.

void* kmp_calloc(size_t nelem, size_t elsize)

Allocate array of nelem elements of size elsize from thread-local heap.

void* kmp_realloc(void* ptr, size_t size)

Reallocate memory block at address ptr to size bytes from thread-local heap.

void kmp_free(void* ptr)

Free memory block at address ptr from thread-local heap.

Memory must have been previously allocated with kmp_malloc(), kmp_calloc(), or kmp_realloc().
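The pairing rule above (allocate with a kmp_ routine, release with kmp_free()) might look like this in practice. The sketch assumes a hypothetical macro HAVE_KMP_EXTENSIONS and a hypothetical helper sum_with_scratch(); when the macro is undefined, the stand-ins simply forward to the standard C heap and therefore provide no thread-local heap.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_KMP_EXTENSIONS   /* hypothetical macro: define it when the kmp_* routines are available */
#include <omp.h>
#else
#include <stdlib.h>
/* Stand-ins (assumption): forward to the C heap, not a thread-local heap. */
static void *kmp_calloc(size_t nelem, size_t elsize) { return calloc(nelem, elsize); }
static void *kmp_realloc(void *ptr, size_t size)     { return realloc(ptr, size); }
static void  kmp_free(void *ptr)                     { free(ptr); }
#endif

/* Allocate a scratch buffer for n/2 values, grow it to n values, fill it
   with 1..n, and return the sum. Returns -1 if an allocation fails. */
static long sum_with_scratch(size_t n)
{
    assert(n >= 2);

    long *buf = kmp_calloc(n / 2, sizeof *buf);       /* zero-initialized block */
    if (buf == NULL)
        return -1;

    long *grown = kmp_realloc(buf, n * sizeof *buf);  /* resize to n elements */
    if (grown == NULL) {
        kmp_free(buf);                                /* original block is still valid */
        return -1;
    }

    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        grown[i] = (long)i + 1;
        sum += grown[i];
    }

    kmp_free(grown);   /* memory from kmp_* routines is released with kmp_free() */
    return sum;
}
```

Calling sum_with_scratch() from inside a parallel region keeps each allocation and its matching kmp_free() on the same thread, which avoids the cross-thread penalty mentioned above.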

Thread Sleep Time

In the throughput execution mode, threads wait for new parallel work at the end of a parallel region for a specified period of time and then sleep. This time interval can be set by the KMP_BLOCKTIME environment variable or by the kmp_set_blocktime() function.

Function

Description

int kmp_get_blocktime(void)

Returns the number of milliseconds that a thread should wait, after completing the execution of a parallel region, before sleeping, as set either by the KMP_BLOCKTIME environment variable or by kmp_set_blocktime().

void kmp_set_blocktime(int msec)

Sets the number of milliseconds that a thread should wait, after completing the execution of a parallel region, before sleeping. This routine affects the block time setting for the calling thread and any OpenMP team threads formed by the calling thread. The routine does not affect the block time for any other threads.
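Because kmp_set_blocktime() affects only the calling thread and the teams it forms, a common pattern is to change the value temporarily and restore it later. The following is a minimal sketch of that pattern, assuming a hypothetical macro HAVE_KMP_EXTENSIONS and a hypothetical helper swap_blocktime(); the stand-in initial value of 200 is an arbitrary placeholder chosen for the sketch.

```c
#include <assert.h>

#ifdef HAVE_KMP_EXTENSIONS   /* hypothetical macro: define it when the kmp_* routines are available */
#include <omp.h>
#else
/* Stand-ins (assumption): they only store the value. The initial 200 is
   an arbitrary placeholder for this sketch. */
static int blocktime_standin = 200;
static void kmp_set_blocktime(int msec) { blocktime_standin = msec; }
static int  kmp_get_blocktime(void)     { return blocktime_standin; }
#endif

/* Set the block time for the calling thread and return the previous
   value so that the caller can restore it later. */
static int swap_blocktime(int msec)
{
    assert(msec >= 0);
    int previous = kmp_get_blocktime();
    kmp_set_blocktime(msec);
    return previous;
}
```

A caller could write int saved = swap_blocktime(0); before a phase in which idle threads should sleep immediately, and swap_blocktime(saved); once that phase is over.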