simd

The simd pragma enforces vectorization of loops.

Syntax

#pragma simd [clause[ [,] clause]...]

Arguments

clause

clause can be any of the following:

vectorlength(`n1`[, `n2`]...)	n is a vector length (VL). It must be an integer power of 2; the value must be 2, 4, 8, or 16. If you specify more than one n, the vectorizer chooses the VL from the values specified. Causes each iteration of the vector loop to execute the computation equivalent to n iterations of the scalar loop. Multiple `vectorlength` clauses are merged as a union.
vectorlengthfor(`data type`)	data type must be one of the built-in integer types (8, 16, 32, or 64-bit), pointer types (treated as pointer-sized integers), floating-point types (32 or 64-bit), or complex types (64 or 128-bit). Otherwise, behavior is undefined. Causes each iteration of the vector loop to execute the computation equivalent to n iterations of the scalar loop, where n is computed as `size_of_vector_register`/`sizeof(data_type)`. For example, `vectorlengthfor(float)` results in `n=4` for SSE2 to SSE4.2 targets (packed float operations available on 128-bit XMM registers) and `n=8` for AVX targets (packed float operations available on 256-bit YMM registers). `vectorlengthfor(int)` results in `n=4` for SSE2 to AVX targets. The `vectorlength()` and `vectorlengthfor()` clauses are mutually exclusive: neither may be used together with the other. Behavior for multiple `vectorlengthfor` clauses is undefined.
private(`var1`[, `var2`]...)	var is a scalar variable. Causes each variable to be private to each iteration of a loop. Unless the variable appears in a `firstprivate` clause, its initial value in any particular iteration is undefined. Unless the variable appears in a `lastprivate` clause, its value upon exit from the loop is undefined. Multiple `private` clauses are merged as a union. Note: Execution of a SIMD loop with `firstprivate`/`lastprivate` clauses may differ from serial execution of the same code even if the loop fails to vectorize.
firstprivate(`var1`[, `var2`]...)	Provides a superset of the functionality provided by the `private` clause. Variables that appear in a `firstprivate` list are subject to `private` clause semantics. In addition, the initial value of each variable is broadcast to all private instances for each iteration upon entering the SIMD loop.
lastprivate(`var1`[, `var2`]...)	Provides a superset of the functionality provided by the `private` clause. Variables that appear in a `lastprivate` list are subject to `private` clause semantics. In addition, when the SIMD loop is exited, each variable has the value that resulted from the sequentially last iteration of the SIMD loop (which may be undefined if the last iteration does not assign to the variable).
linear(`var1:step1` [`,var2:step2`]...)	var is a scalar variable. step is a positive integer compile-time constant expression. For each iteration of the scalar loop, var1 is incremented by step1, var2 is incremented by step2, and so on. Therefore, every iteration of the vector loop increments the variables by VL*step1, VL*step2, …, VL*stepN, respectively. If more than one step is specified for a var, a compile-time error occurs. Multiple `linear` clauses are merged as a union.
reduction(`oper:var1` [,`var2`]…)	oper is a reduction operator. var is a scalar variable. Applies the vector reduction indicated by oper to var1, var2, …, varN. The `simd` pragma may have multiple reduction clauses with the same or different operators. If more than one reduction operator is associated with a var, a compile-time error occurs.
[no]assert	Directs the compiler to raise an error (`assert`) or not (`noassert`) when vectorization fails. The default is `noassert`. If this clause is specified more than once, a compile-time error occurs.
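
As an illustration of how several clauses can combine on one loop, the following is a minimal sketch and not code taken from this reference. The function name pack_even_scaled and its parameters are hypothetical. The #ifdef __INTEL_COMPILER guard is an added assumption, so that a compiler without simd pragma support builds the loop as ordinary serial code.

```c
/* Hypothetical example: copies every second element of src into dst,
 * scaled by 2. src must hold at least 2*n - 1 elements. */
void pack_even_scaled(float *dst, const float *src, int n)
{
    float t;      /* scratch value, private to each iteration */
    int j = 0;    /* source index, advances by 2 per scalar iteration */
    int i;

    /* Guard is an assumption for portability, not part of the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd vectorlength(4) private(t) linear(j:2)
#endif
    for (i = 0; i < n; i++) {
        t = src[j] * 2.0f;   /* t is assigned before it is read */
        dst[i] = t;
        j += 2;              /* matches the step declared in linear(j:2) */
    }
}
```

Here t is written before it is read in every iteration, so its undefined initial value under private is never observed, and j changes by the same constant step that the linear clause declares.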

Description

The simd pragma is used to guide the compiler to vectorize more loops. Vectorization using the simd pragma complements (but does not replace) the fully automatic approach.

Without explicit vectorlength() or vectorlengthfor() clauses, the compiler chooses a vector length using its own cost model. Misclassifying variables as private, firstprivate, lastprivate, linear, or reduction, or failing to classify them at all, may lead to unintended consequences such as runtime failures and/or incorrect results.
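
A dot product is one common case where classification matters: the accumulator is updated in every iteration, so it needs to appear in a reduction clause. The sketch below is illustrative only. The function name dot is hypothetical, and the #ifdef __INTEL_COMPILER guard is an added assumption so that other compilers build the loop serially.

```c
/* Hypothetical example: dot product of two float arrays of length n. */
float dot(const float *x, const float *y, int n)
{
    float sum = 0.0f;   /* accumulator carried across iterations */
    int i;

    /* Guard is an assumption for portability, not part of the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd reduction(+:sum)
#endif
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

A vectorized floating-point reduction may add the terms in a different order than the serial loop, so the result can differ from the serial one in the last bits.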

A particular variable can appear in at most one private, linear, or reduction clause.

If the compiler is unable to vectorize a loop, it emits a warning (use the assert clause to make this an error).

If the vectorizer has to stop vectorizing a loop for some reason, the fast floating-point model is used for the SIMD loop.

Note that the simd pragma may not affect all auto-vectorizable loops. Some of these loops do not have a way to describe the SIMD vector semantics.

The following restrictions apply to the simd pragma:

The countable loop for the simd pragma has to conform to the for-loop style of an OpenMP worksharing loop construct. Additionally, the loop control variable must be a signed integer type.
The vector values must be signed 8-, 16-, 32-, or 64-bit integers, single or double-precision floating point numbers, or single or double-precision complex numbers.
A SIMD loop may contain another loop (for, while, do-while). A goto out of such an inner loop is not supported. Break and continue are supported. Note that inlining can create such an inner loop, which may not be obvious at the source level.
A SIMD loop performs memory references unconditionally. Therefore, all address computations must result in valid memory addresses, even though such locations may not be accessed if the loop is executed sequentially.
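
The last restriction can be sketched with a conditional gather. This is a hedged illustration, not code from this reference: the function names sanitize_indices and gather_if, their parameters, and the idea of a sanitizing pre-pass are all assumptions. The #ifdef __INTEL_COMPILER guard is likewise an added assumption so that other compilers build the loop serially.

```c
/* Hypothetical pre-pass: redirect unused indices to element 0 so that
 * every table[idx[i]] address is valid, whether or not it is used. */
void sanitize_indices(int *idx, const unsigned char *take, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        if (!take[i]) {
            idx[i] = 0;
        }
    }
}

/* Hypothetical example: dst[i] = table[idx[i]] where take[i] is nonzero.
 * table must have at least one element, and every idx[i] must be in range. */
void gather_if(float *dst, const float *table, const int *idx,
               const unsigned char *take, int n)
{
    int i;   /* signed integer loop control variable */

    /* Guard is an assumption for portability, not part of the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd
#endif
    for (i = 0; i < n; i++) {
        if (take[i]) {
            dst[i] = table[idx[i]];
        }
    }
}
```

Run serially, the loop never reads table[idx[i]] when take[i] is zero, so a stale index there is harmless. Under the restriction above, the vectorized loop may still compute and reference that address, which is why the sketch sanitizes the indices first.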

To disable the transformations that enable more vectorization, specify the options -no-vec -no-simd (Linux* and Mac OS* X) or /Qvec- /Qsimd- (Windows*).

User-mandated vectorization, also called SIMD vectorization, can either raise an error or not when a loop annotated with #pragma simd fails to vectorize. By default, #pragma simd behaves as noassert, and the compiler issues a warning if the loop fails to vectorize. To make the compiler raise an error instead, add the assert clause to the #pragma simd. If the compiler does not vectorize a loop annotated with #pragma simd, the loop keeps its serial semantics.
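
As a minimal sketch of the assert clause, assuming a loop whose vectorization is considered mandatory: the function name saxpy and its parameters are hypothetical, and the #ifdef __INTEL_COMPILER guard is an added assumption so that other compilers build the loop serially.

```c
/* Hypothetical example: y[i] = a * x[i] + y[i].
 * The caller must ensure that x and y do not overlap. */
void saxpy(float *y, const float *x, float a, int n)
{
    int i;

    /* With the assert clause, a failure to vectorize this loop is
     * reported as an error instead of a warning.
     * Guard is an assumption for portability, not part of the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd assert
#endif
    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```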

Example

Using #pragma simd.

In the example, the function add_floats() uses too many unknown pointers for the compiler's automatic runtime independence check optimization to kick in. The programmer can enforce vectorization of this loop by using #pragma simd and so avoid the overhead of the runtime check:

void add_floats(float *a, float *b, float *c, float *d, float *e, int n){
  int i;
#pragma simd
  for (i=0; i<n; i++){
    a[i] = a[i] + b[i] + c[i] + d[i] + e[i];
  }
}
