simd

Enforces vectorization of loops.

Syntax

#pragma simd [clause[ [,] clause]...]

Arguments

clause

Can be any of the following:

vectorlength(`n1`[, `n2`]...)	Where n is a vector length (VL). It must be an integer that is a power of 2; the value must be 2, 4, 8, or 16. If you specify more than one n, the vectorizer chooses the VL from the values specified. Causes each iteration in the vector loop to execute the computation equivalent to n iterations of scalar loop execution. Multiple `vectorlength` clauses are merged as a union.
vectorlengthfor(`data type`)	Where data type must be one of the built-in integer types (8, 16, 32, or 64-bit), pointer types (treated as pointer-sized integers), floating-point types (32 or 64-bit), or complex types (64 or 128-bit). Otherwise, behavior is undefined. Causes each iteration in the vector loop to execute the computation equivalent to n iterations of scalar loop execution, where n is computed from `size_of_vector_register`/`sizeof(data_type)`. For example, `vectorlengthfor(float)` results in `n=4` for SSE2 to SSE4.2 targets (packed float operations available on 128-bit XMM registers) and `n=8` for an AVX target (packed float operations available on 256-bit YMM registers). `vectorlengthfor(int)` results in `n=4` for SSE2 to AVX targets. The `vectorlength()` and `vectorlengthfor()` clauses are mutually exclusive: neither may be used together with the other. Behavior for multiple `vectorlengthfor` clauses is undefined.
private(`var1`[, `var2`]...)	Where var is a scalar variable. Causes each variable to be private to each iteration of a loop. Unless the variable appears in a `firstprivate` clause, the initial value of the variable for a particular iteration is undefined. Unless the variable appears in a `lastprivate` clause, the value of the variable upon exit of the loop is undefined. Multiple `private` clauses are merged as a union. Note: Execution of the SIMD loop with `firstprivate`/`lastprivate` clauses may be different from serial execution of the same code even if the loop fails to vectorize. A variable in a `private` clause cannot appear in a `linear`, `reduction`, `firstprivate`, or `lastprivate` clause.
firstprivate(`var1`[, `var2`]...)	Provides a superset of the functionality provided by the `private` clause. Variables that appear in a `firstprivate` list are subject to `private` clause semantics. In addition, the initial value of each variable is broadcast to all private instances for each iteration upon entering the SIMD loop. A variable in a `firstprivate` clause can appear in a `lastprivate` clause. A variable in a `firstprivate` clause cannot appear in a `linear`, `reduction`, or `private` clause.
lastprivate(`var1`[, `var2`]...)	Provides a superset of the functionality provided by the `private` clause. Variables that appear in a `lastprivate` list are subject to `private` clause semantics. In addition, when the SIMD loop is exited, each variable has the value that resulted from the sequentially last iteration of the SIMD loop (which may be undefined if the last iteration does not assign to the variable). A variable in a `lastprivate` clause can appear in a `firstprivate` clause. A variable in a `lastprivate` clause cannot appear in a `linear`, `reduction`, or `private` clause.
linear(`var1:step1` [`,var2:step2`]...)	Where var is a scalar variable and step is a positive, compile-time integer constant expression. For each iteration of a scalar loop, var1 is incremented by step1, var2 is incremented by step2, and so on. Therefore, every iteration of the vector loop increments the variables by `VL*step1`, `VL*step2`, …, `VL*stepN`, respectively. If more than one step is specified for a var, a compile-time error occurs. Multiple `linear` clauses are merged as a union. A variable in a `linear` clause cannot appear in a `reduction`, `private`, `firstprivate`, or `lastprivate` clause.
reduction(`oper:var1` [,`var2`]…)	Where oper is a reduction operator and var is a scalar variable. Applies the vector reduction indicated by oper to var1, var2, …, varN. The `simd` pragma may have multiple reduction clauses with the same or different operators. If more than one reduction operator is associated with a var, a compile-time error occurs. A variable in a `reduction` clause cannot appear in a `linear`, `private`, `firstprivate`, or `lastprivate` clause.
[no]assert	Directs the compiler to assert or not to assert when the vectorization fails. The default is `noassert`. If this clause is specified more than once, a compile-time error occurs.
[no]vecremainder	Instructs the compiler to vectorize or not to vectorize the remainder loop when the original loop is vectorized. See the description of the vector pragma for more information.
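
As an illustration of the `reduction`, `private`, and `linear` clauses described above, here is a minimal sketch. The function names (`dot_product`, `scale_shift`, `copy_even`) are hypothetical and chosen only for this example, and the code assumes the caller passes arrays that do not overlap. A compiler that does not recognize `#pragma simd` is expected to ignore it and run the loops serially.

```c
/* Hypothetical helpers illustrating three simd clauses.
   Assumption: the arrays passed in do not overlap. */

/* sum is accumulated across iterations, so it is named in a reduction clause. */
float dot_product(const float *x, const float *y, int n)
{
    float sum = 0.0f;
    int i;
#pragma simd reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* t is written before it is read in every iteration, so each
   iteration can work on its own private copy. */
void scale_shift(float *out, const float *in, int n)
{
    float t;
    int i;
#pragma simd private(t)
    for (i = 0; i < n; i++) {
        t = in[i] * 2.0f;
        out[i] = t + 1.0f;
    }
}

/* j advances by 2 per scalar iteration, declared here as linear(j:2).
   src must hold at least 2*n - 1 elements. */
void copy_even(float *dst, const float *src, int n)
{
    int i;
    int j = 0;
#pragma simd linear(j:2)
    for (i = 0; i < n; i++) {
        dst[i] = src[j];
        j += 2;
    }
}
```

Leaving `sum` out of the `reduction` clause, or `j` out of the `linear` clause, would be an instance of the variable misclassification that this page warns about.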

Description

The simd pragma is used to guide the compiler to vectorize more loops. Vectorization using the simd pragma complements (but does not replace) the fully automatic approach.

The simd pragma can be used on a cilk_for loop. See the documentation on cilk_for for a discussion of how they are best used together.

Without explicit vectorlength() and vectorlengthfor() clauses, the compiler will choose a vector length using its own cost model. Misclassification of variables into private, firstprivate, lastprivate, linear, and reduction, or lack of appropriate classification of variables, may cause unintended consequences such as runtime failures and/or incorrect results.
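
One situation where an explicit vector length can matter is a loop whose iterations depend on results from a fixed number of iterations earlier. The sketch below uses a hypothetical function, `add_stride4`, in which each element depends on the element four positions before it. Under that assumption, four consecutive iterations are independent of each other, so a vector length of 4 keeps the serial result, while a larger one may not.

```c
/* Hypothetical example: a[i] depends on a[i - 4], so at most four
   consecutive iterations are independent of each other. */
void add_stride4(int *a, int n)
{
    int i;
#pragma simd vectorlength(4)
    for (i = 4; i < n; i++) {
        a[i] = a[i] + a[i - 4];
    }
}
```

If the compiler were left to pick a vector length of 8 for this loop, lanes 4 to 7 would read values that lanes 0 to 3 have not yet written.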

A particular variable can appear in at most one instance of a private, linear, or reduction clause.

If the compiler is unable to vectorize a loop, a warning will be emitted (use the assert clause to make it an error).

If the vectorizer has to stop vectorizing a loop for some reason, the fast floating-point model is used for the SIMD loop.

The vectorization performed on this loop by the simd pragma overrides any setting you may specify for options -fp-model (Linux* and OS X*) and /fp (Windows*) for this loop.

Note that the simd pragma may not affect all auto-vectorizable loops. Some of these loops do not have a way to describe the SIMD vector semantics.

The following restrictions apply to the simd pragma:

The countable loop for the simd pragma has to conform to the for-loop style of an OpenMP worksharing loop construct. Additionally, the loop control variable must be a signed integer type.
The vector values must be signed 8-, 16-, 32-, or 64-bit integers, single or double-precision floating point numbers, or single or double-precision complex numbers.
A SIMD loop may contain another loop (for, while, do-while) in it. A goto out of such an inner loop is not supported. Break and continue are supported. Note that inlining can create such an inner loop, which may not be obvious at the source level.
A SIMD loop performs memory references unconditionally. Therefore, all address computations must result in valid memory addresses, even though such locations may not be accessed if the loop is executed sequentially.
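
The restriction on unconditional memory references can be handled by making every address valid before it is used. The sketch below is one way to do that, using a hypothetical function `sum_selected` that assumes `table_len` is at least 1. Out-of-range indices are redirected to element 0 so that the load is always legal, and the condition only decides whether the loaded value is added. The loop also uses a signed `int` control variable in the plain `for (i = 0; i < n; i++)` form.

```c
/* Hypothetical example: sums table[idx[i]] for the indices that are in range.
   Assumption: table_len >= 1, so table[0] is always a valid address. */
int sum_selected(const int *table, int table_len, const int *idx, int n)
{
    int sum = 0;
    int i;
#pragma simd reduction(+:sum)
    for (i = 0; i < n; i++) {
        int k = idx[i];
        int valid = (k >= 0 && k < table_len);
        /* Redirect invalid indices to 0 so the address is valid
           even when the value ends up unused. */
        int safe = valid ? k : 0;
        sum += valid ? table[safe] : 0;
    }
    return sum;
}
```

Writing the body as `if (valid) sum += table[k];` would be correct when run sequentially, but a vectorized version may compute and load `table[k]` for every lane.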

To disable transformations that enable more vectorization, specify the -vec -no-simd (Linux* and OS X*) or /Qvec /Qno-simd (Windows*) options.

User-mandated vectorization, also called SIMD vectorization, can either raise an error or not when a #pragma simd annotated loop fails to vectorize. By default, the simd pragma is set to noassert, and the compiler issues a warning if the loop fails to vectorize. To direct the compiler to raise an error when the #pragma simd annotated loop fails to vectorize, add the assert clause to the simd pragma. If a simd pragma annotated loop is not vectorized by the compiler, the loop retains its serial semantics.
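
A minimal sketch of the assert clause follows. The function name `saxpy` is hypothetical, and the code assumes that `x` and `y` do not overlap. With a compiler that implements the simd pragma, a failure to vectorize this loop would be reported as an error rather than a warning.

```c
/* Hypothetical example: y[i] = a * x[i] + y[i].
   Assumption: x and y do not overlap. */
void saxpy(float *y, const float *x, float a, int n)
{
    int i;
#pragma simd assert
    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Removing `assert` (or writing `noassert`) returns to the default behavior, in which a loop that cannot be vectorized produces a warning and runs serially.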

Example: Using the simd pragma
    void add_floats(float *a, float *b, float *c, float *d, float *e, int n){
      int i;
    #pragma simd
      for (i=0; i<n; i++){
        a[i] = a[i] + b[i] + c[i] + d[i] + e[i];
      }
    }

In the example above, the function add_floats() uses too many unknown pointers for the compiler's automatic runtime independence check optimization to kick in. The programmer can enforce the vectorization of this loop by using the simd pragma, which avoids the overhead of the runtime check.
