Intel® C++ Compiler 16.0 User and Reference Guide

simd

Enforces vectorization of loops.

Syntax

#pragma simd [clause[ [,] clause]...]

Arguments

clause

Can be any of the following:

vectorlength(n1[, n2]...)

Where n is a vector length (VL). It must be an integer power of 2; the value must be 2, 4, 8, or 16. If you specify more than one n, the vectorizer chooses the VL from the values specified.

Causes each iteration in the vector loop to execute the computation equivalent to n iterations of scalar loop execution. Multiple vectorlength clauses are merged as a union.
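A minimal sketch of the vectorlength clause follows. The function name scale_floats and the __INTEL_COMPILER guard are assumptions made for illustration; the guard only lets compilers that do not recognize the pragma build the loop as plain scalar code.

```c
/* Hypothetical example: scales n floats in place. */
void scale_floats(float *a, float s, int n)
{
    int i;
    /* Request that each vector iteration cover 4 scalar iterations.
       The guard is an assumption so other compilers skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd vectorlength(4)
#endif
    for (i = 0; i < n; i++) {
        a[i] = a[i] * s;
    }
}
```

Listing several values, for example vectorlength(4, 8), would leave the final choice among them to the vectorizer.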

vectorlengthfor(data type)

Where data type must be one of the built-in integer types (8, 16, 32, or 64bit), pointer types (treated as pointer-sized integers), floating point types (32 or 64bit), or complex types (64bit or 128bit). Otherwise, behavior is undefined.

Causes each iteration in the vector loop to execute the computation equivalent to n iterations of scalar loop execution where n is computed from size_of_vector_register/sizeof(data_type).

For example, vectorlengthfor(float) results in n=4 for SSE2 to SSE4.2 targets (packed float operations available on 128bit XMM registers) and n=8 for AVX target (packed float operations available on 256bit YMM registers). vectorlengthfor(int) results in n=4 for SSE2 to AVX targets.

vectorlength() and vectorlengthfor() clauses are mutually exclusive. In other words, the vectorlengthfor() clause may not be used with the vectorlength() clause, and vice versa.

Behavior for multiple vectorlengthfor clauses is undefined.
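As an illustration, the sketch below uses vectorlengthfor(float) on a loop that mixes float and double data, so the vector length is derived from the 32bit float type rather than from the wider double. The function name widen_samples and the __INTEL_COMPILER guard are assumptions, not part of the pragma definition.

```c
/* Hypothetical example: widens float samples to double. */
void widen_samples(const float *in, double *out, int n)
{
    int i;
    /* Derive the vector length from sizeof(float), for example
       n=4 on a 128bit XMM target. The guard is an assumption so
       that other compilers skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd vectorlengthfor(float)
#endif
    for (i = 0; i < n; i++) {
        out[i] = (double)in[i];
    }
}
```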

private(var1[, var2]...)

Where var is a scalar variable.

Causes each variable to be private to each iteration of a loop. Unless the variable appears in firstprivate clause, the initial value of the variable for the particular iteration is undefined. Unless the variable appears in lastprivate clause, the value of the variable upon exit of the loop is undefined. Multiple private clauses are merged as a union.

Note

Execution of the SIMD loop with firstprivate/lastprivate clauses may be different from serial execution of the same code even if the loop fails to vectorize.

A variable in a private clause cannot appear in a linear, reduction, firstprivate, or lastprivate clause.
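The following sketch shows a typical use of the private clause: a scratch scalar that is written before it is read in every iteration and is not used after the loop. The names squared_diff and t, and the __INTEL_COMPILER guard, are assumptions for illustration.

```c
/* Hypothetical example: out[i] = (a[i] - b[i])^2. */
void squared_diff(const float *a, const float *b, float *out, int n)
{
    int i;
    float t; /* scratch value, assigned before use in every iteration */
    /* t is private to each iteration; its value is not read after
       the loop. The guard is an assumption so that other compilers
       skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd private(t)
#endif
    for (i = 0; i < n; i++) {
        t = a[i] - b[i];
        out[i] = t * t;
    }
}
```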

firstprivate(var1[, var2]...)

Provides a superset of the functionality provided by the private clause. Variables that appear in a firstprivate list are subject to private clause semantics. In addition, the initial value of each variable is broadcast to all private instances for each iteration upon entering the SIMD loop.

A variable in a firstprivate clause can appear in a lastprivate clause.

A variable in a firstprivate clause cannot appear in a linear, reduction, or private clause.

lastprivate(var1[, var2]...)

Provides a superset of the functionality provided by the private clause. Variables that appear in a lastprivate list are subject to private clause semantics. In addition, when the SIMD loop is exited, each variable has the value that resulted from the sequentially last iteration of the SIMD loop (which may be undefined if the last iteration does not assign to the variable).

A variable in a lastprivate clause can appear in a firstprivate clause.

A variable in a lastprivate clause cannot appear in a linear, reduction, or private clause.
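A minimal sketch combining firstprivate and lastprivate follows. Here offset is only read inside the loop, so every private instance sees the value it had on loop entry, and last is assigned in every iteration, so its value after the loop comes from the sequentially last iteration. The function name, variable names, and the __INTEL_COMPILER guard are assumptions for illustration.

```c
/* Hypothetical example: adds offset to each element and returns
   the value computed by the final iteration. Expects n > 0. */
float shift_and_track(const float *in, float *out, float offset, int n)
{
    int i;
    float last = 0.0f;
    /* offset: initial value broadcast to every private instance.
       last: value of the sequentially last iteration survives.
       The guard is an assumption so other compilers skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd firstprivate(offset) lastprivate(last)
#endif
    for (i = 0; i < n; i++) {
        last = in[i] + offset;
        out[i] = last;
    }
    return last;
}
```

Because last is assigned unconditionally in every iteration, this loop gives the same result whether or not it is vectorized.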

linear(var1:step1 [,var2:step2]...)

Where var is a scalar variable and step is a compile-time positive, integer constant expression.

For each iteration of a scalar loop, var1 is incremented by step1, var2 is incremented by step2, and so on. Therefore, every iteration of the vector loop increments the variables by VL*step1, VL*step2, …, to VL*stepN, respectively. If more than one step is specified for a var, a compile-time error occurs. Multiple linear clauses are merged as a union.

A variable in a linear clause cannot appear in a reduction, private, firstprivate, or lastprivate clause.
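The sketch below illustrates the linear clause with a secondary index j that advances by 2 in every scalar iteration, so one vector iteration advances it by VL*2. The function name gather_even and the __INTEL_COMPILER guard are assumptions; src is assumed to hold at least 2*n - 1 elements.

```c
/* Hypothetical example: copies every second element of src to dst. */
void gather_even(const float *src, float *dst, int n)
{
    int i;
    int j = 0; /* advances by 2 per scalar iteration */
    /* Declare j as linear with step 2. The guard is an assumption
       so that other compilers skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd linear(j:2)
#endif
    for (i = 0; i < n; i++) {
        dst[i] = src[j];
        j += 2;
    }
}
```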

reduction(oper:var1 [,var2]...)

Where oper is a reduction operator and var is a scalar variable.

Applies the vector reduction indicated by oper to var1, var2, …, varN. The simd pragma may have multiple reduction clauses with the same or different operators. If more than one reduction operator is associated with a var, a compile-time error occurs.

A variable in a reduction clause cannot appear in a linear, private, firstprivate, or lastprivate clause.
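As an illustration of the reduction clause, the sketch below accumulates a dot product into sum using the + operator. The function name dot_product and the __INTEL_COMPILER guard are assumptions for illustration.

```c
/* Hypothetical example: returns the dot product of a and b. */
float dot_product(const float *a, const float *b, int n)
{
    int i;
    float sum = 0.0f;
    /* sum is combined across vector lanes with the + operator.
       The guard is an assumption so other compilers skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd reduction(+:sum)
#endif
    for (i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

A vector reduction may add the terms in a different order than the scalar loop, so floating-point results can differ slightly for inputs that are not exactly representable.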

[no]assert

Directs the compiler to assert (issue an error) or not to assert when vectorization fails. The default is noassert. If this clause is specified more than once, a compile-time error occurs.
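A minimal sketch of the assert clause follows. With this clause, a failure to vectorize the loop is reported as an error instead of a warning. The function name saxpy and the __INTEL_COMPILER guard are assumptions for illustration.

```c
/* Hypothetical example: y[i] = alpha * x[i] + y[i]. */
void saxpy(float alpha, const float *x, float *y, int n)
{
    int i;
    /* Fail the build if this loop cannot be vectorized.
       The guard is an assumption so other compilers skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd assert
#endif
    for (i = 0; i < n; i++) {
        y[i] = alpha * x[i] + y[i];
    }
}
```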

[no]vecremainder

Instructs the compiler to vectorize or not to vectorize the remainder loop when the original loop is vectorized. See the description of the vector pragma for more information.
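The sketch below requests vectorization of the remainder loop with the vecremainder clause. The remainder loop handles the iterations left over when the trip count is not a multiple of the vector length. The function name add_ints and the __INTEL_COMPILER guard are assumptions for illustration.

```c
/* Hypothetical example: out[i] = a[i] + b[i]. */
void add_ints(const int *a, const int *b, int *out, int n)
{
    int i;
    /* Ask for the remainder loop to be vectorized as well.
       The guard is an assumption so other compilers skip the pragma. */
#ifdef __INTEL_COMPILER
#pragma simd vecremainder
#endif
    for (i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Replacing vecremainder with novecremainder would request a scalar remainder loop instead.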

Description

The simd pragma is used to guide the compiler to vectorize more loops. Vectorization using the simd pragma complements (but does not replace) the fully automatic approach.

The simd pragma can be used on a cilk_for loop. See the documentation on cilk_for for a discussion of how they are best used together.

Without explicit vectorlength() and vectorlengthfor() clauses, the compiler chooses a vector length using its own cost model. Misclassifying variables as private, firstprivate, lastprivate, linear, or reduction, or failing to classify them appropriately, may cause unintended consequences such as runtime failures and/or incorrect results.

You can specify a particular variable in at most one instance of a private, linear, or reduction clause.

If the compiler is unable to vectorize a loop, a warning will be emitted (use the assert clause to make it an error).

If the vectorizer has to stop vectorizing a loop for some reason, the fast floating-point model is used for the SIMD loop.

The vectorization performed on this loop by the simd pragma overrides any setting you may specify for options -fp-model (Linux* and OS X*) and /fp (Windows*) for this loop.

Note that the simd pragma may not affect all auto-vectorizable loops. Some of these loops do not have a way to describe the SIMD vector semantics.

The following restrictions apply to the simd pragma:

To disable transformations that enable more vectorization, specify the -vec -no-simd (Linux* and OS X*) or /Qvec /Qno-simd (Windows*) options.

User-mandated vectorization, also called SIMD vectorization, can assert or not assert an error if a #pragma simd annotated loop fails to vectorize. By default, the simd pragma is set to noassert, and the compiler issues a warning if the loop fails to vectorize. To direct the compiler to assert an error when the #pragma simd annotated loop fails to vectorize, add the assert clause to the simd pragma. If a simd pragma annotated loop is not vectorized by the compiler, the loop retains its serial semantics.

Example: Using the simd pragma

void add_floats(float *a, float *b, float *c, float *d, float *e, int n){
  int i;
  /* Enforce vectorization; no runtime independence check is needed. */
#pragma simd
  for (i=0; i<n; i++){
    a[i] = a[i] + b[i] + c[i] + d[i] + e[i];
  }
}

In the example above, the function add_floats() uses too many unknown pointers for the compiler's automatic runtime independence check optimization to kick in. The programmer can enforce vectorization of this loop with the simd pragma, which also avoids the overhead of the runtime check.

See Also