Intel® Fortran Compiler 16.0 User and Reference Guide

ATTRIBUTES VECTOR

The ATTRIBUTES directive option VECTOR tells the compiler to vectorize the specified function or subroutine.

!DIR$ ATTRIBUTES [att,] VECTOR [:clause] [, att]... :: routine-name

!DIR$ ATTRIBUTES [att,] VECTOR :(clause [, clause]...) [, att] :: routine-name

att

Is an ATTRIBUTES directive option. For a list of possible directive options, see the description of argument att in ATTRIBUTES.

clause

Is one or more of the following optional clauses:

  • LINEAR (var1:step1 [, var2:step2]... )

    var

    Is a scalar variable that is a dummy argument in the specified routine.

    step

    Is a positive, compile-time integer constant expression.

    Tells the compiler that for each consecutive invocation of the routine in a serial execution, the value of var1 is incremented by step1, var2 is incremented by step2, and so on.

    If more than one step is specified for a particular var, a compile-time error occurs.

    Multiple LINEAR clauses are merged as a union.

  • [NO]MASK

    Determines whether the compiler generates a masked vector version of the routine.

  • PROCESSOR (cpuid)

  • UNIFORM (arg [, arg]...)

    arg

    Is a scalar variable that is a dummy argument in the specified routine.

    Tells the compiler that the values of the specified arguments can be broadcast to all iterations as a performance optimization.

    Multiple UNIFORM clauses are merged as a union.

  • VECTORLENGTH (n [, n]...)

    n

    Is a vector length (VL). It must be an integer, scalar constant expression that is a power of 2; the value must be 2, 4, 8, or 16. If you specify more than one n, the compiler will choose the VL from the values specified.

    Tells the compiler that each routine invocation at the call site should execute the computation equivalent to n times the scalar function execution.

    The VECTORLENGTH and VECTORLENGTHFOR clauses are mutually exclusive; you cannot use one together with the other.

    Multiple VECTORLENGTH clauses cause a syntax error.

  • VECTORLENGTHFOR (data-type)

    data-type

    Is one of the following intrinsic data types:

    Data Type      Fortran Intrinsic Type
    INTEGER        Default INTEGER
    INTEGER(1)     INTEGER (KIND=1)
    INTEGER(2)     INTEGER (KIND=2)
    INTEGER(4)     INTEGER (KIND=4)
    INTEGER(8)     INTEGER (KIND=8)
    REAL           Default REAL
    REAL(4)        REAL (KIND=4)
    REAL(8)        REAL (KIND=8)
    COMPLEX        Default COMPLEX
    COMPLEX(4)     COMPLEX (KIND=4)
    COMPLEX(8)     COMPLEX (KIND=8)

    Causes each iteration in the vector loop to execute the computation equivalent to n iterations of scalar loop execution where n is computed from size_of_vector_register/sizeof(data_type).

    For example, VECTORLENGTHFOR (REAL (KIND=4)) results in n=4 for SSE2 to SSE4.2 targets (packed float operations available on 128-bit XMM registers) and n=8 for AVX target (packed float operations available on 256-bit YMM registers). VECTORLENGTHFOR(INTEGER (KIND=4)) results in n=4 for SSE2 to AVX targets.

    The VECTORLENGTHFOR and VECTORLENGTH clauses are mutually exclusive; you cannot use one together with the other.

    Multiple VECTORLENGTHFOR clauses cause a syntax error.

    Without explicit VECTORLENGTH and VECTORLENGTHFOR clauses, the compiler will choose a VECTORLENGTH using its own cost model. Misclassification of variables into PRIVATE, FIRSTPRIVATE, LASTPRIVATE, LINEAR, and REDUCTION, or the lack of appropriate classification of variables, may lead to unintended consequences such as runtime failures and/or incorrect results.
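The VECTORLENGTHFOR arithmetic described above can be checked with a short program. This is only a sketch: the register widths are hard-coded assumptions (128 bits for SSE2 to SSE4.2, 256 bits for AVX), the program name is hypothetical, and the standard STORAGE_SIZE intrinsic stands in for sizeof(data_type). It applies only where packed operations for the type are available at the full register width; as noted above, INTEGER (KIND=4) stays at n=4 on AVX targets.

```fortran
program vl_for_type
  implicit none
  ! Assumed vector register widths, in bits
  integer, parameter :: xmm_bits = 128   ! SSE2 to SSE4.2 (XMM)
  integer, parameter :: ymm_bits = 256   ! AVX (YMM)
  real(kind=4) :: r4                     ! used only for type inquiry
  real(kind=8) :: r8                     ! used only for type inquiry

  ! n = size_of_vector_register / sizeof(data_type); STORAGE_SIZE returns bits
  print '(a,i0)', 'REAL(4), 128-bit register: n = ', xmm_bits / storage_size(r4)
  print '(a,i0)', 'REAL(4), 256-bit register: n = ', ymm_bits / storage_size(r4)
  print '(a,i0)', 'REAL(8), 128-bit register: n = ', xmm_bits / storage_size(r8)
  print '(a,i0)', 'REAL(8), 256-bit register: n = ', ymm_bits / storage_size(r8)
end program vl_for_type
```

For REAL (KIND=4) this reproduces the n=4 and n=8 values quoted above.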

routine-name

Is the name of a routine (a function or subroutine). It must be the enclosing routine or the routine immediately following the directive.

If you specify more than one clause, they must be separated by commas and enclosed in parentheses.
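As an illustration of combining several clauses, the following sketch declares a module function with UNIFORM, LINEAR, and VECTORLENGTH clauses in one parenthesized, comma-separated list. The module, function, and argument names (vec_kernels, scaled_add, x, i, a) are hypothetical and chosen only for this example; compilers that do not recognize the directive treat it as a comment, so the program still runs as ordinary scalar code.

```fortran
module vec_kernels
  implicit none
contains
  ! Hypothetical kernel: returns a*x + i for one element.
  function scaled_add(x, i, a) result(r)
  !DIR$ ATTRIBUTES VECTOR : (UNIFORM(a), LINEAR(i:1), VECTORLENGTH(4)) :: scaled_add
    real,    intent(in) :: x   ! differs for each invocation
    integer, intent(in) :: i   ! advances by 1 for each consecutive invocation
    real,    intent(in) :: a   ! same value for every invocation
    real :: r
    r = a * x + real(i)
  end function scaled_add
end module vec_kernels

program demo_clauses
  use vec_kernels               ! brings the VECTOR attribute into the caller
  implicit none
  integer, parameter :: n = 8
  real    :: x(n), z(n)
  integer :: i

  x = [(real(i), i = 1, n)]
  do i = 1, n
    ! i steps by 1 (LINEAR) and 2.0 is loop-invariant (UNIFORM)
    z(i) = scaled_add(x(i), i, 2.0)
  end do
  print *, z
end program demo_clauses
```

The call site matches the declared clauses: the loop index passed as i increases by exactly one per iteration, and the value passed as a does not change inside the loop.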

When you specify the ATTRIBUTES VECTOR directive, the compiler provides data-parallel semantics by combining the vector form of the routine with the vectorized operations or loops at the call site. When multiple instances of the vector declaration are invoked in a parallel context, the execution order among them is not sequenced. If you specify one or more clauses, they affect the data-parallel semantics provided by the compiler.

If you specify the ATTRIBUTES VECTOR directive with no VECTORLENGTH clause, the compiler chooses a default VECTORLENGTH using the efficiency heuristics of the vectorizer and its own cost model.

If you specify the ATTRIBUTES VECTOR directive with no clause, the compiler will generate vector code based on compiler efficiency heuristics and whatever processor compiler options are specified.

The VECTOR attribute implies the C attribute, so that when you specify the VECTOR attribute on a routine, the C attribute is automatically also set on the same routine. This changes how the routine name is decorated and how arguments are passed.

Note

You should ensure that any possible side effects for the specified routine-name are acceptable or expected, and the memory access interferences are properly synchronized.

The Fortran Standard keyword ELEMENTAL specifies that a procedure written with scalar arguments can be extended to conforming array arguments by processing the array elements one at a time in any order. The ATTRIBUTES VECTOR directive tells the optimizer to produce versions of the procedure routine-name that execute with contiguous slices of the array arguments as defined by the VECTORLENGTH clause in an "elemental" fashion. routine-name does not need to be defined as ELEMENTAL to be given the VECTOR attribute.

The VECTOR attribute causes the compiler to generate a short vector form of the procedure, which can perform the procedure's operation on multiple elements of its array arguments in a single invocation. The short vector version may be able to perform multiple operations as fast as the regular implementation performs a single operation by using the vector instruction set in the CPU.

In addition, when invoked from an OpenMP construct, the compiler may assign different copies of the elemental procedures to different threads, executing them concurrently. The end result is that your data parallel operation executes on the CPU using both the parallelism available in the multiple cores and the parallelism available in the vector instruction set. If the short vector procedure is called inside a parallel loop or an auto-parallelized loop that is vectorized, you can achieve both vector-level and thread-level parallelism.
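A minimal sketch of this combination follows, assuming a hypothetical module vec_square with a function square. The loop is threaded with a standard OpenMP PARALLEL DO directive; whether the loop body is also vectorized, and whether the short vector version of square is actually used, depends on the compiler and the options in effect. Without OpenMP enabled, the directive is treated as a comment and the loop runs serially.

```fortran
module vec_square
  implicit none
contains
  ! Hypothetical vector function: squares one element.
  function square(x) result(r)
  !DIR$ ATTRIBUTES VECTOR :: square
    real, intent(in) :: x
    real :: r
    r = x * x
  end function square
end module vec_square

program demo_omp
  use vec_square
  implicit none
  integer, parameter :: n = 1000
  real    :: x(n), z(n)
  integer :: i

  x = [(real(i), i = 1, n)]

  ! Iterations are divided among threads; each thread's chunk is a
  ! candidate for vectorization using the vector version of square.
  !$omp parallel do
  do i = 1, n
    z(i) = square(x(i))
  end do
  !$omp end parallel do

  print *, z(1), z(n)
end program demo_omp
```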

The INTENT(OUT) or INTENT(INOUT) attribute is not allowed for arguments of a procedure with the VECTOR attribute since the VECTOR attribute forces the procedure to receive its arguments by value.

The Intel C/C++ compiler built-in function __intel_simd_lane() may be helpful in removing certain performance penalties caused by non-unit stride vector access. Consider the following:

interface
! returns a number between 0 and vectorlength - 1 that reflects the current "lane id" within the SIMD vector
! __intel_simd_lane() will return zero if the loop is not vectorized
    function for_simd_lane () bind (C, name = "__intel_simd_lane")
        integer (kind=4) :: for_simd_lane
        !DEC$ attributes known_intrinsic, default :: for_simd_lane
    end function for_simd_lane
end interface

For more details, see the Intel C++ documentation.

Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804

Example

The ATTRIBUTES VECTOR directive must be accessible in the caller, either via an INTERFACE block or by USE association.

The following shows an example of an external function with an INTERFACE block:


!... function definition

function f(x)
!dir$ attributes vector :: f
  real :: f, x
...
!  attribute vector explicit in calling procedure using an INTERFACE

interface
  function f(x)
  !dir$ attributes vector :: f
    real :: f, x
  end
end interface
...
do i=1,n
  z(i) = f( x(i) )
end do
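For reference, a complete, compilable variant of the external-function case might look like the sketch below. The function body (2.0*x + 1.0), the program name, and the array size are assumptions added for illustration; only the placement of the directive in both the definition and the INTERFACE block follows the fragment above.

```fortran
! External function definition (could live in a separate source file)
function f(x)
  implicit none
!DIR$ ATTRIBUTES VECTOR :: f
  real :: f
  real, intent(in) :: x
  f = 2.0 * x + 1.0            ! hypothetical body
end function f

program call_external
  implicit none
  ! The INTERFACE block repeats the directive so that the caller
  ! sees the VECTOR attribute of the external function.
  interface
    function f(x)
    !DIR$ ATTRIBUTES VECTOR :: f
      real :: f
      real, intent(in) :: x
    end function f
  end interface
  integer, parameter :: n = 8
  real    :: x(n), z(n)
  integer :: i

  x = [(real(i), i = 1, n)]
  do i = 1, n
    z(i) = f(x(i))
  end do
  print *, z
end program call_external
```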

The ATTRIBUTES VECTOR directive can be brought into the caller by USE association if the vector function is a module procedure; for example:


!  attribute vector in definition of module procedure

module use_vect

contains
  function f(x)
  !dir$ attributes vector :: f
    real :: f, x
...
  end function
end module use_vect

!  USE and call of f(x) from another procedure with a module USE statement

use use_vect     ! brings in ATTRIBUTES VECTOR for f(x)
...

! now simply call f(x)

do i=1,n
  z(i) = f( x(i) )
end do

You can specify more than one UNIFORM or LINEAR clause in an ATTRIBUTES VECTOR directive. For example, all of the following are valid:


!DIR$ ATTRIBUTES VECTOR:PROCESSOR(atom) :: f
!DIR$ ATTRIBUTES VECTOR:(UNIFORM(a), UNIFORM(b)) :: f
!DIR$ ATTRIBUTES VECTOR:(LINEAR(x:1), LINEAR(y:1)) :: f

The three directives above are equivalent to specifying a single continued directive in fixed-form source, as follows:


!DIR$ ATTRIBUTES VECTOR:( PROCESSOR(atom),
!DIR$& UNIFORM(a, b),
!DIR$& LINEAR(x:1, y:1) ) :: f

The following example for Intel® 64 architecture targeting the Intel® Xeon Phi™ coprocessor (code name Knights Landing) tells the compiler to generate vector functions for that coprocessor:

!DIR$ ATTRIBUTES VECTOR : PROCESSOR(mic_avx512) :: f
integer function f(a, b)
integer :: a, b
f = a + b
end function f
