Intel® Fortran Compiler 16.0 User and Reference Guide

User-mandated or SIMD Vectorization

User-mandated or SIMD vectorization supplements automatic vectorization just as OpenMP* parallelization supplements automatic parallelization. The following figure illustrates this relationship. User-mandated vectorization is implemented as a single-instruction-multiple-data (SIMD) feature and is referred to as SIMD vectorization.

Note

The SIMD vectorization feature is available for both Intel® microprocessors and non-Intel microprocessors. Vectorization may call library routines that can result in a greater performance gain on Intel® microprocessors than on non-Intel microprocessors. Vectorization can also be affected by certain options, such as /arch (Windows*), -m (Linux* and OS X*), or [Q]x.

The following figure illustrates how SIMD vectorization is positioned among the various approaches that you can take to generate vector code that exploits vector hardware capabilities. Programs written with SIMD vectorization are very similar to those written using auto-vectorization hints. You can use SIMD vectorization to minimize the number of code changes needed to obtain vectorized code.

SIMD vectorization uses the !DIR$ SIMD directive to effect loop vectorization. You must add this directive to a loop and recompile for the loop to be vectorized (the option [Q]simd is enabled by default).

Consider a Fortran example in which the compiler does not automatically vectorize the loop because the data dependence distance X is unknown. You can either assert the absence of a data dependence with the auto-vectorization hint !DIR$ IVDEP, which leaves the decision of whether to vectorize the loop to the compiler, or enforce vectorization of the loop with !DIR$ SIMD.

Example: without !DIR$ SIMD

[D:/simd] cat example1.f 
      subroutine add(A, N, X)
      integer N, X
      real    A(N)
      DO I=X+1, N
        A(I) = A(I) + A(I-X)
      ENDDO
      end
[D:/simd] ifort example1.f -nologo -Qvec-report2 
  D:\simd\example1.f(6): (col. 9) remark: loop was not vectorized: existence of vector dependence.

Example: with !DIR$ SIMD

[D:/simd] cat example1.f 
      subroutine add(A, N, X)
      integer N, X
      real    A(N)
!DIR$ SIMD
      DO I=X+1, N
        A(I) = A(I) + A(I-X)
      ENDDO
      end
[D:/simd] ifort example1.f -nologo -Qvec-report2 -Qsimd 
  D:\simd\example1.f(7): (col. 9) remark: LOOP WAS VECTORIZED.
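
Because !DIR$ SIMD enforces vectorization, the result matches the serial loop only when the dependence distance X is at least as large as the number of iterations processed together. The following standard Fortran program is a minimal sketch of that effect and does not use any compiler directive. It models one vector operation with an array-section assignment, which evaluates its whole right-hand side before storing. The program name, the function same_result, and the chunk size vl are assumptions made for illustration only.

```fortran
! Illustrative sketch in standard Fortran; no compiler directives involved.
! vl is an assumed vector length chosen for this demonstration only.
program dep_distance_demo
  implicit none
  integer, parameter :: n = 16, vl = 4
  integer :: x

  do x = 1, 8
    print '(a,i2,a,l2)', 'X =', x, '  chunked result equals serial result:', &
          same_result(x)
  end do

contains

  logical function same_result(x)
    integer, intent(in) :: x
    real    :: s(n), v(n)
    integer :: i, lo, hi

    s = [(real(i), i = 1, n)]
    v = s

    ! Serial order: each iteration sees the stores of all earlier iterations.
    do i = x + 1, n
      s(i) = s(i) + s(i - x)
    end do

    ! Chunks of vl iterations: an array-section assignment evaluates its
    ! whole right-hand side before storing, like one vector operation.
    do lo = x + 1, n, vl
      hi = min(lo + vl - 1, n)
      v(lo:hi) = v(lo:hi) + v(lo - x:hi - x)
    end do

    same_result = all(s == v)
  end function same_result

end program dep_distance_demo
```

With these values, the chunked result should differ from the serial result for X smaller than vl and agree with it from X = vl upward.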

The main difference between the SIMD directive and auto-vectorization hints is that with the SIMD directive, the compiler generates a warning when it is unable to vectorize the loop. With auto-vectorization hints, actual vectorization remains at the discretion of the compiler, even when you use the !DIR$ VECTOR ALWAYS hint.
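
For comparison, the same loop written with the auto-vectorization hint !DIR$ IVDEP might look like the following sketch. The subroutine name add_ivdep and the free-form layout are assumptions for illustration. The hint asks the compiler to ignore assumed dependences, but the compiler still decides whether to vectorize, and the programmer remains responsible for the assertion being true.

```fortran
! Sketch only: add_ivdep is a hypothetical variant of the add subroutine.
! IVDEP is a hint; the compiler may still choose not to vectorize.
subroutine add_ivdep(A, N, X)
  implicit none
  integer, intent(in)    :: N, X
  real,    intent(inout) :: A(N)
  integer :: I
!DIR$ IVDEP
  do I = X + 1, N
    A(I) = A(I) + A(I - X)
  end do
end subroutine add_ivdep
```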

The SIMD directive has optional clauses to guide the compiler on how vectorization must proceed. Use these clauses appropriately so that the compiler obtains enough information to generate correct vector code. For more information on the clauses, see the !DIR$ SIMD description.
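
As one illustration of such a clause, the sketch below applies VECTORLENGTH to the loop from the earlier example. It assumes that every caller guarantees X >= 4, so a group of four consecutive iterations never reads an element written within the same group. The subroutine name add_vl4 is hypothetical, and the !DIR$ SIMD description remains the authoritative reference for the clause syntax and its permitted values.

```fortran
! Sketch only: add_vl4 is a hypothetical variant of the add subroutine.
! Assumption: every caller guarantees X >= 4, so four consecutive
! iterations never read an element written within the same group of four.
subroutine add_vl4(A, N, X)
  implicit none
  integer, intent(in)    :: N, X
  real,    intent(inout) :: A(N)
  integer :: I
!DIR$ SIMD VECTORLENGTH(4)
  do I = X + 1, N
    A(I) = A(I) + A(I - X)
  end do
end subroutine add_vl4
```

If a caller cannot guarantee the assumed minimum distance, the enforced vectorization may produce results that differ from the serial loop.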

Additional Semantics

Note the following points when using the !DIR$ SIMD directive.

Using vector Declaration

Consider the following Intel® Visual Fortran example code for a program to compare serial and vector computations using a user-defined function, foo().

Note

All code examples in this section are applicable for Fortran on Windows* only.

Example: Where user-defined function is not vectorized

!! file simdmain.f90 
program simdtest 
! Test vector function in external file.
 implicit none
 interface
   integer function foo(a, b)
   integer a, b
   end function foo
 end interface
 
 integer, parameter :: M = 48, N = 64
 
  integer  i, j
  integer, dimension(M,N) :: a1  
  integer, dimension(M,N) :: a2
  integer, dimension(M,N) :: s_a3
  integer, dimension(M,N) :: v_a3 
logical :: err_flag = .false.
 
! compute random numbers for arrays
 do j = 1, N
  do i = 1, M
   a1(i,j) = rand() * M
   a2(i,j) = rand() * M
  end do
 end do
 
 ! compute serial results
 do j = 1, N 
!dir$ novector 
  do i = 1, M
   s_a3(i,j) = foo(a1(i,j), a2(i,j))
  end do
 end do
 
 ! compute vector results
  do j = 1, N 
   do i = 1, M
    v_a3(i,j) = foo(a1(i,j), a2(i,j))
   end do
  end do
 
 ! compare serial and vector results
 do j = 1, N 
  do i = 1, M
   if (s_a3(i,j) .ne. v_a3(i,j)) then
    err_flag = .true. 
    print *, s_a3(i, j), v_a3(i,j)
   end if
  end do 
 end do
 if (err_flag) then
  write(*,*) "FAILED"
 else
  write(*,*) "PASSED"
 end if
end program
 
!! file: vecfoo.f90 
integer function foo(a, b)
 implicit none
 integer, intent(in) :: a, b
  foo = a - b 
end function 
[49 C:/temp] ifort -Qvec-report simdmain.f90 vecfoo.f90 simdmain.f90 vecfoo.f90 
  C:\temp\simdmain.f90(3): (col. 3) remark: loop was not vectorized: existence of vector dependence. 
  C:\temp\vecfoo.f90(3): (col. 3) remark: function was not vectorized.

When you compile the above code, the loop that calls foo() is not auto-vectorized because the auto-vectorizer does not know what foo() does unless the function is inlined at this call site.

In such cases, where the function call is not inlined, you can use the !DIR$ attributes vector::function-name-list declaration to vectorize both the loop and the function foo(). Add the vector declaration to the function declaration (in the following example, in both the interface block and the function definition) and recompile the code. The loop and the function are then vectorized.

Example: Where a loop calling a user-defined function with the vector declaration is vectorized

!! file simdmain.f90 
program simdtest 
! Test vector function in external file.
 implicit none
 interface
   integer function foo(a, b) 
!dir$ attributes vector :: foo
   integer a, b
   end function foo
 end interface
 
 integer, parameter :: M = 48, N = 64
 
  integer  i, j
  integer, dimension(M,N) :: a1  
  integer, dimension(M,N) :: a2
  integer, dimension(M,N) :: s_a3
  integer, dimension(M,N) :: v_a3 
logical :: err_flag = .false.
 
! compute random numbers for arrays
 do j = 1, N
  do i = 1, M
   a1(i,j) = rand() * M
   a2(i,j) = rand() * M
  end do
 end do
 
 ! compute serial results
 do j = 1, N 
!dir$ novector 
  do i = 1, M
   s_a3(i,j) = foo(a1(i,j), a2(i,j))
  end do
 end do
 
 ! compute vector results
  do j = 1, N 
   do i = 1, M
    v_a3(i,j) = foo(a1(i,j), a2(i,j))
   end do
  end do
 
 ! compare serial and vector results
 do j = 1, N 
  do i = 1, M
   if (s_a3(i,j) .ne. v_a3(i,j)) then
    err_flag = .true. 
    print *, s_a3(i, j), v_a3(i,j)
   end if
  end do 
 end do
 if (err_flag) then
  write(*,*) "FAILED"
 else
  write(*,*) "PASSED"
 end if
end program
 
!! file: vecfoo.f90 
integer function foo(a, b) 
!dir$ attributes vector :: foo
 implicit none
 integer, intent(in) :: a, b
  foo = a - b 
end function 
[49 C:/temp] ifort -Qvec-report simdmain.f90 vecfoo.f90 simdmain.f90 vecfoo.f90 
  C:\temp\simdmain.f90(3): (col. 3) remark: LOOP WAS VECTORIZED. 
  C:\temp\vecfoo.f90(3): (col. 3) remark: FUNCTION WAS VECTORIZED.
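
One possible variation, which is not part of the original example, is to place foo() in a module so that the caller obtains the function's interface through a USE statement instead of a hand-written interface block. The sketch below assumes that the vector declaration inside the module procedure is visible to the caller through the module. The names vecfoo_mod and simdtest_mod are hypothetical, and the arrays are reduced to one dimension and filled with fixed values to keep the sketch short.

```fortran
! Sketch only: vecfoo_mod and simdtest_mod are hypothetical names.
module vecfoo_mod
  implicit none
contains
  integer function foo(a, b)
!dir$ attributes vector :: foo
    integer, intent(in) :: a, b
    foo = a - b
  end function foo
end module vecfoo_mod

program simdtest_mod
  use vecfoo_mod, only: foo
  implicit none
  integer, parameter :: M = 48
  integer :: i
  integer :: a1(M), a2(M), s_a3(M), v_a3(M)

  ! Fixed input values instead of random numbers, for repeatability.
  a1 = [(3 * i, i = 1, M)]
  a2 = [(i, i = 1, M)]

  ! Compute serial results.
!dir$ novector
  do i = 1, M
    s_a3(i) = foo(a1(i), a2(i))
  end do

  ! Compute results in a loop that the compiler is free to vectorize.
  do i = 1, M
    v_a3(i) = foo(a1(i), a2(i))
  end do

  ! Compare serial and vector results.
  if (all(s_a3 == v_a3)) then
    write(*,*) "PASSED"
  else
    write(*,*) "FAILED"
  end if
end program simdtest_mod
```

Compilers that do not recognize the !dir$ lines treat them as comments, so the program remains standard Fortran. Whether the loop and the function are actually vectorized can be checked with the vectorization report, as in the examples above.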

Restrictions on Using vector Declaration

Vectorization depends on two major factors: hardware and the style of source code. When using the vector declaration, the following features are not allowed:

Formal parameters must be of the following data types:
