Issue: Scalar math function call(s) present

Math functions in the loop body are preventing the compiler from effectively vectorizing the loop. Improve performance by enabling vectorized math call(s).

Recommendation: Enable inline expansion

Inlining is disabled by a compiler option. To fix: When using the Ob or inline-level compiler option to control inline expansion, replace the 0 argument with 1 to enable inlining only where an inline keyword or attribute is specified, or with 2 to enable inlining of any function at the compiler's discretion.

Windows* OS	Linux* OS
/Ob1 or /Ob2	-inline-level=1 or -inline-level=2
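
As a minimal sketch of the kind of loop this setting affects (the helper scale_and_offset and the function transform are hypothetical names, not taken from Intel documentation): with inlining disabled, the call in the loop body typically stays an opaque scalar call; with level 1 or 2 the compiler is allowed to expand it in place, which usually leaves plain arithmetic that the vectorizer can handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, small enough to be a natural inlining candidate.
   The inline keyword matters at level 1 (/Ob1, -inline-level=1). */
static inline float scale_and_offset(float x, float scale, float offset)
{
    return x * scale + offset;
}

/* Applies the helper to every element of src and stores the result in dst.
   Once the call is inlined, the loop body is a multiply and an add. */
void transform(float *dst, const float *src, size_t n, float scale, float offset)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = scale_and_offset(src[i], scale, offset);
    }
}
```

Whether the loop is actually vectorized still depends on the compiler, the target, and the other options in effect; the compiler's vectorization report is the place to confirm it.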

Alternatively, use the #include <mathimf.h> header instead of the standard #include <math.h> header to call highly optimized and accurate mathematical functions commonly used in applications that rely heavily on floating point computations.

Read More:

Vectorization Resources for Intel® Advisor Users

Recommendation: Use the Intel short vector math library for vector intrinsics

Your application calls scalar instead of vectorized versions of math functions. To fix: Do all of the following:

Use the -mveclibabi=svml compiler option to specify the Intel short vector math library ABI type for vector intrinsics.
Use the -ftree-vectorize and -funsafe-math-optimizations compiler options to enable vector math functions.
Use the -L/path/to/intel/lib and -lsvml compiler options to specify an SVML ABI-compatible library at link time.

Example:

gcc program.c -O2 -ftree-vectorize -funsafe-math-optimizations -mveclibabi=svml -L/opt/intel/lib/intel64 -lm -lsvml -Wl,-rpath=/opt/intel/lib/intel64

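#include <math.h>
#include <stdio.h>
#include <stdlib.h>   // srand, rand
#define N 100000

// static storage keeps the two large arrays off the stack
static double angles[N], results[N];

int main(void)
{
   int i;
   srand(86456);

   for (i = 0; i < N; i++)
   {
      angles[i] = rand();
   }

   // with the options above, this loop can be auto-vectorized
   for (i = 0; i < N; i++)
   {
      results[i] = cos(angles[i]);
   }

   // use a result so the compiler cannot discard the loop
   printf("%f\n", results[N - 1]);
   return 0;
}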

Recommendation: Use a Glibc library with vectorized SVML functions

Your application calls scalar instead of vectorized versions of math functions. To fix: Do all of the following:

Upgrade the Glibc library to version 2.22 or higher. It supports SIMD directives in OpenMP* 4.0 or higher.
Upgrade the GNU* gcc compiler to version 4.9 or higher. It supports vectorized math function options.
Use the -fopenmp and -ffast-math compiler options to enable vector math functions.
Use appropriate OpenMP SIMD directives to enable vectorization.

Note: Also use the -I/path/to/glibc/install/include and -L/path/to/glibc/install/lib compiler options if you have multiple Glibc libraries installed on the host.

Example:

gcc program.c -O2 -fopenmp -ffast-math -lrt -lm -mavx2 -I/opt/glibc-2.22/include -L/opt/glibc-2.22/lib -Wl,--dynamic-linker=/opt/glibc-2.22/lib/ld-linux-x86-64.so.2

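#include <math.h>
#include <stdio.h>
#include <stdlib.h>   // srand, rand
#define N 100000

// static storage keeps the two large arrays off the stack
static double angles[N], results[N];

int main(void)
{
   int i;
   srand(86456);

   for (i = 0; i < N; i++)
   {
      angles[i] = rand();
   }

   // ask the compiler to vectorize the loop, including the cos call
   #pragma omp simd
   for (i = 0; i < N; i++)
   {
      results[i] = cos(angles[i]);
   }

   // use a result so the compiler cannot discard the loop
   printf("%f\n", results[N - 1]);
   return 0;
}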
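
The vectorized Glibc math functions are exposed to the compiler through OpenMP declare simd annotations in the library headers. The same directive is available for user code; the sketch below assumes a hypothetical user function cube_plus_one and a hypothetical wrapper apply, and only illustrates the mechanism.

```c
#include <assert.h>

/* Hypothetical user function. With OpenMP enabled, declare simd asks the
   compiler to also generate a vector variant callable from SIMD loops. */
#pragma omp declare simd
double cube_plus_one(double x)
{
    return x * x * x + 1.0;
}

/* Applies cube_plus_one to n elements. The simd directive requests
   vectorization of the loop, including the call in its body. */
void apply(double *out, const double *in, int n)
{
    int i;
    assert(n >= 0);
    #pragma omp simd
    for (i = 0; i < n; i++) {
        out[i] = cube_plus_one(in[i]);
    }
}
```

Without an OpenMP option such as -fopenmp, the pragmas are ignored and the code runs as an ordinary scalar loop with the same results.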

Recommendation: Vectorize math function calls inside loops

Your application calls serialized versions of math functions when you use the precise floating point model. To fix: Do one of the following:

Add the fast-transcendentals compiler option to replace calls to transcendental functions with faster calls.

Windows* OS	Linux* OS
/Qfast-transcendentals	-fast-transcendentals

CAUTION: This may reduce floating point accuracy.

Enforce vectorization of the source loop using a directive: #pragma simd or #pragma omp simd

Example:

void add_floats(float *a, float *b, float *c, float *d, float *e, int n)
{
  int i; 
  #pragma omp simd
  for (i=0; i<n; i++)
  {
    a[i] = a[i] + b[i] + c[i] + d[i] + e[i];
  } 
}

Read More:

Getting Started with Intel Compiler Pragmas and Directives and Vectorization Resources for Intel® Advisor Users

Recommendation: Change the floating point model

Your application calls serialized versions of math functions when you use the strict floating point model. To fix: Do one of the following:

Use the fast floating point model to enable more aggressive optimizations, or use the precise floating point model, which disables optimizations that are not value-safe, together with the fast-transcendentals option.

Windows* OS	Linux* OS
/fp:fast	-fp-model fast
/fp:precise /Qfast-transcendentals	-fp-model precise -fast-transcendentals

CAUTION: This may reduce floating point accuracy.

Use the precise floating point model and enforce vectorization of the source loop using a directive: #pragma simd or #pragma omp simd

Example:

icc program.c -O2 -fopenmp -fp-model precise -fast-transcendentals

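// collapse(2) needs two perfectly nested loops with distinct loop variables;
// a, b, and c are assumed to be N x N arrays
#pragma omp simd collapse(2)
for (i = 0; i < N; i++)
{
  for (j = 0; j < N; j++)
  {
    a[i][j] = b[i][j] * c[i][j];
  }
}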
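
The accuracy caution for the floating point models comes down to value safety: floating point addition is not associative, so an optimization that regroups operations can change the result. The sketch below uses two hypothetical helper functions to show the effect with plain C arithmetic; it does not depend on any particular compiler option.

```c
/* Adds three values, grouping the first two together: (a + b) + c. */
double sum_left_first(double a, double b, double c)
{
    return (a + b) + c;
}

/* Adds the same three values, grouping the last two together: a + (b + c). */
double sum_right_first(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20, and c = 1.0, the first grouping cancels the large terms before adding 1.0, while in the second the 1.0 is absorbed by rounding when it is added to -1e20, so the two functions return different values. A value-safe model keeps the grouping written in the source; a fast model allows the compiler to choose either.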

Read More:

Getting Started with Intel Compiler Pragmas and Directives and Vectorization Resources for Intel® Advisor Users