Intel® Fortran Compiler 16.0 User and Reference Guide

qopt-gather-scatter-unroll, Qopt-gather-scatter-unroll

Lets you specify an alternative loop unroll sequence for gather and scatter loops. Option -qopt-gather-scatter-unroll is the replacement option for -opt-gather-scatter-unroll, which is deprecated.

Architecture Restrictions

Only available on Intel® 64 architecture targeting the Intel® Xeon Phi™ coprocessor x100 product family (formerly code name Knights Corner).

Syntax

Linux:

-qopt-gather-scatter-unroll=n

-qno-opt-gather-scatter-unroll

OS X:

None

Windows:

/Qopt-gather-scatter-unroll:n

/Qopt-gather-scatter-unroll-

Arguments

n

Is the unroll factor for the gather and scatter loops. It must be an integer between 0 and 8. Specifying 0 for n is the same as specifying the negative form of the option.

Default

-qno-opt-gather-scatter-unroll or /Qopt-gather-scatter-unroll-

The compiler uses default heuristics when unrolling gather and scatter loops.

Description

This option lets you specify an alternative loop unroll sequence for gather and scatter loops.

This option may improve performance of gather/scatter operations.

The value of n that provides the best performance is data-dependent.

When the gather/scatter operation accesses data in a small number of cache lines (such as 1 or 2), the default sequence (using a small value for n) works best. When each individual data item falls in a different cache line, a large value for n may work better.
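As an illustration of the kind of source this option is aimed at, the following sketch shows two loops with indirect (indexed) addressing, one reading through an index array and one writing through it. It is not taken from the compiler documentation: the module and procedure names (gs_kernels, gather_copy, scatter_copy) are hypothetical, and whether a particular build vectorizes these loops into gather/scatter loops is an assumption that should be checked against the vectorization report.

```fortran
! Illustrative kernels with indirect (indexed) addressing.
! Assumption: the compiler vectorizes these loops using gather and
! scatter operations; this is not guaranteed for every build.
!
! The unroll option would be added to the existing compile line, for
! example with an unroll factor of 4:
!   Linux:    -qopt-gather-scatter-unroll=4
!   Windows:  /Qopt-gather-scatter-unroll:4
module gs_kernels
  implicit none
contains

  ! Gather: read src through an index array, write dst contiguously.
  ! Assumes size(dst) >= size(idx) and every idx(i) is valid for src.
  subroutine gather_copy(dst, src, idx)
    real,    intent(inout) :: dst(:)
    real,    intent(in)    :: src(:)
    integer, intent(in)    :: idx(:)   ! 1-based positions in src
    integer :: i
    do i = 1, size(idx)
      dst(i) = src(idx(i))
    end do
  end subroutine gather_copy

  ! Scatter: read src contiguously, write dst through an index array.
  ! Assumes idx holds no duplicate positions.
  subroutine scatter_copy(dst, src, idx)
    real,    intent(inout) :: dst(:)
    real,    intent(in)    :: src(:)
    integer, intent(in)    :: idx(:)   ! 1-based positions in dst
    integer :: i
    do i = 1, size(idx)
      dst(idx(i)) = src(i)
    end do
  end subroutine scatter_copy

end module gs_kernels
```

Assuming 4-byte reals and 64-byte cache lines, an index array such as 1, 2, 3, ... keeps a 16-element gather within one or two cache lines, while an index array such as 1, 17, 33, ... places every element in a different cache line. These are the two access patterns the guidance above distinguishes.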

IDE Equivalent

None

Alternate Options

None

Example

Normally, there are no "one-shot" gather/scatter instructions, so the compiler generates a loop to perform a complete gather or scatter. By default, the loop looks as follows:

L1:
   gather
   jkz L2
   gather
   jknz L1
L2:

For some applications, this loop would be faster if it were unrolled, and different applications may benefit from different unroll factors. Also, when the loop is unrolled, adding gather/scatter hint instructions before the loop provides additional benefits.

If you specify option [q or Q]opt-gather-scatter-unroll, the compiler will generate a similar loop unrolled by the number specified in n.

The following example shows what happens when the -qopt-gather-scatter-unroll=3 (Linux*) or /Qopt-gather-scatter-unroll:3 option (Windows*) is specified. Notice that the alternate sequence also generates two gather/scatter hint instructions preceding the loop:

   gather hint
   gather hint
   nop
L1:
   gather
   jkz L2
   gather |
   gather |  -> the number of gathers specified by n
   gather |
   jknz L1
L2:
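
To see why the best value of n depends on the data, the two loop shapes above can be walked through with a small model. The sketch below is not compiler output. It assumes that each gather step completes all pending elements that share one cache line, so a gather touching k cache lines needs k useful steps. The module and procedure names (gather_loop_model, count_lines, simulate) are hypothetical. The model counts how many gather steps are issued and how many mask tests (jkz or jknz) are executed.

```fortran
! Simplified model of the gather loops shown above (illustrative only).
! Assumption: one gather step completes every pending element whose
! address lies in a single cache line.
module gather_loop_model
  implicit none
  private
  public :: count_lines, simulate

contains

  ! Number of distinct cache lines touched by the element offsets idx,
  ! assuming nonnegative 0-based offsets, elems_per_line consecutive
  ! elements per line, and offset 0 at the start of a line.
  pure function count_lines(idx, elems_per_line) result(nlines)
    integer, intent(in) :: idx(:)
    integer, intent(in) :: elems_per_line
    integer :: nlines, i, j
    logical :: seen
    nlines = 0
    do i = 1, size(idx)
      seen = .false.
      do j = 1, i - 1
        if (idx(j) / elems_per_line == idx(i) / elems_per_line) seen = .true.
      end do
      if (.not. seen) nlines = nlines + 1
    end do
  end function count_lines

  ! Walk the loop shape
  !    L1: gather; jkz L2; <body gathers>; jknz L1; L2:
  ! for a gather that touches nlines cache lines. body = 1 models the
  ! default sequence; body = n models the sequence unrolled by n.
  pure subroutine simulate(nlines, body, gathers, branches)
    integer, intent(in)  :: nlines, body
    integer, intent(out) :: gathers, branches
    integer :: pending
    pending  = nlines
    gathers  = 0
    branches = 0
    do
      pending  = max(pending - 1, 0)     ! first gather of the iteration
      gathers  = gathers + 1
      branches = branches + 1            ! jkz: leave if nothing is pending
      if (pending == 0) exit
      pending  = max(pending - body, 0)  ! body gathers, no tests in between
      gathers  = gathers + body
      branches = branches + 1            ! jknz: repeat if work remains
      if (pending == 0) exit
    end do
  end subroutine simulate

end module gather_loop_model
```

Under these assumptions, a gather that touches 16 distinct cache lines issues 16 gather steps with either shape, but executes 16 mask tests with the default shape (body = 1) and only 8 when unrolled by 3. A gather that touches 2 cache lines issues 2 gather steps with the default shape and 4 when unrolled by 3, two of which do no work. This is consistent with the guidance that a small n suits accesses confined to a few cache lines and a large n suits widely spread accesses.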