Intel® C++ Compiler 16.0 User and Reference Guide
Enables or disables loop unrolling and jamming. These pragmas can only be applied to iterative for loops.
    #pragma unroll_and_jam
    #pragma unroll_and_jam (n)
    #pragma nounroll_and_jam

n: The unrolling factor representing the number of times to unroll a loop; it must be an integer constant from 0 through 255.
The unroll_and_jam pragma partially unrolls one or more loops higher in the nest than the innermost loop and fuses (jams) the resulting loop copies back together. This transformation allows more data reuse within the loop nest.
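To make the transformation concrete, the following sketch writes out by hand roughly what unrolling an outer loop by 2 and jamming the copies looks like. It is an illustration only, not the code the compiler actually generates; the size N and the function names matmul_plain, matmul_jammed, and jam_matches_plain are hypothetical and chosen for this example.

```c
#include <assert.h>

#define N 8  /* illustrative size; even, so unrolling by 2 leaves no remainder */

/* Original nest: i is the outer loop that will be unrolled and jammed. */
void matmul_plain(int a[N][N], int b[N][N], int c[N][N])
{
    int i, j, k;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            for (k = 0; k < N; k++)
                a[i][j] += b[i][k] * c[k][j];
}

/* Hand-written equivalent of unrolling i by 2 and jamming: the two copies
   of the j/k loops are fused, so each c[k][j] is loaded once and used twice. */
void matmul_jammed(int a[N][N], int b[N][N], int c[N][N])
{
    int i, j, k;
    assert(N % 2 == 0);  /* this sketch has no remainder loop */
    for (i = 0; i < N; i += 2)
        for (j = 0; j < N; j++)
            for (k = 0; k < N; k++) {
                int ckj = c[k][j];
                a[i][j]     += b[i][k]     * ckj;
                a[i + 1][j] += b[i + 1][k] * ckj;
            }
}

/* Returns 1 if both versions produce identical results on sample data. */
int jam_matches_plain(void)
{
    static int a1[N][N], a2[N][N], b[N][N], c[N][N];
    int i, j;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            a1[i][j] = a2[i][j] = 0;
            b[i][j] = i + j;
            c[i][j] = i - j;
        }
    matmul_plain(a1, b, c);
    matmul_jammed(a2, b, c);
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            if (a1[i][j] != a2[i][j])
                return 0;
    return 1;
}
```

In the jammed version the innermost loop body carries two independent updates that share the value c[k][j], which is the kind of reuse the transformation is meant to expose.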
This pragma is not effective on innermost loops. Ensure that the immediately following loop is not the innermost loop after compiler-initiated interchanges are completed.
Specifying this pragma is a hint to the compiler that the unroll and jam sequence is legal and profitable. The compiler enables this transformation whenever possible.
The unroll_and_jam pragma must precede the for statement for each for loop it affects. If n is specified, the optimizer unrolls the loop n times. If n is omitted or is outside the allowed range, the optimizer chooses the number of times to unroll the loop. The compiler compares n with the loop count so that the generated code remains correct.
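One common way to keep an unrolled and jammed loop correct when the loop count is not a multiple of the unroll factor is to follow the jammed loop with a remainder loop. The sketch below shows that idea by hand for a matrix-vector product. It is an assumption about the general technique, not a description of what this compiler emits; MAXN, UF, matvec_jammed, and matvec_check are hypothetical names.

```c
#include <assert.h>

#define MAXN 16  /* hypothetical upper bound on the problem size */
#define UF 4     /* illustrative unroll factor */

/* y[i] += sum over j of m[i][j] * x[j], with the i loop unrolled by UF
   and the UF copies of the j loop jammed into one. */
void matvec_jammed(int n, int m[MAXN][MAXN], int x[MAXN], int y[MAXN])
{
    int i, j, main_end;
    assert(n >= 0 && n <= MAXN);
    main_end = n - (n % UF);  /* largest multiple of UF not exceeding n */

    /* Main part: i advances by UF; x[j] is loaded once per four updates. */
    for (i = 0; i < main_end; i += UF)
        for (j = 0; j < n; j++) {
            int xj = x[j];
            y[i]     += m[i][j]     * xj;
            y[i + 1] += m[i + 1][j] * xj;
            y[i + 2] += m[i + 2][j] * xj;
            y[i + 3] += m[i + 3][j] * xj;
        }

    /* Remainder: the last n % UF rows run in their original form. */
    for (; i < n; i++)
        for (j = 0; j < n; j++)
            y[i] += m[i][j] * x[j];
}

/* Returns 1 if the jammed version matches a plain reference for size n. */
int matvec_check(int n)
{
    static int m[MAXN][MAXN];
    int x[MAXN], y[MAXN], ref[MAXN];
    int i, j;
    for (i = 0; i < MAXN; i++) {
        x[i] = i - 3;
        y[i] = ref[i] = i;
        for (j = 0; j < MAXN; j++)
            m[i][j] = 2 * i + j;
    }
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            ref[i] += m[i][j] * x[j];
    matvec_jammed(n, m, x, y);
    for (i = 0; i < MAXN; i++)  /* rows at or beyond n must stay untouched */
        if (y[i] != ref[i])
            return 0;
    return 1;
}
```

Calling matvec_check with sizes such as 3, 4, or 7 exercises the cases where the loop count is smaller than, equal to, or not a multiple of the unroll factor.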
This pragma is supported only when compiler option O3 is set. The unroll_and_jam pragma overrides any setting of loop unrolling from the command line.
When unrolling a loop increases register pressure and code size, it may be necessary to prevent unrolling of a nested loop or an imperfectly nested loop. In such cases, use the nounroll_and_jam pragma. The nounroll_and_jam pragma hints to the compiler not to unroll a specified loop.
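As a minimal sketch of where nounroll_and_jam would be placed, the example below marks the outer loop of an imperfect nest. The function name row_sums and the sizes ROWS and COLS are hypothetical; compilers that do not recognize the pragma typically ignore it, possibly with a warning, so the function computes the same result either way.

```c
#define ROWS 4  /* illustrative dimensions */
#define COLS 4

/* Computes the sum of each row of in[][] into sums[]. */
void row_sums(int in[ROWS][COLS], int sums[ROWS])
{
    int i, j;
    /* Hint: leave this outer loop as written (no unroll and jam). */
#pragma nounroll_and_jam
    for (i = 0; i < ROWS; i++) {
        sums[i] = 0;  /* statement outside the inner loop: imperfect nest */
        for (j = 0; j < COLS; j++) {
            sums[i] += in[i][j];
        }
    }
}
```

The pragma sits directly before the for statement it applies to, in the same position an unroll_and_jam pragma would occupy.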
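Example: Using the unroll_and_jam pragma

    int a[10][10];
    int b[10][10];
    int c[10][10];
    int d[10][10];

    /* n must not exceed 10, the extent of the arrays above. */
    void unroll(int n) {
      int i, j, k;
      /* Request unroll and jam by 6 on the outermost loop. */
      #pragma unroll_and_jam (6)
      for (i = 1; i < n; i++) {
        /* Also request it on the middle loop; k remains the innermost loop. */
        #pragma unroll_and_jam (6)
        for (j = 1; j < n; j++) {
          for (k = 1; k < n; k++) {
            a[i][j] += b[i][k] * c[k][j];
          }
        }
      }
    }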