Intel® Advisor
provides two design tools to help you take full performance advantage of
today's processors:
- Threading Advisor is a
threading design and prototyping tool that lets you analyze, design, tune, and
check threading design options without disrupting your normal development.
- Vectorization Advisor is a
vectorization design tool that lets you identify loops that will benefit most
from vectorization, identify what is blocking vectorization, forecast the
benefit of alternative data reorganizations, and increase the confidence that
vectorization is safe.
The following is a glossary for the Vectorization Advisor. It is a
work in progress.
- call count: The
number of times a loop is invoked.
- directive: A
programming language construct that specifies how a compiler should process
input. Same as a C/C++ pragma.
- filling: Moving a variable from main memory to a register. Operating on
variables held in registers is faster than operating on variables in main
memory.
- ICC: Command for invoking the Intel® C Compiler on the Linux* platform.
Often used as shorthand for the compiler itself.
- ICL: Command for invoking the Intel® C/C++ Compiler on the Microsoft
Windows* platform. Often used as shorthand for the compiler itself.
- ICPC: Command for invoking the Intel® C++ Compiler on the Linux* platform.
Often used as shorthand for the compiler itself.
- IFORT: Command for invoking the Intel® Fortran Compiler on the Windows* and
Linux* platforms. Often used as shorthand for the compiler itself.
- loop body: The vectorized loop, usually compiler-generated, that is produced
from a source loop. Same as kernel loop.
- peeled loop: A small, (usually) compiler-generated loop created to align the
memory accesses inside the loop body and maximize its efficiency. The compiler
peels off any initial iterations containing misaligned accesses, which leaves
the remaining iterations' memory accesses optimally aligned. A peeled loop
always has a trip count smaller than the vector length.
- register pressure: A condition in which the optimal number of registers is
not available for variable allocation. High register pressure may result in
spilling.
- remainder loop: A
(usually) compiler-generated loop created to clean up any remaining iterations
that do not fit within the scope of the loop body. The compiler typically
generates remainder loops when the source loop trip count is not a multiple of
the vector length.
- SIMD: Single instruction, multiple data. A class of processor instructions
that perform the same operation on multiple pieces of data (such as elements
of an array) at once.
- source loop: A
developer-written loop as it appears in source code.
- spilling: Moving a variable from a register to main memory. A spilled
variable must be loaded from main memory for every read and stored back to
main memory for every write, resulting in poorer performance.
- trip count: The number of times the body of a loop will execute. Same as
iteration count (and sometimes referred to as loop count in Intel compiler
documentation).
- unroll: Optimize a loop by duplicating its body, thus reducing the branching
overhead and the number of loop iterations that must execute. A complete
unroll fully duplicates the loop body such that no repetition is required. A
partial unroll of size n duplicates the body n times and reduces the number of
iterations to 1/n of the original iteration count.
- vector length: Number of elements that can be processed in a single vector
operation. Ideal vector length = vector register width in bits / data type
size in bits. For example, a 256-bit register holds eight 32-bit elements.
- vector register width: The number of bits in the processor vector
registers. Intel®
Streaming SIMD Extensions 2 (Intel® SSE2) instructions operate on 128-bit
registers; Intel® Advanced Vector Extensions (Intel® AVX) instructions operate
on 256-bit registers; Intel® Many Integrated Core Instructions (Intel® MIC
Instructions) operate on 512-bit registers.
- vectorize: Generate
code that takes advantage of processor vectorization hardware, usually by
executing SIMD instructions.
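
As a rough illustration of how the terms vector length, peeled loop, loop
body, and remainder loop relate, the following Python sketch computes the
ideal vector length and splits a source loop's trip count into peeled,
vectorized, and remainder iterations. The function names and the
`misaligned_elements` parameter are hypothetical, introduced only for this
example; real compilers decide whether and how to peel using their own
heuristics, so treat this as arithmetic under stated assumptions rather than a
description of any compiler's behavior.

```python
def ideal_vector_length(register_width_bits: int, element_size_bits: int) -> int:
    """Ideal vector length = vector register width in bits / data type size in bits."""
    if element_size_bits <= 0 or register_width_bits % element_size_bits != 0:
        raise ValueError("register width must be a positive multiple of element size")
    return register_width_bits // element_size_bits


def split_trip_count(trip_count: int, vector_length: int, misaligned_elements: int = 0):
    """Split a source loop's trip count into (peeled, body, remainder).

    misaligned_elements is a hypothetical input: how many elements past an
    aligned boundary the first memory access starts (0 means already aligned).

    Returns:
        peeled    - scalar iterations executed before the first aligned access
        body      - number of full vector iterations in the loop body
        remainder - scalar iterations left over after the loop body
    """
    if trip_count < 0 or vector_length <= 0:
        raise ValueError("trip_count must be >= 0 and vector_length must be > 0")
    if not 0 <= misaligned_elements < vector_length:
        raise ValueError("misaligned_elements must be in [0, vector_length)")

    # Peel just enough iterations to reach the next aligned boundary;
    # this is always smaller than the vector length.
    peeled = (vector_length - misaligned_elements) % vector_length
    peeled = min(peeled, trip_count)

    remaining = trip_count - peeled
    body = remaining // vector_length      # full vector iterations
    remainder = remaining % vector_length  # iterations that do not fill a vector
    return peeled, body, remainder


if __name__ == "__main__":
    vl = ideal_vector_length(256, 32)  # 256-bit registers, 32-bit elements
    print(vl, split_trip_count(1003, vl, misaligned_elements=3))
```

Under these assumptions, 32-bit elements in 256-bit registers give a vector
length of 8, and a 1003-iteration loop whose first access starts 3 elements
past an aligned boundary splits into 5 peeled iterations, 124 vector
iterations, and 6 remainder iterations (5 + 124 × 8 + 6 = 1003).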