Intel® C++ Compiler 16.0 User and Reference Guide
This topic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Most intrinsic functions map directly to Intel® Initial Many Core Instructions (Intel® IMCI). Intel® IMCI extends the existing vector graphic streaming SIMD instructions of the Intel® 64 architecture.
The exclusive features of the intrinsic functions (and corresponding instructions) are:
The intrinsics/instructions operate in the same memory address space as the standard Intel® 64 instructions.
They use special registers that enable packed data of up to 512 bits in length for optimal vector graphic streaming SIMD processing.
The native data types pack up to sixteen 32-bit elements (float or integer) into one register.
For each computational and data manipulation instruction, there are two corresponding C intrinsics that implement that instruction directly.
Vector mask support is provided through eight vector mask registers, which allow conditional execution over the 16 elements of a vector instruction; the results are merged into the original destination.
Most functions are ternary: two sources with a different destination or three sources with one source as destination.
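The mask-and-merge behavior listed above can be illustrated with a plain C model. This is a minimal sketch, not a compiler intrinsic: the name masked_add_model is hypothetical, and the 16-bit mask type is an assumption based on one mask bit per 32-bit element. Lanes whose mask bit is clear keep the value already in the destination.

```c
#include <stdint.h>

#define LANES 16  /* sixteen 32-bit elements per 512-bit vector */

/* Hypothetical scalar model of a write-masked add.
   Lane i of dst is replaced by a[i] + b[i] only when bit i of mask is set;
   every other lane keeps its previous dst value (merge behavior). */
void masked_add_model(float dst[LANES], uint16_t mask,
                      const float a[LANES], const float b[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        if ((mask >> i) & 1u) {
            dst[i] = a[i] + b[i];
        }
    }
}
```

Calling masked_add_model with a mask of 0x0005 updates only lanes 0 and 2; the remaining fourteen lanes of dst are left unchanged.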
The intrinsics/instructions operate on special registers referred to as vectors (v). A vector is defined as a sequence of packed data elements. Intrinsics/instructions use at least one vector.
There are 32 vectors (v0 to v31), each 512 bits in size; each packs sixteen 32-bit elements or eight 64-bit elements.
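The element counts follow from dividing the 512-bit register width by the element width. The sketch below models one register as a C union to make the two packings visible. The names vec512_model and lanes_for_width are hypothetical, and the layout assumes the usual 4-byte float and 8-byte double.

```c
#include <stdint.h>

/* Hypothetical model of one 512-bit vector register (64 bytes).
   All members overlay the same storage. */
typedef union {
    float   f32[16];   /* sixteen 32-bit floating-point elements */
    int32_t i32[16];   /* sixteen 32-bit integer elements */
    double  f64[8];    /* eight 64-bit floating-point elements */
    int64_t i64[8];    /* eight 64-bit integer elements */
    uint8_t bytes[64]; /* raw view: 512 bits = 64 bytes */
} vec512_model;

/* Number of elements of the given width (in bits) that fit in 512 bits. */
int lanes_for_width(int element_bits)
{
    return 512 / element_bits;
}
```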
In addition to the SIMD processing capability, special instructions such as vgather/vscatter support irregular memory access patterns, enabling vectorization of algorithms with complex data structures.
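What a gather and a scatter do can be sketched in scalar C. This is an illustrative model only: gather_model and scatter_model are hypothetical names, the 32-bit element indices are an assumption, and real instructions have addressing and masking details that are not modeled here.

```c
#include <stdint.h>

#define LANES 16  /* sixteen 32-bit elements per 512-bit vector */

/* Hypothetical gather model: load one element per lane from
   non-contiguous locations, dst[i] = base[index[i]]. */
void gather_model(float dst[LANES], const float *base,
                  const int32_t index[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        dst[i] = base[index[i]];
    }
}

/* Hypothetical scatter model: store one element per lane to
   non-contiguous locations, base[index[i]] = src[i].
   In this model, if two lanes share an index, the higher lane wins. */
void scatter_model(float *base, const int32_t index[LANES],
                   const float src[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        base[index[i]] = src[i];
    }
}
```

A typical pattern is gather, compute on the packed lanes, then scatter the results back through the same index vector, which is how an indirectly indexed loop such as a[idx[i]] += 1 can be expressed in vector form.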