Intel® C++ Compiler 16.0 User and Reference Guide

Details about Intel® Streaming SIMD Extensions Intrinsics

Intel® Streaming SIMD Extensions (Intel® SSE) instructions use the following features:

Registers

Intel® Streaming SIMD Extensions use eight 128-bit registers (XMM0 to XMM7).

Because each of these registers can hold more than one data element, the processor can operate on several data elements with a single instruction. This capability is known as single-instruction, multiple-data (SIMD) processing.

For each computational and data manipulation instruction in the new extension sets, there is a corresponding C intrinsic that implements that instruction directly. This frees you from managing registers and writing assembly code. In addition, the compiler can optimize instruction scheduling around the intrinsics, which can make your executable run faster.
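As a minimal sketch of the idea, assuming a compiler that provides the `<xmmintrin.h>` header, the function below adds four pairs of single-precision values with one `_mm_add_ps` call. The name `add4` is hypothetical and used only for illustration.

```c
#include <xmmintrin.h>  /* Intel SSE intrinsics */

/* Hypothetical helper: out[i] = a[i] + b[i] for i = 0..3. */
void add4(const float *a, const float *b, float *out)
{
    __m128 va  = _mm_loadu_ps(a);     /* unaligned load of four floats */
    __m128 vb  = _mm_loadu_ps(b);
    __m128 sum = _mm_add_ps(va, vb);  /* four additions in one operation */
    _mm_storeu_ps(out, sum);          /* unaligned store of four floats */
}
```

The unaligned load and store forms are used here so that the sketch works with any float array; no register names appear in the source, and the compiler chooses them.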

Note

The MM and XMM registers are the SIMD registers used by the systems based on IA-32 architecture to implement MMX™ technology and Intel® SSE or Intel® Streaming SIMD Extensions 2 (Intel® SSE2).

Data Types

These intrinsic functions use four new C data types as operands. The data types represent the new registers on which the intrinsics operate.

New Data Types

The following table shows the sets of intrinsics for which each of the new data types is available.

New Data Type    Intel® SSE Intrinsics    Intel® SSE2 Intrinsics    Intel® SSE3 Intrinsics
__m64            Available                Available                 Available
__m128           Available                Available                 Available
__m128d          Not available            Available                 Available
__m128i          Not available            Available                 Available

__m128 Data Types

The __m128 data type represents the contents of an Intel® SSE register used by Intel® SSE intrinsics. The __m128 data type can hold four 32-bit floating-point values.

The __m128d data type can hold two 64-bit floating-point values.

The __m128i data type can hold sixteen 8-bit, eight 16-bit, four 32-bit, or two 64-bit integer values.
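The sketch below illustrates the element layouts just described, assuming a compiler that provides `<emmintrin.h>`. The function names (`scale2_pd`, `add4_epi32`, `add16_epi8`) are hypothetical; only the intrinsics they call come from the SSE2 intrinsic set.

```c
#include <emmintrin.h>  /* Intel SSE2 intrinsics (also brings in Intel SSE) */

/* __m128d: two 64-bit floating-point values. out[i] = in[i] * factor. */
void scale2_pd(const double *in, double factor, double *out)
{
    __m128d v = _mm_loadu_pd(in);
    __m128d f = _mm_set1_pd(factor);      /* factor copied to both lanes */
    _mm_storeu_pd(out, _mm_mul_pd(v, f));
}

/* __m128i viewed as four 32-bit integers. out[i] = a[i] + b[i]. */
void add4_epi32(const int *a, const int *b, int *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}

/* The same __m128i type viewed as sixteen 8-bit integers.
   Each byte sum wraps around modulo 256. */
void add16_epi8(const unsigned char *a, const unsigned char *b,
                unsigned char *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi8(va, vb));
}
```

Note that `__m128i` itself carries no element width; the intrinsic suffix (`epi8`, `epi32`, and so on) selects how the 128 bits are interpreted.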

The compiler aligns __m128d and __m128i local and global data to 16-byte boundaries on the stack. To align integer, float, or double arrays, you can use the __declspec(align) statement.
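A minimal sketch of array alignment follows. `ALIGN16` is a hypothetical convenience macro, not part of any header: it expands to `__declspec(align(16))` where `_MSC_VER` is defined and, as an assumption for GCC-compatible compilers, to `__attribute__((aligned(16)))` elsewhere. The array and function names are also illustrative.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* Intel SSE intrinsics */

/* ALIGN16 is a hypothetical macro chosen for this example. */
#if defined(_MSC_VER)
#  define ALIGN16 __declspec(align(16))
#else
#  define ALIGN16 __attribute__((aligned(16)))
#endif

/* A float array placed on a 16-byte boundary. */
ALIGN16 float samples[4] = { 1.0f, 2.0f, 3.0f, 4.0f };

/* Returns nonzero if p lies on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Doubles the four samples in place using the aligned load and store
   forms, which require a 16-byte-aligned address. */
void double_samples(void)
{
    __m128 v = _mm_load_ps(samples);
    v = _mm_add_ps(v, v);
    _mm_store_ps(samples, v);
}
```

Without the alignment specifier, the aligned forms `_mm_load_ps` and `_mm_store_ps` may fault at run time; the unaligned forms `_mm_loadu_ps` and `_mm_storeu_ps` accept any address.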

Data Types Usage Guidelines

These data types are not basic ANSI C data types. You must observe the following usage restrictions:

Accessing __m128i Data

To access 8-bit data:

#define _mm_extract_epi8(x, imm) \
((((imm) & 0x1) == 0) ? \
 (_mm_extract_epi16((x), (imm) >> 1) & 0xff) : \
 _mm_extract_epi16(_mm_srli_epi16((x), 8), (imm) >> 1))

For 16-bit data, use the following intrinsic:

int _mm_extract_epi16(__m128i a, int imm)

To access 32-bit data:

#define _mm_extract_epi32(x, imm) \
 _mm_cvtsi128_si32(_mm_srli_si128((x), 4 * (imm)))
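The following sketch shows how the access patterns above might be used, assuming a compiler that provides `<emmintrin.h>`. The macros are given hypothetical `sse2_` names so that they cannot clash with the SSE4.1 intrinsics that some headers declare under the original names, and the helper functions are illustrative only. In every case the index argument must be a compile-time constant.

```c
#include <emmintrin.h>  /* Intel SSE2 intrinsics */

/* Hypothetical renamed copies of the macros shown above. */
#define sse2_extract_epi8(x, imm) \
    ((((imm) & 0x1) == 0) ? \
     (_mm_extract_epi16((x), (imm) >> 1) & 0xff) : \
     _mm_extract_epi16(_mm_srli_epi16((x), 8), (imm) >> 1))

#define sse2_extract_epi32(x, imm) \
    _mm_cvtsi128_si32(_mm_srli_si128((x), 4 * (imm)))

/* Builds a vector whose byte i holds the value i (0..15). */
__m128i make_byte_ramp(void)
{
    return _mm_set_epi32(0x0F0E0D0C, 0x0B0A0908, 0x07060504, 0x03020100);
}

/* Byte 5 of the ramp: an odd index, so the high byte of word 2. */
int ramp_byte5(void)
{
    return sse2_extract_epi8(make_byte_ramp(), 5);   /* 0x05 */
}

/* 16-bit word 3 of the ramp: bytes 6 and 7. */
int ramp_word3(void)
{
    return _mm_extract_epi16(make_byte_ramp(), 3);   /* 0x0706 */
}

/* 32-bit element 2 of the ramp: bytes 8 through 11. */
int ramp_dword2(void)
{
    return sse2_extract_epi32(make_byte_ramp(), 2);  /* 0x0B0A0908 */
}
```

The 8-bit form evaluates only one branch of the conditional at run time, but both branches must compile, which is why the index has to be a constant in either case.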
