Single-precision Floating-point Vector Intrinsics

The single-precision floating-point vector intrinsics listed here are designed for the Intel® Pentium® 4 processor with Streaming SIMD Extensions 3 (Intel® SSE3).

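Each intrinsic returns a 128-bit `__m128` value that holds four single-precision floating-point elements. In the tables below, R0, R1, R2, and R3 denote the elements of that result, and a0 to a3 and b0 to b3 denote the elements of the operands a and b. Index 0 is the lowest-order element.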

The prototypes for these intrinsics are in the pmmintrin.h header file.

Intrinsic Name	Operation	Corresponding Intel® SSE3 Instruction
_mm_addsub_ps	Subtract and add	`ADDSUBPS`
_mm_hadd_ps	Horizontal add	`HADDPS`
_mm_hsub_ps	Horizontal subtract	`HSUBPS`
_mm_movehdup_ps	Duplicate odd elements	`MOVSHDUP`
_mm_moveldup_ps	Duplicate even elements	`MOVSLDUP`

_mm_addsub_ps

extern __m128 _mm_addsub_ps(__m128 a, __m128 b);

Subtracts the even-indexed elements of b from those of a, and adds the odd-indexed elements of b to those of a.

R0	R1	R2	R3
a0 - b0;	a1 + b1;	a2 - b2;	a3 + b3;
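
As a minimal sketch of the element pattern above, the following C helper loads two four-element arrays, applies `_mm_addsub_ps`, and stores the result. The helper name `addsub4` and the `DEMO_SSE3` macro are illustrative assumptions, not part of the header; the macro only lets GCC or Clang compile the function without a separate SSE3 compiler flag.

```c
#include <pmmintrin.h>

/* Assumption: enable SSE3 per function on GCC/Clang; empty elsewhere. */
#if defined(__GNUC__)
#define DEMO_SSE3 __attribute__((target("sse3")))
#else
#define DEMO_SSE3
#endif

/* Hypothetical helper: r = { a0 - b0, a1 + b1, a2 - b2, a3 + b3 }. */
DEMO_SSE3 static void addsub4(const float a[4], const float b[4], float r[4])
{
    __m128 va = _mm_loadu_ps(a);   /* unaligned load of a0..a3 */
    __m128 vb = _mm_loadu_ps(b);   /* unaligned load of b0..b3 */
    _mm_storeu_ps(r, _mm_addsub_ps(va, vb));
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, this helper should produce {-9, 22, -27, 44}.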

_mm_hadd_ps

extern __m128 _mm_hadd_ps(__m128 a, __m128 b);

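Adds each pair of adjacent elements within a and within b. The sums from a fill the lower half of the result, and the sums from b fill the upper half.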

R0	R1	R2	R3
a0 + a1;	a2 + a3;	b0 + b1;	b2 + b3;
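
One common use of this pattern is summing all four elements of a vector by applying `_mm_hadd_ps` twice. The sketch below assumes that use case; the helper name `hsum4` and the `DEMO_SSE3` macro are hypothetical, and the macro only lets GCC or Clang compile the function without a separate SSE3 compiler flag.

```c
#include <pmmintrin.h>

/* Assumption: enable SSE3 per function on GCC/Clang; empty elsewhere. */
#if defined(__GNUC__)
#define DEMO_SSE3 __attribute__((target("sse3")))
#else
#define DEMO_SSE3
#endif

/* Hypothetical helper: returns v0 + v1 + v2 + v3. */
DEMO_SSE3 static float hsum4(const float v[4])
{
    __m128 x = _mm_loadu_ps(v);
    __m128 h = _mm_hadd_ps(x, x);  /* { v0+v1, v2+v3, v0+v1, v2+v3 } */
    h = _mm_hadd_ps(h, h);         /* every element holds the full sum */
    return _mm_cvtss_f32(h);       /* extract element 0 */
}
```

For the input {1, 2, 3, 4} the helper should return 10. Note that the additions are grouped as (v0 + v1) + (v2 + v3), which can round differently from a left-to-right scalar loop.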

_mm_hsub_ps

extern __m128 _mm_hsub_ps(__m128 a, __m128 b);

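Subtracts the odd-indexed element from the even-indexed element in each adjacent pair of a and of b. The differences from a fill the lower half of the result, and the differences from b fill the upper half.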

R0	R1	R2	R3
a0 - a1;	a2 - a3;	b0 - b1;	b2 - b3;
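
The following C sketch illustrates the operand order of the subtraction in each pair. The helper name `hsub4` and the `DEMO_SSE3` macro are illustrative assumptions; the macro only lets GCC or Clang compile the function without a separate SSE3 compiler flag.

```c
#include <pmmintrin.h>

/* Assumption: enable SSE3 per function on GCC/Clang; empty elsewhere. */
#if defined(__GNUC__)
#define DEMO_SSE3 __attribute__((target("sse3")))
#else
#define DEMO_SSE3
#endif

/* Hypothetical helper: r = { a0 - a1, a2 - a3, b0 - b1, b2 - b3 }. */
DEMO_SSE3 static void hsub4(const float a[4], const float b[4], float r[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(r, _mm_hsub_ps(va, vb));
}
```

With a = {5, 1, 7, 2} and b = {10, 4, 20, 5}, this helper should produce {4, 5, 6, 15}.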

_mm_movehdup_ps

extern __m128 _mm_movehdup_ps(__m128 a);

Duplicates the odd-indexed elements of a (a1 and a3) into the adjacent even-indexed positions of the result.

R0	R1	R2	R3
a1;	a1;	a3;	a3;
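
A minimal C sketch of this duplication follows. The helper name `movehdup4` and the `DEMO_SSE3` macro are illustrative assumptions; the macro only lets GCC or Clang compile the function without a separate SSE3 compiler flag.

```c
#include <pmmintrin.h>

/* Assumption: enable SSE3 per function on GCC/Clang; empty elsewhere. */
#if defined(__GNUC__)
#define DEMO_SSE3 __attribute__((target("sse3")))
#else
#define DEMO_SSE3
#endif

/* Hypothetical helper: r = { a1, a1, a3, a3 }. */
DEMO_SSE3 static void movehdup4(const float a[4], float r[4])
{
    _mm_storeu_ps(r, _mm_movehdup_ps(_mm_loadu_ps(a)));
}
```

For the input {1, 2, 3, 4} the helper should produce {2, 2, 4, 4}.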

_mm_moveldup_ps

extern __m128 _mm_moveldup_ps(__m128 a);

Duplicates the even-indexed elements of a (a0 and a2) into the adjacent odd-indexed positions of the result.

R0	R1	R2	R3
a0;	a0;	a2;	a2;
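
The two duplicate intrinsics are often combined with `_mm_addsub_ps` to multiply interleaved complex numbers. The sketch below assumes that idiom and a data layout of { re0, im0, re1, im1 }; neither comes from this reference. The helper name `cmul2` and the `DEMO_SSE3` macro are hypothetical, and `_mm_mul_ps`, `_mm_shuffle_ps`, and `_MM_SHUFFLE` are SSE facilities that `pmmintrin.h` is expected to pull in on common compilers.

```c
#include <pmmintrin.h>

/* Assumption: enable SSE3 per function on GCC/Clang; empty elsewhere. */
#if defined(__GNUC__)
#define DEMO_SSE3 __attribute__((target("sse3")))
#else
#define DEMO_SSE3
#endif

/* Hypothetical helper: multiplies two pairs of complex numbers, each
   stored as { re0, im0, re1, im1 }, and writes the two products to r. */
DEMO_SSE3 static void cmul2(const float a[4], const float b[4], float r[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __m128 re = _mm_moveldup_ps(va);  /* { a_re0, a_re0, a_re1, a_re1 } */
    __m128 im = _mm_movehdup_ps(va);  /* { a_im0, a_im0, a_im1, a_im1 } */
    /* Swap the parts of b within each pair: { b_im0, b_re0, b_im1, b_re1 } */
    __m128 sw = _mm_shuffle_ps(vb, vb, _MM_SHUFFLE(2, 3, 0, 1));
    __m128 t1 = _mm_mul_ps(re, vb);   /* { a_re*b_re, a_re*b_im, ... } */
    __m128 t2 = _mm_mul_ps(im, sw);   /* { a_im*b_im, a_im*b_re, ... } */
    /* Even slots: a_re*b_re - a_im*b_im; odd slots: a_re*b_im + a_im*b_re */
    _mm_storeu_ps(r, _mm_addsub_ps(t1, t2));
}
```

With a = {1, 2, 0, 1} and b = {3, 4, 0, 1}, which encode (1 + 2i)(3 + 4i) and (i)(i), this helper should produce {-5, 10, -1, 0}.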