MMX™ Technology Packed Arithmetic Intrinsics

The prototypes for MMX™ technology packed arithmetic intrinsics are in the mmintrin.h header file.

Intrinsic Name	Operation	Corresponding MMX™ Instruction
_mm_add_pi8	Addition	PADDB
_mm_add_pi16	Addition	PADDW
_mm_add_pi32	Addition	PADDD
_mm_adds_pi8	Addition	PADDSB
_mm_adds_pi16	Addition	PADDSW
_mm_adds_pu8	Addition	PADDUSB
_mm_adds_pu16	Addition	PADDUSW
_mm_sub_pi8	Subtraction	PSUBB
_mm_sub_pi16	Subtraction	PSUBW
_mm_sub_pi32	Subtraction	PSUBD
_mm_subs_pi8	Subtraction	PSUBSB
_mm_subs_pi16	Subtraction	PSUBSW
_mm_subs_pu8	Subtraction	PSUBUSB
_mm_subs_pu16	Subtraction	PSUBUSW
_mm_madd_pi16	Multiply and add	PMADDWD
_mm_mulhi_pi16	Multiplication	PMULHW
_mm_mullo_pi16	Multiplication	PMULLW

_mm_add_pi8

__m64 _mm_add_pi8(__m64 m1, __m64 m2);

Add the eight 8-bit values in m1 to the eight 8-bit values in m2. Results wrap around on overflow.

_mm_add_pi16

__m64 _mm_add_pi16(__m64 m1, __m64 m2);

Add the four 16-bit values in m1 to the four 16-bit values in m2. Results wrap around on overflow.

_mm_add_pi32

__m64 _mm_add_pi32(__m64 m1, __m64 m2);

Add the two 32-bit values in m1 to the two 32-bit values in m2. Results wrap around on overflow.
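The wraparound behavior of the non-saturating additions can be illustrated with a minimal sketch. It assumes an x86 target and a compiler that provides mmintrin.h; the helper name add_bytes_wrap is hypothetical and not part of the intrinsics API. The sketch calls _mm_empty() before returning, on the assumption that later code may use x87 floating point.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add two groups of eight bytes lane by lane. */
static void add_bytes_wrap(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    __m64 m1, m2, sum;

    memcpy(&m1, a, sizeof m1);      /* load eight 8-bit lanes */
    memcpy(&m2, b, sizeof m2);
    sum = _mm_add_pi8(m1, m2);      /* eight independent additions, modulo 256 */
    memcpy(out, &sum, sizeof sum);  /* store the eight results */
    _mm_empty();                    /* clear MMX state (EMMS) */
}
```

With this helper, a lane holding 250 plus a lane holding 10 yields 4, because the 8-bit sum 260 wraps modulo 256.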

_mm_adds_pi8

__m64 _mm_adds_pi8(__m64 m1, __m64 m2);

Add the eight signed 8-bit values in m1 to the eight signed 8-bit values in m2 using saturating arithmetic.

_mm_adds_pi16

__m64 _mm_adds_pi16(__m64 m1, __m64 m2);

Add the four signed 16-bit values in m1 to the four signed 16-bit values in m2 using saturating arithmetic.
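As a hedged illustration of signed saturation, the sketch below clamps each 16-bit sum to the range -32768 to 32767 instead of letting it wrap. It assumes an x86 target with mmintrin.h available; the helper name add_words_saturate is hypothetical.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add four signed 16-bit lanes with saturation. */
static void add_words_saturate(const int16_t a[4], const int16_t b[4], int16_t out[4])
{
    __m64 m1, m2, sum;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    sum = _mm_adds_pi16(m1, m2);    /* each lane is clamped to [-32768, 32767] */
    memcpy(out, &sum, sizeof sum);
    _mm_empty();                    /* clear MMX state (EMMS) */
}
```

For example, 32767 + 1 stays at 32767 and -32768 + (-1) stays at -32768, while sums that fit in 16 bits are unchanged.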

_mm_adds_pu8

__m64 _mm_adds_pu8(__m64 m1, __m64 m2);

Add the eight unsigned 8-bit values in m1 to the eight unsigned 8-bit values in m2 using saturating arithmetic.

_mm_adds_pu16

__m64 _mm_adds_pu16(__m64 m1, __m64 m2);

Add the four unsigned 16-bit values in m1 to the four unsigned 16-bit values in m2 using saturating arithmetic.
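Unsigned saturating addition is a natural fit for tasks such as brightening 8-bit pixel values, where a sum above 255 should stay at 255. The following is a sketch under that assumed use case, not a prescribed usage; the helper name brighten8 is hypothetical, and an x86 target with mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add the same amount to eight unsigned bytes, clamping at 255. */
static void brighten8(uint8_t px[8], uint8_t amount)
{
    __m64 v, inc;

    memcpy(&v, px, sizeof v);
    inc = _mm_set1_pi8((char)amount);  /* broadcast amount into all eight lanes */
    v = _mm_adds_pu8(v, inc);          /* unsigned add, each lane clamped to [0, 255] */
    memcpy(px, &v, sizeof v);
    _mm_empty();                       /* clear MMX state (EMMS) */
}
```

Calling brighten8 with an amount of 60 turns a lane holding 200 into 255 rather than the wrapped value 4.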

_mm_sub_pi8

__m64 _mm_sub_pi8(__m64 m1, __m64 m2);

Subtract the eight 8-bit values in m2 from the eight 8-bit values in m1. Results wrap around on overflow.

_mm_sub_pi16

__m64 _mm_sub_pi16(__m64 m1, __m64 m2);

Subtract the four 16-bit values in m2 from the four 16-bit values in m1. Results wrap around on overflow.

_mm_sub_pi32

__m64 _mm_sub_pi32(__m64 m1, __m64 m2);

Subtract the two 32-bit values in m2 from the two 32-bit values in m1. Results wrap around on overflow.
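The operand order of the subtraction intrinsics (m1 minus m2) and their wraparound behavior can be sketched as follows. The helper name sub_dwords_wrap is hypothetical; an x86 target with mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: compute a[i] - b[i] for two 32-bit lanes. */
static void sub_dwords_wrap(const int32_t a[2], const int32_t b[2], int32_t out[2])
{
    __m64 m1, m2, diff;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    diff = _mm_sub_pi32(m1, m2);      /* m1 - m2 per lane, modulo 2^32 */
    memcpy(out, &diff, sizeof diff);
    _mm_empty();                      /* clear MMX state (EMMS) */
}
```

Here 5 - 7 gives -2, and INT32_MIN - 1 wraps to INT32_MAX because no saturation is applied.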

_mm_subs_pi8

__m64 _mm_subs_pi8(__m64 m1, __m64 m2);

Subtract the eight signed 8-bit values in m2 from the eight signed 8-bit values in m1 using saturating arithmetic.

_mm_subs_pi16

__m64 _mm_subs_pi16(__m64 m1, __m64 m2);

Subtract the four signed 16-bit values in m2 from the four signed 16-bit values in m1 using saturating arithmetic.
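A minimal sketch of signed saturating subtraction on 8-bit lanes, where each difference is clamped to the range -128 to 127. The helper name sub_bytes_saturate is hypothetical; an x86 target with mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: compute a[i] - b[i] for eight signed bytes with saturation. */
static void sub_bytes_saturate(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
    __m64 m1, m2, diff;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    diff = _mm_subs_pi8(m1, m2);      /* m1 - m2 per lane, clamped to [-128, 127] */
    memcpy(out, &diff, sizeof diff);
    _mm_empty();                      /* clear MMX state (EMMS) */
}
```

For instance, -128 - 1 stays at -128 and 127 - (-1) stays at 127.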

_mm_subs_pu8

__m64 _mm_subs_pu8(__m64 m1, __m64 m2);

Subtract the eight unsigned 8-bit values in m2 from the eight unsigned 8-bit values in m1 using saturating arithmetic.

_mm_subs_pu16

__m64 _mm_subs_pu16(__m64 m1, __m64 m2);

Subtract the four unsigned 16-bit values in m2 from the four unsigned 16-bit values in m1 using saturating arithmetic.
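With unsigned saturation, a difference that would be negative is clamped to zero. The sketch below shows this for 16-bit lanes; the helper name sub_words_floor0 is hypothetical, and an x86 target with mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: compute a[i] - b[i] for four unsigned words, never below 0. */
static void sub_words_floor0(const uint16_t a[4], const uint16_t b[4], uint16_t out[4])
{
    __m64 m1, m2, diff;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    diff = _mm_subs_pu16(m1, m2);     /* m1 - m2 per lane, clamped to [0, 65535] */
    memcpy(out, &diff, sizeof diff);
    _mm_empty();                      /* clear MMX state (EMMS) */
}
```

For example, 10 - 20 yields 0 rather than the wrapped value 65526.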

_mm_madd_pi16

__m64 _mm_madd_pi16(__m64 m1, __m64 m2);

Multiply the four signed 16-bit values in m1 by the four signed 16-bit values in m2, producing four 32-bit intermediate products. Adjacent pairs of products are then added to produce two 32-bit results.
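The pairwise multiply-add maps directly onto a four-element dot product. The following is a sketch under the assumption that the final total fits in 32 bits; the helper name dot4_pi16 is hypothetical, and an x86 target with mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dot product of two vectors of four signed 16-bit values.
   Assumes the total fits in a 32-bit signed integer. */
static int32_t dot4_pi16(const int16_t a[4], const int16_t b[4])
{
    __m64 m1, m2, pairs;
    int32_t sums[2];

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    pairs = _mm_madd_pi16(m1, m2);    /* lane 0: a0*b0 + a1*b1, lane 1: a2*b2 + a3*b3 */
    memcpy(sums, &pairs, sizeof pairs);
    _mm_empty();                      /* clear MMX state (EMMS) */
    return sums[0] + sums[1];         /* combine the two partial sums */
}
```

For a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, the two lanes hold 17 and 53, and the function returns 70.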

_mm_mulhi_pi16

__m64 _mm_mulhi_pi16(__m64 m1, __m64 m2);

Multiply the four signed 16-bit values in m1 by the four signed 16-bit values in m2 and produce the high 16 bits of each of the four 32-bit products.

_mm_mullo_pi16

__m64 _mm_mullo_pi16(__m64 m1, __m64 m2);

Multiply the four 16-bit values in m1 by the four 16-bit values in m2 and produce the low 16 bits of each of the four 32-bit products.
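Used together, _mm_mulhi_pi16 and _mm_mullo_pi16 yield both halves of each signed 32-bit product. The sketch below recombines the halves in scalar code purely for illustration; the helper name mul_words_full is hypothetical, and an x86 target with mmintrin.h is assumed.

```c
#include <mmintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: full signed 32-bit products of four 16-bit lane pairs. */
static void mul_words_full(const int16_t a[4], const int16_t b[4], int32_t out[4])
{
    __m64 m1, m2, hi, lo;
    int16_t hi16[4];    /* high halves carry the sign */
    uint16_t lo16[4];   /* low halves are plain bit patterns */
    int i;

    memcpy(&m1, a, sizeof m1);
    memcpy(&m2, b, sizeof m2);
    hi = _mm_mulhi_pi16(m1, m2);      /* bits 31..16 of each signed product */
    lo = _mm_mullo_pi16(m1, m2);      /* bits 15..0 of each product */
    memcpy(hi16, &hi, sizeof hi);
    memcpy(lo16, &lo, sizeof lo);
    _mm_empty();                      /* clear MMX state (EMMS) */

    for (i = 0; i < 4; i++)
        out[i] = (int32_t)hi16[i] * 65536 + lo16[i];  /* hi * 2^16 + lo */
}
```

For example, -3 times 7 gives a high half of -1 and a low half of 65515, which recombine to -21.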