Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_mulhi_epu32/ _mm512_mask_mulhi_epu32

Multiplies uint32 vectors and stores the high half of each result. The corresponding instruction is VPMULHUD. This intrinsic applies only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern __m512i __cdecl _mm512_mulhi_epu32(__m512i v2, __m512i v3);

With Mask

extern __m512i __cdecl _mm512_mask_mulhi_epu32(__m512i v1_old, __mmask16 k1, __m512i v2, __m512i v3);

Parameters

v2

uint32 vector multiplied by uint32 vector v3

v3

uint32 vector multiplied by uint32 vector v2; can contain the result of a swizzle/broadcast/conversion process on a memory operand or uint32 vector

v1_old

Source vector that retains old values of the destination vector; the resulting vector gets corresponding elements from v1_old for zero mask bits

k1

Writemask; only those elements of the source vectors whose corresponding bit is set to '1' in the k1 mask are computed and stored in the result; elements in the result vector that correspond to a zero bit in k1 are copied from the corresponding elements of vector v1_old

Description

Performs an element-by-element multiplication between uint32 vector v2 and uint32 vector v3. Each pair of 32-bit elements yields a 64-bit product; the high 32 bits of each product are written to the corresponding element of the uint32 result vector.
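The per-element arithmetic can be illustrated with a scalar reference model in portable C. This is a minimal sketch, not the intrinsic or the compiler's implementation: the helper names mulhi_u32 and mulhi_epu32_model are hypothetical, and the 16-element arrays stand in for the 16 uint32 elements of a 512-bit vector.

```c
#include <stdint.h>

/* Hypothetical scalar model: high 32 bits of a 32x32-bit unsigned product. */
static uint32_t mulhi_u32(uint32_t a, uint32_t b)
{
    /* Widen to 64 bits first so the full product is kept. */
    uint64_t product = (uint64_t)a * (uint64_t)b;
    return (uint32_t)(product >> 32);
}

/* Hypothetical model of the unmasked operation: every one of the
   16 elements receives the high half of v2[i] * v3[i]. */
static void mulhi_epu32_model(uint32_t dst[16],
                              const uint32_t v2[16],
                              const uint32_t v3[16])
{
    for (int i = 0; i < 16; ++i)
        dst[i] = mulhi_u32(v2[i], v3[i]);
}
```

For example, 0x80000000 multiplied by 2 gives the 64-bit product 0x100000000, so the model stores 1; small operands such as 3 and 5 produce a product that fits in 32 bits, so the stored high half is 0.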

The masked variant takes two additional arguments: v1_old and k1. Only those elements of the source vectors whose corresponding bit is set in vector mask k1 are used in the computation. The remaining elements of the resulting vector are filled with the corresponding elements from v1_old.
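The writemask behavior can be sketched the same way with a scalar model in portable C. The function name mask_mulhi_epu32_model is hypothetical, uint16_t stands in for __mmask16, and the sketch assumes that bit i of k1 controls element i, with bit 0 as the least significant bit.

```c
#include <stdint.h>

/* Hypothetical scalar model of the masked operation.
   Assumption: bit i of k1 (bit 0 = least significant) selects element i. */
static void mask_mulhi_epu32_model(uint32_t dst[16],
                                   const uint32_t v1_old[16],
                                   uint16_t k1,
                                   const uint32_t v2[16],
                                   const uint32_t v3[16])
{
    for (int i = 0; i < 16; ++i) {
        if ((k1 >> i) & 1u) {
            /* Mask bit set: store the high half of the 64-bit product. */
            uint64_t product = (uint64_t)v2[i] * (uint64_t)v3[i];
            dst[i] = (uint32_t)(product >> 32);
        } else {
            /* Mask bit clear: keep the old destination value. */
            dst[i] = v1_old[i];
        }
    }
}
```

With k1 equal to 0x0005 under this assumption, only elements 0 and 2 are recomputed; every other element of the result is copied unchanged from v1_old.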

Returns

Returns the high 32 bits of each product in a uint32 result vector.