Intel® C++ Compiler 16.0 User and Reference Guide
Loads the high 64-byte-aligned portion of an unaligned doubleword stream, unpacks the mask-enabled elements that fall in that portion, and stores those elements in an int32 vector. The corresponding instruction is VLOADUNPACKHD. This intrinsic applies only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Without Mask: extern __m512i __cdecl _mm512_extloadunpackhi_epi32(__m512i v1_old, void const* mt, _MM_UPCONV_EPI32_ENUM conv, int hint);
With Mask: extern __m512i __cdecl _mm512_mask_extloadunpackhi_epi32(__m512i v1_old, __mmask16 k1, void const* mt, _MM_UPCONV_EPI32_ENUM conv, int hint);
The high 64-byte-aligned portion of the byte/word/doubleword stream starting at the element-aligned address (mt − 64) is loaded, converted, and expanded into the writemask-enabled elements of the resulting doubleword vector, whose initial values are copied from the v1_old vector. The number of set bits in the writemask determines the length of the converted doubleword stream: each doubleword in the stream maps to exactly one enabled doubleword element of the resulting vector, and masked-off elements of the resulting vector are skipped.
This function only transfers those converted doublewords (if any) in the stream that occur at or after the first 64-byte-aligned address following (mt − 64) (that is, in the high cache line of the memory stream for the current implementation). Elements in the resulting vector that do not map to those stream doublewords are left unchanged (taken from v1_old).
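The behavior described above can be illustrated with a scalar model in plain C. This is only a sketch under stated assumptions, not the compiler's or hardware's implementation: the function name model_loadunpackhi_epi32 is hypothetical, it assumes no up-conversion (as with _MM_UPCONV_EPI32_NONE, so every stream element is 4 bytes), and it ignores the hint argument.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the "high" unpack-load for 4-byte elements
 * with no up-conversion. dst plays the role of v1_old on entry and of the
 * returned vector on exit; mt has the same meaning as in the intrinsic,
 * so the stream is taken to start at (mt - 64). */
void model_loadunpackhi_epi32(int32_t dst[16], uint16_t k1, const void *mt)
{
    const unsigned char *stream = (const unsigned char *)mt - 64;
    size_t consumed = 0;  /* stream elements mapped to enabled lanes so far */
    int in_high_line = 0; /* set once the stream reaches a 64-byte boundary */

    /* The description requires an element-aligned address. */
    assert(((uintptr_t)mt % sizeof(int32_t)) == 0);

    for (int n = 0; n < 16; n++) {
        if (!((k1 >> n) & 1u))
            continue; /* masked-off lane: keep dst[n], consume no stream element */

        /* Only elements located in the high cache line are transferred. */
        if (in_high_line)
            memcpy(&dst[n], stream + 4 * consumed, sizeof(int32_t));

        consumed++;
        if (((uintptr_t)(stream + 4 * consumed) % 64) == 0)
            in_high_line = 1;
    }
}
```

In this model, a stream that begins 48 bytes into a cache line with all 16 lanes enabled leaves lanes 0 to 3 unchanged (their data lies in the low cache line) and fills lanes 4 to 15 from the high cache line. To assemble a complete unaligned vector on the hardware, the high variant is typically paired with the corresponding low variant (_mm512_extloadunpacklo_epi32), with the high call receiving the stream address plus 64 bytes.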
Returns the result of the load operation.