Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_permutevar_epi32 / _mm512_mask_permutevar_epi32

Permutes the 32-bit elements of an int32 vector. The corresponding instruction is VPERMD. These intrinsics apply only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern __m512i __cdecl _mm512_permutevar_epi32(__m512i v2, __m512i v3);

With Mask

extern __m512i __cdecl _mm512_mask_permutevar_epi32(__m512i v1_old, __mmask16 k1, __m512i v2, __m512i v3);

Parameters

v1_old Source vector that holds the old values of the destination vector. For each zero bit in k1, the result takes the corresponding element from v1_old.
k1 A writemask. Only the elements whose corresponding bit in k1 is set to one are computed and stored in the result. The elements whose corresponding bit in k1 is zero are copied from the corresponding elements of v1_old.
v2 An int32 vector containing the indices for the permutation.
v3 The source int32 vector whose elements are permuted.

Description

Permutes the 32-bit elements of the int32 vector v3 according to the indices in the int32 vector v2. Element i of the result is element j of v3, where j is element i of v2; that is, result[i] = v3[v2[i]].
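The indexing rule above can be illustrated with a scalar model in plain C. This is a minimal sketch, not the compiler's implementation: the function name permutevar_epi32_model and the array-based interface are hypothetical, and the masking of each index to its low four bits is an assumption that this page does not state.

```c
#include <stdint.h>

#define LANES 16  /* a 512-bit vector holds sixteen 32-bit elements */

/* Hypothetical scalar model of the unmasked permutation:
   result[i] = v3[v2[i]] for every element i.
   The result array must not overlap v3. */
static void permutevar_epi32_model(int32_t result[LANES],
                                   const int32_t v2[LANES],
                                   const int32_t v3[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        /* Assumption: only the low four bits of each index select an element. */
        uint32_t j = (uint32_t)v2[i] & 0xF;
        result[i] = v3[j];
    }
}
```

For example, with v2 = {15, 14, ..., 1, 0} the model reverses the order of the elements of v3.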

For the masked variant, each element of the result whose corresponding bit in the writemask k1 is set receives the permuted value. The remaining elements of the result are copied from the corresponding elements of v1_old.

The non-masked variant of the intrinsic is equivalent to the masked variant with a full mask (k1 = 0xffff), in which case no element is taken from v1_old.
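The writemask behavior can be sketched with a scalar model in plain C. This is an illustrative sketch under stated assumptions, not the compiler's implementation: the function name mask_permutevar_epi32_model and the array-based interface are hypothetical, bit i of k1 is assumed to control element i, and each index is assumed to use only its low four bits.

```c
#include <stdint.h>

/* Hypothetical scalar model of the masked permutation on sixteen
   32-bit elements. The result array must not overlap v3 or v1_old. */
static void mask_permutevar_epi32_model(int32_t result[16],
                                        const int32_t v1_old[16],
                                        uint16_t k1,
                                        const int32_t v2[16],
                                        const int32_t v3[16])
{
    for (int i = 0; i < 16; ++i) {
        if ((k1 >> i) & 1u) {
            /* Mask bit set: store the permuted element.
               Assumption: only the low four bits of the index are used. */
            result[i] = v3[(uint32_t)v2[i] & 0xF];
        } else {
            /* Mask bit clear: keep the old destination value. */
            result[i] = v1_old[i];
        }
    }
}
```

Calling the model with k1 = 0xffff never reads v1_old, which matches the stated equivalence with the non-masked variant.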

Returns

Returns the result of the permutation.