Intel® C++ Compiler 16.0 User and Reference Guide
Performs a swizzle transformation on a 32-bit integer vector. There is no corresponding instruction for this intrinsic. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Without Mask: extern __m512i __cdecl _mm512_swizzle_epi32(__m512i v, _MM_SWIZZLE_ENUM s);
With Mask: extern __m512i __cdecl _mm512_mask_swizzle_epi32(__m512i v1_old, __mmask16 k1, __m512i v, _MM_SWIZZLE_ENUM s);
Performs a swizzle transformation. It permutes the elements within each 4-element lane of a vector of 32-bit integers according to the swizzle parameter, s. The swizzle parameter specifies the order of elements in each lane, written from the highest element of the lane to the lowest. The DCBA swizzle parameter specifies no reordering. For example, if the elements of a source vector are ponm lkji hgfe dcba, such that v[0]=a, v[1]=b, v[2]=c, …, v[15]=p, then the result of y=_mm512_swizzle_epi32(v, _MM_SWIZ_REG_BADC) is a vector in which every lane is permuted the same way: nmpo jilk fehg badc (y[0]=c, y[1]=d, y[2]=a, …, y[15]=n).
If the swizzle parameter is, for example, _MM_SWIZ_REG_BBBB, then the resulting vector for the input vector from the previous example is nnnn jjjj ffff bbbb: element B of each lane is broadcast to all four positions of that lane.
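The lane-wise permutation described above can be sketched as a scalar reference model in plain C. This is only an illustration, not an Intel API: the function name swizzle_epi32_model and the use of a four-letter pattern string (such as "BADC") in place of the _MM_SWIZZLE_ENUM value are assumptions made for this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical scalar model of a 4-element-lane swizzle (not an Intel API).
 * `pattern` is four letters from 'A'..'D', written from the highest element
 * of a lane to the lowest, e.g. "DCBA" (no reordering) or "BADC".
 * `dst` and `src` must not overlap.
 */
static void swizzle_epi32_model(int32_t dst[16], const int32_t src[16],
                                const char *pattern)
{
    for (int lane = 0; lane < 4; ++lane) {
        for (int j = 0; j < 4; ++j) {
            /* The letter for result element j is at position 3 - j. */
            int from = pattern[3 - j] - 'A';
            dst[4 * lane + j] = src[4 * lane + from];
        }
    }
}
```

With this model, the pattern "BADC" gives dst[0]=src[2], dst[1]=src[3], dst[2]=src[0], dst[3]=src[1] in every lane, which matches the badc result in the example above. The pattern "BBBB" copies src[1], src[5], src[9], and src[13] across their respective lanes.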
The masked variant has two additional arguments: v1_old and k1. Only those elements in the source vector v with the corresponding bit set in vector mask k1 are used in the computation. Elements of v with the corresponding bit clear in k1 are not used; instead, the corresponding element from v1_old is copied to the resulting vector.
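The masking behavior can be sketched the same way with a scalar model in plain C. This is an illustration under stated assumptions rather than the intrinsic itself: the function name mask_swizzle_epi32_model and the four-letter pattern string standing in for the _MM_SWIZZLE_ENUM value are hypothetical, and the sketch assumes that bit i of k1 governs result element i.

```c
#include <stdint.h>

/*
 * Hypothetical scalar model of the masked swizzle (not an Intel API).
 * Assumption: bit i of k1 controls result element i.
 * `pattern` is four letters from 'A'..'D', highest lane element first.
 * `dst` must not overlap `v`.
 */
static void mask_swizzle_epi32_model(int32_t dst[16],
                                     const int32_t v1_old[16], uint16_t k1,
                                     const int32_t v[16], const char *pattern)
{
    for (int i = 0; i < 16; ++i) {
        int lane_base = i & ~3;                 /* first element of the lane */
        int from = pattern[3 - (i & 3)] - 'A';  /* source slot in the lane   */
        if ((k1 >> i) & 1)
            dst[i] = v[lane_base + from];       /* bit set: swizzled value   */
        else
            dst[i] = v1_old[i];                 /* bit clear: keep old value */
    }
}
```

For instance, with k1 = 0x0005 and the pattern "BADC", only result elements 0 and 2 receive swizzled values (v[2] and v[0] respectively); every other element is taken unchanged from v1_old.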
_mm512_add_epi32(v2, _mm512_swizzle_epi32(v3, _MM_SWIZ_REG_DACB));
Returns the result of the swizzle operation on an int32 vector.