Intel® C++ Compiler 16.0 User and Reference Guide
Performs a swizzle transformation on a float64 vector (a 512-bit vector of eight 64-bit floating-point elements). There is no corresponding instruction for this intrinsic. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Without Mask extern __m512d __cdecl _mm512_swizzle_pd(__m512d v, _MM_SWIZZLE_ENUM s);
With Mask extern __m512d __cdecl _mm512_mask_swizzle_pd(__m512d v1_old, __mmask8 k1, __m512d v, _MM_SWIZZLE_ENUM s);
Performs a swizzle transformation. It permutes the elements within each 4-element lane of a float64 vector according to the swizzle parameter, s. The swizzle parameter specifies the order of elements in each lane. The DCBA swizzle parameter specifies no reordering. For example, if the elements of a source vector are hgfe dcba, such that v[0]=a, v[1]=b, v[2]=c, …, v[7]=h, then the result of y=_mm512_swizzle_pd(v, _MM_SWIZ_REG_BADC) is a vector with all elements permuted, as follows: fehg badc (y[0]=c, y[1]=d, y[2]=a, …, y[7]=f).
If the swizzle parameter is, for example, _MM_SWIZ_REG_BBBB, then the resulting vector for the input vector from the previous example is ffff bbbb.
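The lane-wise permutation described above can be checked on any host with a small scalar model. The following is only a sketch: swizzle_pd_model and its string-based pattern argument are hypothetical names introduced for illustration and are not part of the compiler's API, and the code does not call the intrinsic itself.

```c
#include <assert.h>

/* Hypothetical scalar model of the swizzle, not the intrinsic itself.
 * pattern is a 4-letter string such as "BADC", written from the highest
 * element of a lane down to the lowest; 'A' names element 0 of the lane.
 * dst and src must not overlap. */
static void swizzle_pd_model(double dst[8], const double src[8],
                             const char *pattern)
{
    for (int lane = 0; lane < 2; ++lane) {      /* two 4-element lanes */
        for (int i = 0; i < 4; ++i) {
            int from = pattern[3 - i] - 'A';    /* source index within the lane */
            assert(from >= 0 && from < 4);      /* only letters A..D are valid */
            dst[lane * 4 + i] = src[lane * 4 + from];
        }
    }
}
```

With v[0]=a, …, v[7]=h, the pattern "BADC" gives y[0]=c, y[1]=d, y[2]=a, y[3]=b in the low lane and the same reordering of e, f, g, h in the high lane. The pattern "BBBB" broadcasts b across the low lane and f across the high lane.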
The masked variant has two additional arguments: v1_old and k1. Only those elements of the source vector v whose corresponding bit is set in the vector mask k1 are used in the computation. For each element whose corresponding bit in k1 is clear, the element of v is not used; instead, the corresponding element of v1_old is copied to the resulting vector.
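The masking rule can be sketched the same way in scalar code. This is an illustrative model under stated assumptions, not the intrinsic: mask_swizzle_pd_model and its string pattern argument are hypothetical, and the sketch assumes that bit i of k1 controls element i of the result.

```c
#include <assert.h>

/* Hypothetical scalar model of the masked swizzle, not the intrinsic itself.
 * Assumption: bit i of k1 corresponds to element i of the result.
 * pattern is a 4-letter string such as "BADC", written from the highest
 * element of a lane down to the lowest; 'A' names element 0 of the lane.
 * dst must not overlap v or v1_old. */
static void mask_swizzle_pd_model(double dst[8], const double v1_old[8],
                                  unsigned char k1, const double v[8],
                                  const char *pattern)
{
    for (int i = 0; i < 8; ++i) {
        int lane_base = (i / 4) * 4;              /* first element of this lane */
        int from = pattern[3 - (i % 4)] - 'A';    /* source index within the lane */
        assert(from >= 0 && from < 4);            /* only letters A..D are valid */
        if ((k1 >> i) & 1u)
            dst[i] = v[lane_base + from];         /* bit set: swizzled element of v */
        else
            dst[i] = v1_old[i];                   /* bit clear: copy from v1_old */
    }
}
```

With k1 = 0x0F, for example, the low lane receives the swizzled elements of v and the high lane is copied unchanged from v1_old.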
_mm512_add_pd (v2, _mm512_swizzle_pd(v3, _MM_SWIZ_REG_DCBA));
Returns the result of the swizzle operation on a float64 vector.