Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_swizzle_epi32/ _mm512_mask_swizzle_epi32

Performs a swizzle transformation on a 32-bit integer vector. There is no corresponding instruction for this intrinsic. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern __m512i __cdecl _mm512_swizzle_epi32(__m512i v, _MM_SWIZZLE_ENUM s);

With Mask

extern __m512i __cdecl _mm512_mask_swizzle_epi32(__m512i v1_old, __mmask16 k1, __m512i v, _MM_SWIZZLE_ENUM s);

Parameters

v

Source vector for the swizzle operation

s

Swizzle parameter; may take one of the following values:

  • _MM_SWIZ_REG_NONE - hgfe dcba - No operation
  • _MM_SWIZ_REG_DCBA - hgfe dcba - No operation
  • _MM_SWIZ_REG_CDAB - ghef cdab - Swap pairs
  • _MM_SWIZ_REG_BADC - fehg badc - Swap with two-away
  • _MM_SWIZ_REG_AAAA - eeee aaaa - Broadcast a element
  • _MM_SWIZ_REG_BBBB - ffff bbbb - Broadcast b element
  • _MM_SWIZ_REG_CCCC - gggg cccc - Broadcast c element
  • _MM_SWIZ_REG_DDDD - hhhh dddd - Broadcast d element
  • _MM_SWIZ_REG_DACB - hegf dacb - Cross-product

v1_old

Source vector that retains old values of the destination vector; the resulting vector gets corresponding elements from v1_old for zero mask bits

k1

Writemask; only those elements of the source vector v whose corresponding bit in the k1 mask is set to '1' are computed and stored in the result

Description

Performs a swizzle transformation: permutes the elements within each 4-element lane of a vector of 32-bit integers according to the swizzle parameter, s. The swizzle parameter specifies the order of elements in each lane; the DCBA swizzle parameter specifies no reordering. For example, if the elements of the source vector are ponm lkji hgfe dcba (that is, v[0]=a, v[1]=b, v[2]=c, …, v[15]=p), then the result of y=_mm512_swizzle_epi32(v, _MM_SWIZ_REG_BADC) is a vector with the elements of every lane permuted, as follows: nmpo jilk fehg badc (y[0]=c, y[1]=d, y[2]=a, …, y[15]=n).

If the swizzle parameter is, for example, _MM_SWIZ_REG_BBBB, then the resulting vector for the input vector from the previous example is nnnn jjjj ffff bbbb.
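The per-lane permutation described above can be modeled in portable scalar C, which may help when checking expected results on a host without Intel® MIC Architecture support. This is only a sketch: swiz_model_t, swiz_src, and swizzle_epi32_model are hypothetical names, not part of the compiler headers, and the SWIZ_DACB row assumes the pattern follows the parameter name (d a c b, written from the highest element of the lane to the lowest).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the _MM_SWIZ_REG_* values (not the real enum). */
typedef enum {
    SWIZ_DCBA, SWIZ_CDAB, SWIZ_BADC, SWIZ_AAAA,
    SWIZ_BBBB, SWIZ_CCCC, SWIZ_DDDD, SWIZ_DACB,
    SWIZ_COUNT
} swiz_model_t;

/* For each pattern: the source index within a lane (0 = a ... 3 = d)
   that feeds result positions 0, 1, 2, 3 of the same lane. */
static const int swiz_src[SWIZ_COUNT][4] = {
    {0, 1, 2, 3},  /* dcba: no reordering       */
    {1, 0, 3, 2},  /* cdab: swap pairs          */
    {2, 3, 0, 1},  /* badc: swap with two-away  */
    {0, 0, 0, 0},  /* aaaa: broadcast a         */
    {1, 1, 1, 1},  /* bbbb: broadcast b         */
    {2, 2, 2, 2},  /* cccc: broadcast c         */
    {3, 3, 3, 3},  /* dddd: broadcast d         */
    {1, 2, 0, 3},  /* dacb: assumed from name   */
};

/* Scalar model: apply the same permutation to each of the four lanes. */
void swizzle_epi32_model(int32_t dst[16], const int32_t src[16],
                         swiz_model_t s)
{
    int32_t tmp[16];  /* temporary so that dst may alias src */

    assert((int)s >= 0 && (int)s < SWIZ_COUNT);
    for (int i = 0; i < 16; ++i) {
        int lane_base = i & ~3;  /* first element of this lane */
        tmp[i] = src[lane_base + swiz_src[s][i & 3]];
    }
    for (int i = 0; i < 16; ++i)
        dst[i] = tmp[i];
}
```

Filling src with the values a through p in element order and calling swizzle_epi32_model(y, src, SWIZ_BADC) reproduces the ordering from the example in the text: y[0]=c, y[1]=d, y[2]=a, y[15]=n.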

The masked variant has two additional arguments: v1_old and k1. Only those elements of the source vector v whose corresponding bit is set in the mask k1 are used in the computation. For elements whose corresponding bit in k1 is clear, the corresponding element of v1_old is copied to the resulting vector instead.
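The writemask behavior can be sketched the same way, as a per-element choice between the swizzled value and the old value. The function name mask_swizzle_epi32_model and its lane_src parameter are hypothetical and used only for illustration; lane_src lists the source index within a lane (0 = a ... 3 = d) for result positions 0 to 3, so BADC would be passed as {2, 3, 0, 1}. Bit i of k1 is taken to control element i.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the masked form: element i of the result is the
   swizzled element if bit i of k1 is set, otherwise v1_old[i]. */
void mask_swizzle_epi32_model(int32_t dst[16], const int32_t v1_old[16],
                              uint16_t k1, const int32_t v[16],
                              const int lane_src[4])
{
    int32_t tmp[16];  /* temporary so that dst may alias an input */

    for (int i = 0; i < 16; ++i) {
        int idx = lane_src[i & 3];  /* source position inside the lane */
        assert(idx >= 0 && idx < 4);

        int32_t swizzled = v[(i & ~3) + idx];
        tmp[i] = ((k1 >> i) & 1u) ? swizzled : v1_old[i];
    }
    for (int i = 0; i < 16; ++i)
        dst[i] = tmp[i];
}
```

With k1 = 0x000F, for instance, only the lowest lane receives swizzled values; the remaining twelve elements are copied from v1_old unchanged.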

Typical Usage

_mm512_add_epi32(v2, _mm512_swizzle_epi32(v3, _MM_SWIZ_REG_DACB));

Returns

Returns the result of the swizzle operation on an int32 vector.