Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_i32[ext]scatter_ps/ _mm512_mask_i32[ext]scatter_ps

Scatters a float32 vector to memory using int32 indices. The corresponding instruction is VSCATTERDPS. These intrinsics apply only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern void __cdecl _mm512_i32extscatter_ps(void* mv, __m512i index, __m512 v1, _MM_DOWNCONV_PS_ENUM conv, int scale, int hint);

extern void __cdecl _mm512_i32scatter_ps(void* mv, __m512i index, __m512 v1, int scale);

With Mask

extern void __cdecl _mm512_mask_i32extscatter_ps(void* mv, __mmask16 k1, __m512i index, __m512 v1, _MM_DOWNCONV_PS_ENUM conv, int scale, int hint);

extern void __cdecl _mm512_mask_i32scatter_ps(void* mv, __mmask16 k1, __m512i index, __m512 v1, int scale);

Parameters

v1

Source float32 vector whose elements are stored to memory

k1

Writemask; only those elements of v1 whose corresponding bit in k1 is set to '1' are stored to memory; memory locations corresponding to zero bits in k1 are not written

index

int32 vector containing the 16 element indices; each index is multiplied by scale and added to the base address mv

mv

Pointer to base address in memory

conv

Type of downconversion, which can be one of the following:

  • _MM_DOWNCONV_PS_NONE - no conversion
  • _MM_DOWNCONV_PS_FLOAT16 - float32 => float16
  • _MM_DOWNCONV_PS_UINT8 - float32 => uint8
  • _MM_DOWNCONV_PS_SINT8 - float32 => sint8
  • _MM_DOWNCONV_PS_UINT16 - float32 => uint16
  • _MM_DOWNCONV_PS_SINT16 - float32 => sint16
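The conversions listed above change how many bytes each stored element occupies, which matters when choosing index and scale values. The following is a minimal sketch, assuming each converted element occupies the natural size of its target type. The enum ref_downconv_ps and the function ref_stored_element_size are hypothetical names used for illustration only; the real _MM_DOWNCONV_PS_ENUM is defined by the compiler's intrinsics header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical local mirror of the conv values listed above.
   Not the real _MM_DOWNCONV_PS_ENUM; illustration only. */
typedef enum {
    REF_DOWNCONV_PS_NONE,
    REF_DOWNCONV_PS_FLOAT16,
    REF_DOWNCONV_PS_UINT8,
    REF_DOWNCONV_PS_SINT8,
    REF_DOWNCONV_PS_UINT16,
    REF_DOWNCONV_PS_SINT16
} ref_downconv_ps;

/* Bytes occupied in memory by one element after down-conversion,
   assuming the natural size of the target type. */
static size_t ref_stored_element_size(ref_downconv_ps conv)
{
    switch (conv) {
    case REF_DOWNCONV_PS_NONE:    return 4; /* float32 kept as is */
    case REF_DOWNCONV_PS_FLOAT16: return 2; /* float32 => float16 */
    case REF_DOWNCONV_PS_UINT8:   return 1; /* float32 => uint8   */
    case REF_DOWNCONV_PS_SINT8:   return 1; /* float32 => sint8   */
    case REF_DOWNCONV_PS_UINT16:  return 2; /* float32 => uint16  */
    case REF_DOWNCONV_PS_SINT16:  return 2; /* float32 => sint16  */
    }
    return 0; /* unknown value */
}
```

Under this assumption, a densely packed destination of 16 uint8 elements spans 16 bytes, while the same 16 elements stored without conversion span 64 bytes.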

scale

Scaling factor for calculating the address of each element. Takes one of the following values: 1, 2, 4, or 8. The byte address of the i-th element in memory is calculated as: mv + index[i] * scale
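The address calculation can be modeled in scalar C. The following is a sketch of the semantics for the case without conversion, not the compiler's implementation. The names ref_element_address and ref_i32scatter_ps are hypothetical helpers, and plain arrays stand in for the __m512i and __m512 operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an Intel API): byte address of one element,
   computed as mv + idx * scale. */
static void *ref_element_address(void *mv, int32_t idx, int scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return (char *)mv + (ptrdiff_t)idx * scale;
}

/* Hypothetical scalar model of the unmasked scatter without conversion:
   element i of v1 is written to mv + index[i] * scale. */
static void ref_i32scatter_ps(void *mv, const int32_t index[16],
                              const float v1[16], int scale)
{
    for (int i = 0; i < 16; ++i) {
        memcpy(ref_element_address(mv, index[i], scale),
               &v1[i], sizeof v1[i]);
    }
}
```

With scale set to 4, the indices act as float array subscripts: an index of 3 addresses the fourth float after mv.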

hint

Hint that indicates to the processor that the data is non-temporal. Takes the value 0 or 1, where:

  • _MM_HINT_NONE = 0
  • _MM_HINT_NT = 1 (Store is non-temporal)

Description

Down-converts all 16 elements of float32 vector v1 as specified by conv and stores them to the memory locations addressed by base address mv, index vector index, and scale scale. In the masked variants, only the elements selected by k1 are stored.

The non-masked variant of the intrinsic is equivalent to the masked variant with full mask (k1=0xffff).
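The relationship between the masked and non-masked variants can be sketched in scalar C. This is an illustrative model for the case without conversion, not the compiler's implementation. The names ref_mask_i32scatter_ps and ref_full_mask_i32scatter_ps are hypothetical, a uint16_t stands in for __mmask16, and plain arrays stand in for the vector operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the masked scatter without conversion:
   element i is stored only if bit i of k1 is set. */
static void ref_mask_i32scatter_ps(void *mv, uint16_t k1,
                                   const int32_t index[16],
                                   const float v1[16], int scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    for (int i = 0; i < 16; ++i) {
        if ((k1 >> i) & 1u) {
            char *addr = (char *)mv + (ptrdiff_t)index[i] * scale;
            memcpy(addr, &v1[i], sizeof v1[i]);
        }
    }
}

/* The non-masked form modeled as the masked form with k1 = 0xffff. */
static void ref_full_mask_i32scatter_ps(void *mv, const int32_t index[16],
                                        const float v1[16], int scale)
{
    ref_mask_i32scatter_ps(mv, 0xffff, index, v1, scale);
}
```

For example, a mask of 0x0005 stores only elements 0 and 2; memory addressed by the other 14 indices is left unchanged.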

You can use the simplified version of this intrinsic, without ext in the name, if neither conversion nor a non-temporal hint is required.

Returns

None.