Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_prefetch_i32[ext]scatter_ps/ _mm512_mask_prefetch_i32[ext]scatter_ps

Prefetches the memory locations targeted by a scatter of a float32 vector with int32 indices. The corresponding instructions are VSCATTERPF0DPS and VSCATTERPF1DPS. These intrinsics apply only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern void __cdecl _mm512_prefetch_i32extscatter_ps(void* mv, __m512i index, _MM_UPCONV_PS_ENUM conv, int scale, int pf_hint);

extern void __cdecl _mm512_prefetch_i32scatter_ps(void* mv, __m512i index, int scale, int pf_hint);

With Mask

extern void __cdecl _mm512_mask_prefetch_i32extscatter_ps(void* mv, __mmask16 k1, __m512i index, _MM_UPCONV_PS_ENUM conv, int scale, int pf_hint);

extern void __cdecl _mm512_mask_prefetch_i32scatter_ps(void* mv, __mmask16 k1, __m512i index, int scale, int pf_hint);

Parameters

k1

Writemask. Only those memory elements whose corresponding bit in k1 is set to '1' are prefetched.

index

int32 vector containing the indices of the elements, relative to the base address mv.

mv

Pointer to the base address in memory.

conv

Type of upconversion, which can be one of the following:

  • _MM_UPCONV_PS_NONE - no conversion
  • _MM_UPCONV_PS_FLOAT16 - float16 => float32
  • _MM_UPCONV_PS_UINT8 - uint8 => float32
  • _MM_UPCONV_PS_SINT8 - sint8 => float32
  • _MM_UPCONV_PS_UINT16 - uint16 => float32
  • _MM_UPCONV_PS_SINT16 - sint16 => float32

scale

Scaling factor for calculating the address of each element. Takes one of the following values: 1, 2, 4, or 8. The address of the i-th element in memory is calculated as: mv + index[i] * scale.
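The address formula above can be sketched as a scalar model in plain C. This is only an illustration of the arithmetic and does not call the intrinsic; the helper name scatter_pf_element_address is hypothetical, and the sketch assumes the indices are treated as signed 32-bit values and that the offset is a byte offset from mv.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the intrinsics API): scalar model of the
   address of the i-th element, computed as mv + index[i] * scale. */
static const void *scatter_pf_element_address(const void *mv,
                                              const int32_t index[16],
                                              int i, int scale)
{
    /* The documented scale values are 1, 2, 4, and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    assert(i >= 0 && i < 16);

    /* Widen to 64 bits before multiplying so the byte offset cannot
       overflow a 32-bit integer. */
    int64_t offset = (int64_t)index[i] * scale;
    return (const char *)mv + offset;
}
```

With a float array as the base and scale set to 4 (the size of a float32 element), an index value of n addresses the n-th array element in this model.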

pf_hint

Prefetch hint. Takes one of the following values:

  • _MM_HINT_T0 – prefetch into L1 with T0 hint
  • _MM_HINT_T1 – prefetch into L2 with T1 hint
  • _MM_HINT_T2 – prefetch into L2 with T1 and non-temporal hints
  • _MM_HINT_NTA – prefetch into L1 with T0 and non-temporal hints

Description

Prefetches a set of 16 memory locations, addressed by the base address mv and the int32 index vector index scaled by scale, from memory into the L1 or L2 level of cache, depending on the pf_hint parameter, with a request for exclusive ownership.

The non-masked variant of the intrinsic is equivalent to the masked variant with a full mask (k1 = 0xffff).
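The effect of the writemask can be illustrated with a scalar model in plain C that lists which addresses would be selected for prefetching. This sketch does not issue any prefetch and is not the compiler's implementation; the function name scatter_pf_selected_addresses and the out array are hypothetical, and bit i of k1 is assumed to correspond to element i of index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model (not part of the intrinsics API): writes the
   address of every element selected by k1 into out and returns how many
   elements were selected. */
static size_t scatter_pf_selected_addresses(const void *mv, uint16_t k1,
                                            const int32_t index[16],
                                            int scale, const void *out[16])
{
    /* The documented scale values are 1, 2, 4, and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    size_t count = 0;
    for (int i = 0; i < 16; ++i) {
        /* Assumption: bit i of the mask controls element i. */
        if ((k1 >> i) & 1u) {
            out[count++] = (const char *)mv + (int64_t)index[i] * scale;
        }
    }
    return count;
}
```

In this model, passing k1 = 0xffff selects all 16 elements, which mirrors the behavior of the non-masked variant, and passing k1 = 0 selects none.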

You can use the simplified versions of this intrinsic, without ext in the name, if no up-conversion is required.

Returns

None.