Intel® C++ Compiler 16.0 User and Reference Guide

_mm_i64gather_ps, _mm256_i64gather_ps

Gathers two or four packed single-precision floating-point values from memory addressed by the given base address, qword (64-bit) indices, and scale. The corresponding Intel® AVX2 instruction is VGATHERQPS.

Syntax

extern __m128 _mm_i64gather_ps(float const * base, __m128i vindex, const int scale);

extern __m128 _mm256_i64gather_ps(float const * base, __m256i vindex, const int scale);

Arguments

base

the base address used to reference the loaded FP elements.

vindex

the vector of qword indices used to reference the loaded FP elements.

scale

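32-bit scale factor by which each index is multiplied when addressing the loaded FP elements. It must be a compile-time constant equal to 1, 2, 4, or 8.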

Description

The intrinsics load packed single-precision floating-point values from memory using qword indices and update the destination operand: _mm_i64gather_ps() loads two values and _mm256_i64gather_ps() loads four. The intrinsic _mm_i64gather_ps() also sets the upper 64 bits of the result to 0.

Below is the pseudo-code for the intrinsics:

_mm_i64gather_ps():

result[31:0] = mem[base+vindex[63:0]*scale];
result[63:32] = mem[base+vindex[127:64]*scale];
result[127:64] = 0;

_mm256_i64gather_ps():

result[31:0] = mem[base+vindex[63:0]*scale];
result[63:32] = mem[base+vindex[127:64]*scale];
result[95:64] = mem[base+vindex[191:128]*scale];
result[127:96] = mem[base+vindex[255:192]*scale];
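
The pseudo-code above can be modeled in portable C. The following is a minimal sketch, not Intel's implementation: gather_i64_ps_model is a hypothetical helper name, and the sketch assumes that index * scale is a byte offset from base, as the address arithmetic in the pseudo-code indicates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of any Intel header): scalar model of the
   pseudo-code. count is 2 for the _mm_i64gather_ps() form and 4 for the
   _mm256_i64gather_ps() form. */
static void gather_i64_ps_model(float result[4], const float *base,
                                const int64_t *vindex, int count, int scale)
{
    assert(count == 2 || count == 4);
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    /* Address arithmetic is done in bytes, so view base as a byte pointer. */
    const unsigned char *bytes = (const unsigned char *)base;

    /* Lanes that are not gathered stay zero (result[127:64] = 0 when count is 2). */
    memset(result, 0, 4 * sizeof(float));

    for (int i = 0; i < count; ++i) {
        /* result[i] = mem[base + vindex[i] * scale], read as one 32-bit float */
        memcpy(&result[i], bytes + vindex[i] * (int64_t)scale, sizeof(float));
    }
}
```

With scale set to 4 (the size of a float), each index selects a whole array element: indices {7, 0, 3, 5} return elements 7, 0, 3, and 5 of the array for count 4, and elements 7 and 0 followed by two zeros for count 2.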

Returns

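A 128-bit vector with unconditionally gathered single-precision FP values: two values (with the upper 64 bits set to 0) for _mm_i64gather_ps(), and four values for _mm256_i64gather_ps().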
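
A minimal usage sketch of both intrinsics follows. The wrapper names gather2_floats and gather4_floats are hypothetical, and the sketch assumes the code is compiled with Intel® AVX2 enabled; when the __AVX2__ macro is not defined it falls back to an equivalent scalar loop so that it still builds.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical wrapper: out[0..1] = table[idx[0..1]], out[2..3] = 0. */
static void gather2_floats(float out[4], const float *table, const int64_t idx[2])
{
    assert(out && table && idx);
#if defined(__AVX2__)
    /* Two qword indices fit in one 128-bit vector; unaligned load is used. */
    __m128i vindex = _mm_loadu_si128((const __m128i *)idx);
    /* scale must be a compile-time constant; 4 is the size of a float. */
    __m128 v = _mm_i64gather_ps(table, vindex, 4);
    _mm_storeu_ps(out, v);              /* upper 64 bits of v are zero */
#else
    /* Scalar fallback used when AVX2 is not enabled at compile time. */
    out[0] = table[idx[0]];
    out[1] = table[idx[1]];
    out[2] = 0.0f;
    out[3] = 0.0f;
#endif
}

/* Hypothetical wrapper: out[0..3] = table[idx[0..3]]. */
static void gather4_floats(float out[4], const float *table, const int64_t idx[4])
{
    assert(out && table && idx);
#if defined(__AVX2__)
    /* Four qword indices need a 256-bit vector; the result is still 128 bits. */
    __m256i vindex = _mm256_loadu_si256((const __m256i *)idx);
    __m128 v = _mm256_i64gather_ps(table, vindex, 4);
    _mm_storeu_ps(out, v);
#else
    /* Scalar fallback used when AVX2 is not enabled at compile time. */
    for (int i = 0; i < 4; ++i)
        out[i] = table[idx[i]];
#endif
}
```

Passing 4 as the scale lets the indices be used as ordinary array subscripts. A scale of 1 would instead treat each index as a byte offset from base.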