Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_fmadd233_ps/ _mm512_mask_fmadd233_ps/ _mm512_fmadd233_round_ps/ _mm512_mask_fmadd233_round_ps

Multiply and add float32 vectors. The corresponding instruction is VFMADD233PS. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern __m512 __cdecl _mm512_fmadd233_ps(__m512 v2, __m512 v3);

extern __m512 __cdecl _mm512_fmadd233_round_ps(__m512 v2, __m512 v3, int rc);

With Mask

extern __m512 __cdecl _mm512_mask_fmadd233_ps(__m512 v1_old, __mmask16 k1, __m512 v2, __m512 v3);

extern __m512 __cdecl _mm512_mask_fmadd233_round_ps(__m512 v1_old, __mmask16 k1, __m512 v2, __m512 v3, int rc);

Parameters

v2

float32 vector that is multiplied by certain elements of float32 vector v3

v3

float32 vector; certain of its elements are multiplied by float32 vector v2, and certain other elements are then added to that product

v1_old

Source vector that retains old values of the destination vector; the resulting vector gets corresponding elements from v1_old for zero mask bits

k1

Writemask; only those elements of the source vectors with corresponding bit set to '1' in the k1 mask are computed and stored in the result; elements in the result vector corresponding to zero bit in k1 are copied from corresponding elements of vector v1_old

rc

Rounding control value; this can be one of the following:

  • _MM_FROUND_TO_NEAREST_INT - rounds to nearest even
  • _MM_FROUND_TO_NEG_INF - rounds to negative infinity
  • _MM_FROUND_TO_POS_INF - rounds to positive infinity
  • _MM_FROUND_TO_ZERO - rounds to zero
  • _MM_FROUND_CUR_DIRECTION - rounds using default from MXCSR register

Description

Multiplies float32 vector v2 by certain elements of float32 vector v3, then adds certain elements of v3 to the product. The intermediate product is calculated to infinite precision and is not truncated or rounded; only the final sum is rounded, using the rounding mode given by rc in the _round_ variants.
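The description does not say which elements of v3 are selected. The scalar sketch below makes one reading explicit. It assumes the "4-element set" behavior described for the VFMADD233PS instruction: within each group of four elements, element 1 of v3 acts as a scale and element 0 as a bias. That rule is an assumption not stated on this page, and the function name fmadd233_ref is hypothetical. The sketch models the default-rounding case only.

```c
#include <stddef.h>

/* Hypothetical scalar model of the unmasked operation (not an Intel API).
   Assumption: in each group of four elements, v3[base + 1] is the scale
   and v3[base] is the bias, so dst[i] = v2[i] * scale + bias. */
void fmadd233_ref(float dst[16], const float v2[16], const float v3[16])
{
    for (size_t i = 0; i < 16; ++i) {
        size_t base = (i / 4) * 4;      /* first element of this group of four */
        double scale = v3[base + 1];
        double bias  = v3[base];
        /* A float*float product is exact in double. The sum is rounded to
           double and then to float, so this approximates, but is not
           guaranteed bit-identical to, a single-rounded fused result. */
        dst[i] = (float)((double)v2[i] * scale + bias);
    }
}
```

Under this assumption, a v3 whose first group starts with (1.0, 2.0) maps elements 0 to 3 of v2 to 2*v2[i] + 1, independently of the other three groups.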

The masked variants take two additional arguments: v1_old and k1. Only those elements whose corresponding bit is set in vector mask k1 are computed. The remaining elements of the resulting vector are filled with the corresponding elements from v1_old.
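The write-mask behavior can be sketched in scalar C as follows. The function name mask_fmadd233_ref is hypothetical, a plain uint16_t stands in for __mmask16, and the element-selection rule (element 1 of each group of four in v3 is the scale, element 0 the bias) is an assumption taken from the VFMADD233PS instruction description rather than from this page.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the masked operation (not an Intel API).
   Bit i of k1 selects whether element i is computed or copied from v1_old. */
void mask_fmadd233_ref(float dst[16], const float v1_old[16], uint16_t k1,
                       const float v2[16], const float v3[16])
{
    for (size_t i = 0; i < 16; ++i) {
        if ((k1 >> i) & 1u) {
            /* Assumed rule: scale = v3[base + 1], bias = v3[base]. */
            size_t base = (i / 4) * 4;
            dst[i] = (float)((double)v2[i] * (double)v3[base + 1]
                             + (double)v3[base]);
        } else {
            /* Zero mask bit: keep the old destination value. */
            dst[i] = v1_old[i];
        }
    }
}
```

With k1 equal to 0x0005, for example, only elements 0 and 2 are computed; the other fourteen elements come from v1_old unchanged.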

Returns

Returns the result of the multiplication-addition operation.
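As a usage sketch, the helper below applies a per-group scale and bias to 16 floats. Several things here are assumptions rather than statements from this page: the helper name scale_bias16 is hypothetical, the __MIC__ macro is assumed to be predefined when compiling for Intel MIC Architecture, the buffers are assumed to be 64-byte aligned for _mm512_load_ps and _mm512_store_ps, and the scalar fallback encodes the assumed rule that element 1 of each group of four in v3 is the scale and element 0 is the bias.

```c
#include <stddef.h>
#ifdef __MIC__
#include <immintrin.h>
#endif

/* Hypothetical helper: dst[i] = x[i] * scale + bias, with one (bias, scale)
   pair per group of four elements, stored at coeffs[4g] and coeffs[4g + 1].
   On the MIC path all three pointers are assumed to be 64-byte aligned. */
void scale_bias16(float *dst, const float *x, const float *coeffs)
{
#ifdef __MIC__
    __m512 v2 = _mm512_load_ps(x);
    __m512 v3 = _mm512_load_ps(coeffs);
    _mm512_store_ps(dst, _mm512_fmadd233_ps(v2, v3));
#else
    /* Portable fallback under the assumed element-selection rule. */
    for (size_t i = 0; i < 16; ++i) {
        size_t base = (i / 4) * 4;
        dst[i] = (float)((double)x[i] * (double)coeffs[base + 1]
                         + (double)coeffs[base]);
    }
#endif
}
```

The guard keeps the file compilable on other targets, where only the fallback loop is built.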