Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_addsets_round_ps/ _mm512_mask_addsets_round_ps

Adds float32 vectors with rounding and sets a mask to the sign of each result. The corresponding instruction is VADDSETSPS. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern __m512 __cdecl _mm512_addsets_round_ps(__m512 v2, __m512 v3, __mmask16* sign, int rc);

With Mask

extern __m512 __cdecl _mm512_mask_addsets_round_ps(__m512 v1_old, __mmask16 k1, __m512 v2, __m512 v3, __mmask16* sign, int rc);

Parameters

v2

float32 vector used as the first operand of the addition operation

v3

float32 vector used as the second operand of the addition operation

v1_old

Source vector that retains old values of the destination vector; the resulting vector gets corresponding elements from v1_old for zero mask bits

k1

Writemask; only those elements of the source vectors with corresponding bit set to '1' in the k1 mask are computed and stored in the result

sign

Pointer to the __mmask16 location where the sign of each result element is stored

rc

Rounding control values; these can be one of the following:

  • _MM_FROUND_TO_NEAREST_INT - rounds to nearest even
  • _MM_FROUND_TO_NEG_INF - rounds to negative infinity
  • _MM_FROUND_TO_POS_INF - rounds to positive infinity
  • _MM_FROUND_TO_ZERO - rounds to zero
  • _MM_FROUND_CUR_DIRECTION - rounds using default from MXCSR register

Description

Performs an element-by-element addition of float32 vector v2 and float32 vector v3. Intermediate values are rounded according to the rc value.

The sign of the sum is returned in sign.

The masked variant has two additional arguments: v1_old and k1. Those elements of v2 and v3 with the corresponding bit clear in vector mask k1 are not used in the computation. Instead, the corresponding element from v1_old is copied to the resulting vector.

Returns

Returns the result of the addition operation. The sign of the result is returned in sign.