Intel® C++ Compiler 16.0 User and Reference Guide
Adds two float32 vectors and negates the sum. The corresponding instruction is VADDNPS. This intrinsic applies only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Without Mask: extern __m512 __cdecl _mm512_addn_ps(__m512 v2, __m512 v3);
With Mask: extern __m512 __cdecl _mm512_mask_addn_ps(__m512 v1_old, __mmask16 k1, __m512 v2, __m512 v3);
v2: float32 vector used as the first operand of the addition
v3: float32 vector used as the second operand of the addition
v1_old: Source vector that holds the old values of the destination vector; the result takes the corresponding elements from v1_old where the mask bit is zero
k1: Writemask; only elements whose corresponding bit in k1 is set to '1' are computed and stored in the result; elements whose corresponding bit in k1 is zero are copied from the corresponding elements of v1_old
Performs an element-by-element addition between float32 vector v2 and float32 vector v3, and negates the sum.
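The element-by-element behavior described above can be sketched as a scalar reference model in plain C. This is only an illustration, not the compiler's implementation: the function name addn_ps_model and the LANES constant are hypothetical, and the model assumes a 512-bit vector holds 16 float32 elements, represented here as an ordinary array.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 16 /* assumption: 512 bits / 32 bits per float32 element */

/* Hypothetical scalar model of _mm512_addn_ps:
   each result element is the negated sum of the matching inputs. */
static void addn_ps_model(float dst[LANES],
                          const float v2[LANES],
                          const float v3[LANES])
{
    assert(dst != NULL && v2 != NULL && v3 != NULL);
    for (size_t i = 0; i < LANES; ++i) {
        dst[i] = -(v2[i] + v3[i]);
    }
}
```

For instance, if element 15 of v2 is 15.0 and element 15 of v3 is 1.0, the model stores -16.0 in element 15 of the result.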
The masked variant has two additional arguments: v1_old and k1. Those elements of v2 and v3 with the corresponding bit clear in vector mask k1 are not used in the computation. Instead, the corresponding element from v1_old is copied to the resulting vector.
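The masked behavior can be sketched the same way in plain C. The names mask_addn_ps_model and LANES are hypothetical, and the model makes two assumptions not stated above: that a 512-bit vector holds 16 float32 elements, and that bit i of k1 controls element i of the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* assumption: 512 bits / 32 bits per float32 element */

/* Hypothetical scalar model of _mm512_mask_addn_ps.
   Assumes bit i of k1 controls element i. */
static void mask_addn_ps_model(float dst[LANES],
                               const float v1_old[LANES],
                               uint16_t k1,
                               const float v2[LANES],
                               const float v3[LANES])
{
    assert(dst != NULL && v1_old != NULL && v2 != NULL && v3 != NULL);
    for (size_t i = 0; i < LANES; ++i) {
        if ((k1 >> i) & 1u) {
            dst[i] = -(v2[i] + v3[i]); /* mask bit set: compute negated sum */
        } else {
            dst[i] = v1_old[i];        /* mask bit clear: keep old value */
        }
    }
}
```

With k1 = 0x0005, only elements 0 and 2 receive the negated sum; every other element of the result is copied unchanged from v1_old.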
Returns the negated sum of v2 and v3. In the masked variant, elements whose k1 bit is clear are taken from v1_old instead.