Intel® C++ Compiler 16.0 User and Reference Guide
Multiply float64 vectors. The corresponding instruction is VMULPD. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Without Mask | extern __m512d __cdecl _mm512_mul_pd(__m512d v2, __m512d v3);
With Mask | extern __m512d __cdecl _mm512_mask_mul_pd(__m512d v1_old, __mmask8 k1, __m512d v2, __m512d v3);
v2 | float64 vector multiplied by float64 vector v3
v3 | float64 vector multiplied by float64 vector v2; can contain the result of a swizzle/broadcast/conversion process on a memory or float64 vector
v1_old | Source vector that retains the old values of the destination vector; the resulting vector takes the corresponding elements from v1_old where the mask bit is zero
k1 | Writemask; only those elements of the source vectors whose corresponding bit in k1 is set to '1' are computed and stored in the result; result elements whose k1 bit is zero are copied from the corresponding elements of v1_old
Performs an element-by-element multiplication between float64 vector v2 and the float64 vector v3.
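The element-by-element behavior can be illustrated with a scalar reference model. This is only a sketch and does not use the intrinsic itself: the name ref_mul_pd and the LANES constant are hypothetical, and plain double arrays stand in for the 512-bit vector type, which is assumed to hold eight float64 elements.

```c
#include <stddef.h>

/* Assumption: a 512-bit vector holds eight 64-bit floating-point elements. */
enum { LANES = 8 };

/* Hypothetical scalar model of the unmasked form:
   dst[i] = v2[i] * v3[i] for every element i. */
void ref_mul_pd(double dst[LANES], const double v2[LANES], const double v3[LANES])
{
    for (size_t i = 0; i < LANES; ++i)
        dst[i] = v2[i] * v3[i];
}
```

Each result element depends only on the elements at the same position in v2 and v3; no values cross element boundaries.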
The masked variant takes two additional arguments: v1_old and k1. Only those elements of the source vectors whose corresponding bit is set in the vector mask k1 are multiplied. The remaining elements of the resulting vector are filled with the corresponding elements from v1_old.
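The write-mask behavior can be sketched as a scalar reference model. This is an illustration under stated assumptions rather than the intrinsic's implementation: the name ref_mask_mul_pd and the LANES constant are hypothetical, plain double arrays stand in for the vector type, a uint8_t stands in for __mmask8, and bit i of the mask is taken to control element i (bit 0 being the lowest element).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 512-bit vector holds eight 64-bit floating-point elements. */
enum { LANES = 8 };

/* Hypothetical scalar model of the masked form.
   Assumption: bit i of k1 controls element i. */
void ref_mask_mul_pd(double dst[LANES], const double v1_old[LANES], uint8_t k1,
                     const double v2[LANES], const double v3[LANES])
{
    for (size_t i = 0; i < LANES; ++i) {
        if ((k1 >> i) & 1u)
            dst[i] = v2[i] * v3[i];   /* mask bit set: compute the product */
        else
            dst[i] = v1_old[i];       /* mask bit clear: keep the old value */
    }
}
```

With k1 set to 0x0F in this model, the lower four elements receive products and the upper four are copied unchanged from v1_old.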
Returns the result of the multiplication operation.