Intel® C++ Compiler 16.0 User and Reference Guide
Stores an aligned float32 vector with a no-read hint. The corresponding instruction is VMOVNRAPS. This intrinsic applies only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
extern void __cdecl _mm512_storenr_ps(void* mt, __m512 v1);
mt | memory address to store vector elements (must be 64-byte aligned)
v1 | source vector to store elements from
Stores the 16 single-precision floating-point elements of float32 vector v1 to the memory address mt with a no-read hint.
This intrinsic is intended to speed up stores in streaming kernels. When a store overwrites an entire cache line, reading the original contents of that line from memory first would waste memory bandwidth; the no-read hint avoids that read.
Returns nothing.
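The following is a minimal sketch of how the intrinsic might be used in a streaming fill loop; it is not taken from the guide. The helper name fill_stream and its parameters are hypothetical. The sketch assumes that dst is 64-byte aligned and that n is a multiple of 16, so every store overwrites one complete cache line. Because the intrinsic is only available for Intel® MIC Architecture, the sketch guards it with the __MIC__ predefined macro and falls back to a plain scalar loop on other targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __MIC__
#include <immintrin.h>
#endif

/* Hypothetical helper (not part of the guide): fills n floats at dst with value.
   Assumptions: dst is 64-byte aligned and n is a multiple of 16. */
void fill_stream(float *dst, size_t n, float value)
{
    assert(((uintptr_t)dst % 64) == 0); /* alignment required by the intrinsic */
    assert(n % 16 == 0);                /* whole 16-element vectors only */

#ifdef __MIC__
    /* Intel MIC Architecture path: each store writes a full 64-byte cache
       line with a no-read hint, so the old line contents need not be read. */
    __m512 v = _mm512_set1_ps(value);
    for (size_t i = 0; i < n; i += 16) {
        _mm512_storenr_ps(dst + i, v);
    }
#else
    /* Fallback for other targets: plain scalar stores, same result. */
    for (size_t i = 0; i < n; ++i) {
        dst[i] = value;
    }
#endif
}
```

The hint is only likely to pay off when the loop overwrites whole cache lines, as it does here; any benefit over a regular aligned store depends on the kernel and should be measured.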