Intel® C++ Compiler 16.0 User and Reference Guide

_mm512_packstorehi_pd/_mm512_mask_packstorehi_pd

Packs the mask-enabled elements of a float64 vector to form an unaligned float64 stream and stores the portion of that stream that maps to the high 64-byte-aligned portion of the memory destination. The corresponding instruction is VPACKSTOREHPD. This intrinsic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern void __cdecl _mm512_packstorehi_pd(void* mt, __m512d v1);

With Mask

extern void __cdecl _mm512_mask_packstorehi_pd(void* mt, __mmask8 k1, __m512d v1);

Arguments

v1

source vector to store elements from

k1

vector mask to select elements to add to the stream

mt

memory location to store vector elements

Description

Packs the mask-enabled elements of float64 vector v1 into a float64 stream logically mapped starting at the element-aligned address (mt − 64), and stores the high-64-byte elements of that stream: those elements that map at or after the first 64-byte-aligned address following (mt − 64), which is the high cache line in the current implementation. The length of the stream depends on the number of enabled mask bits, because elements disabled by the mask are not added to the stream.

The mask parameter k1 is not used as a writemask for this function. Instead, the mask is used as an element selector, choosing which elements are added to the stream.

In conjunction with _mm512_packstorelo_pd, this function is useful for packing data into a queue. Also in conjunction with _mm512_packstorelo_pd, it allows unaligned vector stores (that is, vector stores that are only element-wise, not vector-wise, aligned). The typical intrinsic sequence to perform an unaligned vector store would be:

_mm512_packstorelo_pd(mt, v1);
_mm512_packstorehi_pd(mt+64, v1);
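
The packing rule and the lo/hi sequence above can be modeled in portable scalar C, which may help when reasoning about which elements each call writes. This is a minimal sketch based on the description on this page, not Intel code: the names model_mask_packstorelo_pd, model_mask_packstorehi_pd, and model_storeu_pd are hypothetical, uint8_t stands in for __mmask8, and a plain array of eight doubles stands in for __m512d.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of _mm512_mask_packstorelo_pd (an assumption,
   not Intel code): the stream starts at mt, and only the elements that map
   before the first 64-byte boundary after mt are written. */
static void model_mask_packstorelo_pd(void *mt, uint8_t k1, const double v1[8])
{
    unsigned n = 0;                      /* elements added to the stream so far */
    assert((uintptr_t)mt % sizeof(double) == 0);   /* element-aligned address */
    for (int i = 0; i < 8; ++i) {
        if (!(k1 & (1u << i)))
            continue;                    /* disabled elements are not packed */
        ((double *)mt)[n] = v1[i];
        ++n;
        if (((uintptr_t)mt + n * sizeof(double)) % 64 == 0)
            break;                       /* next slot belongs to the high part */
    }
}

/* Hypothetical scalar model of _mm512_mask_packstorehi_pd: the stream
   logically starts at (mt - 64), and only the elements that map at or after
   the first 64-byte boundary following (mt - 64) are written. */
static void model_mask_packstorehi_pd(void *mt, uint8_t k1, const double v1[8])
{
    uintptr_t start = (uintptr_t)mt - 64;
    unsigned n = 0;
    int crossed = 0;                     /* set once the boundary is reached */
    assert((uintptr_t)mt % sizeof(double) == 0);
    for (int i = 0; i < 8; ++i) {
        if (!(k1 & (1u << i)))
            continue;
        if (crossed)
            ((double *)((char *)mt - 64))[n] = v1[i];
        ++n;
        if ((start + n * sizeof(double)) % 64 == 0)
            crossed = 1;
    }
}

/* Model of the unaligned store sequence: lo at dst, hi at dst + 64 bytes,
   with all eight elements enabled. */
static void model_storeu_pd(double *dst, const double v1[8])
{
    model_mask_packstorelo_pd(dst, 0xFF, v1);
    model_mask_packstorehi_pd((char *)dst + 64, 0xFF, v1);
}
```

In this model, if buf is 64-byte aligned, model_storeu_pd(buf + 5, v) writes v[0] through v[2] in the lo call (bytes 40 to 63) and v[3] through v[7] in the hi call, so the two calls together store eight contiguous doubles. When dst is already 64-byte aligned, the lo call writes all eight elements and the hi call writes nothing.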

Returns

Returns nothing.