Intel® C++ Compiler 16.0 User and Reference Guide
This topic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
Streams can be used to offload multiple concurrent computations to a device on Intel® MIC Architecture from a single CPU thread.
A stream is a logical queue of offloads. Offloads in any one stream complete in the order in which they were issued to the stream.
To use this feature, specify the stream clause in #pragma offload or #pragma offload_transfer.
To wait for all offloads issued to a stream, specify the stream clause in #pragma offload_wait.
The following API creates a stream and specifies the number of threads allocated to it:
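The in-order rule described above can be pictured with a small host-only model. This is a sketch under stated assumptions, not the Intel offload runtime: ModelStream, issue, complete_next, wait_all, and completion_order_demo are hypothetical names introduced only for illustration, and the "offloads" are ordinary host callables.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical host-only model of a stream: a FIFO queue of pending tasks.
// It is NOT the Intel offload runtime; it only illustrates in-order completion.
class ModelStream {
public:
    // Queue a task; nothing runs yet (models an asynchronous issue).
    void issue(std::function<void()> task) {
        assert(static_cast<bool>(task));  // the task must be callable
        pending_.push_back(std::move(task));
    }

    // Run the oldest pending task, if any. Returns false when the queue is empty.
    bool complete_next() {
        if (pending_.empty()) return false;
        std::function<void()> task = std::move(pending_.front());
        pending_.pop_front();
        task();
        return true;
    }

    // Drain the queue in issue order (models a wait on the whole stream).
    void wait_all() {
        while (complete_next()) {
        }
    }

    // True when no task is pending (models a non-blocking completion test).
    bool completed() const { return pending_.empty(); }

private:
    std::deque<std::function<void()>> pending_;
};

// Issue three tasks to one stream and return the order in which they completed.
std::vector<int> completion_order_demo() {
    std::vector<int> order;
    ModelStream s;
    for (int id = 1; id <= 3; ++id) {
        s.issue([&order, id] { order.push_back(id); });
    }
    s.wait_all();
    return order;
}
```

In this model, completion_order_demo() records the tasks in the same order in which they were issued, which is the guarantee a single stream provides. Separate ModelStream objects would have no ordering relationship with one another.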
    OFFLOAD_STREAM* handle = _Offload_stream_create(
        int device,           // Intel® MIC Architecture device number
        int number_of_cpus);  // Threads allocated to the stream
After a stream has been created, it is tied to a target device, so there is no need to specify a device when offloading to a stream. This is in contrast to non-stream offloads, which always require a target device specification.
The following API destroys a stream and returns the device threads to the pool for future streams:
    int _Offload_stream_destroy(
        _Offload_stream stream);  // The stream
This API returns true (a nonzero value) if the stream was successfully destroyed.
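The create and destroy pairing can be sketched as simple thread accounting on one device. This is a host-only model under stated assumptions, not the Intel runtime: ModelDevice and its member functions are hypothetical, and the rule that creation fails when too few threads are free is an assumption of the model rather than documented behavior.

```cpp
#include <cassert>
#include <map>

// Hypothetical model of per-device thread accounting; NOT the Intel runtime.
class ModelDevice {
public:
    explicit ModelDevice(int total_threads) : free_threads_(total_threads) {
        assert(total_threads >= 0);
    }

    // Reserve threads for a new stream.
    // Returns a nonzero handle, or 0 if the request cannot be satisfied
    // (an assumption of this model).
    int stream_create(int number_of_cpus) {
        if (number_of_cpus <= 0 || number_of_cpus > free_threads_) return 0;
        free_threads_ -= number_of_cpus;
        reserved_[next_handle_] = number_of_cpus;
        return next_handle_++;
    }

    // Destroy a stream and return its threads to the pool.
    // Returns true if the handle named a live stream.
    bool stream_destroy(int handle) {
        std::map<int, int>::iterator it = reserved_.find(handle);
        if (it == reserved_.end()) return false;
        free_threads_ += it->second;
        reserved_.erase(it);
        return true;
    }

    // Threads currently available for future streams.
    int free_threads() const { return free_threads_; }

private:
    int free_threads_;
    int next_handle_ = 1;
    std::map<int, int> reserved_;  // handle -> threads reserved by that stream
};
```

In this model, destroying a stream makes its threads available to later stream_create calls, and destroying the same handle twice reports failure the second time.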
Offloads can be issued to a stream using the following syntax:
    // Issue an offload to a stream
    #pragma offload … stream(handle)
You can use a signal clause to identify a particular offload issued to a stream. The signal identifier can later be used to wait for completion of that specific offload; for example:
    // Issue offload to a stream and identify with a signal
    #pragma offload … stream(handle) signal(s)
A wait can be specified for all offloads in a stream or for a particular offload issued to a stream.
The following example shows how to specify a wait for completion of all offloads in a stream:
    // Issue offload to a stream
    #pragma offload … stream(handle)
    { … }
    …
    // Issue another offload to that stream
    #pragma offload … stream(handle)
    { … }
    …
    // Wait for all offloads in that stream to complete
    #pragma offload_wait stream(handle)
The following example shows how to specify a wait for completion of a particular offload in a stream:
    // Issue offload to a stream and identify with signal value s1
    #pragma offload … stream(handle) signal(s1)
    { … }
    …
    // Issue offload to a stream and identify with signal value s2
    #pragma offload … stream(handle) signal(s2)
    { … }
    …
    // Wait for offload with signal value s1 to complete
    #pragma offload_wait stream(handle) wait(s1)
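The difference between waiting on one signal and waiting on the whole stream can be illustrated with a host-only model. This is a sketch, not the Intel runtime: SignalStream, issue, wait_signal, and wait_all are hypothetical names, and each offload is represented only by its integer signal value. The model leaves later offloads pending after a signal wait; in a real run they may also have finished by the time the wait returns.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical host-only model, NOT the Intel offload runtime.
// Each queued offload is represented only by its integer signal value.
struct SignalStream {
    std::deque<int> pending;    // offloads not yet complete, oldest first
    std::vector<int> finished;  // signal values in completion order

    void issue(int signal) { pending.push_back(signal); }

    // Models a wait on one signal: complete offloads in issue order
    // until the one tagged `signal` has finished.
    void wait_signal(int signal) {
        // Nothing to do if that offload is not pending
        // (it already finished, or it was never issued).
        if (std::find(pending.begin(), pending.end(), signal) == pending.end()) {
            return;
        }
        while (true) {
            assert(!pending.empty());  // guaranteed by the find above
            int s = pending.front();
            pending.pop_front();
            finished.push_back(s);
            if (s == signal) break;
        }
    }

    // Models a wait on the whole stream: complete everything queued.
    void wait_all() {
        while (!pending.empty()) {
            finished.push_back(pending.front());
            pending.pop_front();
        }
    }
};
```

Because a stream completes offloads in issue order, a wait on a given signal in this model also implies that every offload issued to the stream before it has completed.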
The following example shows how to specify a wait for completion of all offloads in all streams:
    // Issue offload to stream 1
    #pragma offload … stream(handle1)
    { … }
    …
    // Issue offload to stream 2
    #pragma offload … stream(handle2)
    { … }
    …
    // Wait for completion of all offloads in all streams using handle 0
    #pragma offload_wait stream(0)
This feature includes non-blocking APIs that return a Boolean value to test whether:
All stream offloads to a specific device have completed
All stream offloads on all devices have completed
A particular stream offload has completed
The following function tests whether all offloads to the specified stream have completed. Specifying 0 for the stream tests whether offloads to all streams have completed.
    int _Offload_stream_completed(
        _Offload_stream stream);  // The stream
The following function tests whether all offloads to the specified device have completed. Specifying -1 for the device will test whether all stream offloads on all devices have completed.
    int _Offload_device_streams_completed(
        int device);  // Intel® MIC Architecture device number
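The special values 0 (all streams) and -1 (all devices) described above can be illustrated with a small bookkeeping model. This is a host-only sketch, not the Intel runtime: ModelRuntime and its members are hypothetical, handles are plain integers, and each stream tracks only a count of pending offloads.

```cpp
#include <cassert>
#include <map>

// Hypothetical bookkeeping model, NOT the Intel offload runtime.
struct ModelRuntime {
    struct StreamState {
        int device;   // device the stream is tied to
        int pending;  // offloads issued but not yet complete
    };

    std::map<int, StreamState> streams;  // handle -> state; handles start at 1
    int next_handle = 1;

    // Create a stream tied to `device` and return its nonzero handle.
    int create(int device) {
        streams.emplace(next_handle, StreamState{device, 0});
        return next_handle++;
    }

    // Record one more pending offload on the stream.
    void issue(int handle) {
        assert(handle != 0);  // 0 is reserved to mean "all streams"
        ++streams.at(handle).pending;
    }

    // Record completion of one offload on the stream.
    void finish_one(int handle) {
        StreamState& s = streams.at(handle);
        if (s.pending > 0) --s.pending;
    }

    // Non-blocking test for one stream; handle 0 means "all streams".
    bool stream_completed(int handle) const {
        if (handle != 0) return streams.at(handle).pending == 0;
        for (const auto& kv : streams) {
            if (kv.second.pending != 0) return false;
        }
        return true;
    }

    // Non-blocking test for one device; device -1 means "all devices".
    bool device_streams_completed(int device) const {
        for (const auto& kv : streams) {
            const StreamState& s = kv.second;
            if ((device == -1 || s.device == device) && s.pending != 0) {
                return false;
            }
        }
        return true;
    }
};
```

In this model, stream_completed(0) and device_streams_completed(-1) return the same answer, because both range over every stream on every device.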
The following example shows how to check for completion of all offloads in a stream:
    // Issue offload to a stream
    #pragma offload … stream(handle)
    { … }
    …
    // Issue another offload to that stream
    #pragma offload … stream(handle)
    { … }
    …
    // Check if all offloads in that stream have completed
    if (_Offload_stream_completed(handle)) …
The following example shows how to check for a particular offload in a stream:
    // Issue offload to a stream and identify with signal value s1
    #pragma offload … stream(handle) signal(s1)
    { … }
    …
    // Issue offload to a stream and identify with signal value s2
    #pragma offload … stream(handle) signal(s2)
    { … }
    …
    // Check if offload with signal value s1 has completed
    if (_Offload_signaled(s1)) …
The following example shows how to check for completion of all offloads in all streams:
    // Issue offload to stream 1
    #pragma offload … stream(handle1)
    { … }
    …
    // Issue offload to stream 2
    #pragma offload … stream(handle2)
    { … }
    …
    // Check for completion of all offloads in all streams using handle 0
    if (_Offload_stream_completed(0)) …
The following example shows how to check for completion of all offloads on a device:
    // Issue offload to stream 1
    #pragma offload … stream(handle1)
    { … }
    …
    // Issue offload to stream 2
    #pragma offload … stream(handle2)
    { … }
    …
    // Check for completion of all stream offloads on device 2
    if (_Offload_device_streams_completed(2)) …
The following example shows how to check for completion of all offloads on all devices:
    // Issue offload to stream 1
    #pragma offload … stream(handle1)
    { … }
    …
    // Issue offload to stream 2
    #pragma offload … stream(handle2)
    { … }
    …
    // Check for completion of all stream offloads on all devices
    if (_Offload_device_streams_completed(-1)) …
    // OR
    if (_Offload_stream_completed(0)) …