Intel® C++ Compiler 16.0 User and Reference Guide
This topic only applies to Intel® 64 and IA-32 architectures targeting Intel® Graphics Technology.
The compiler provides two heterogeneous offload programming models that enable you to use the processor graphics:
- Synchronous offload:
  - Access this model by placing a parallel _Cilk_for loop under #pragma offload.
  - The CPU waits for the offload task to complete before continuing execution.
  - The compiler handles data sharing and kernel creation based on an offload region containing a parallel _Cilk_for loop.
- Asynchronous offload:
  - Access this model using an API.
  - The CPU continues execution until it is requested to wait for a kernel to complete.
  - You have more control over data sharing and kernel enqueueing. Data sharing and kernel enqueueing are separate, so multiple kernels can share data.
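The synchronous model described above can be sketched as a single offload region wrapped around a parallel loop. This is a minimal illustration rather than a definitive implementation: the function name `vector_add` is hypothetical, and the `target(gfx)`, `pin`, and `length` clauses are assumed to be the ones your compiler version accepts. The serial fallback branch exists only so the sketch also builds with compilers that lack `_Cilk_for`.

```cpp
#include <cassert>

// Hypothetical example function: adds two arrays element by element.
// With the Intel C++ Compiler the loop is marked for offload to the
// processor graphics; with any other compiler it runs serially on the CPU.
void vector_add(const float* a, const float* b, float* out, int n) {
    assert(n >= 0);
#if defined(__INTEL_COMPILER)
    // Assumed clause spelling: pin(...) shares the three arrays with the
    // processor graphics, and length(n) gives their element count.
    #pragma offload target(gfx) pin(a, b, out : length(n))
    _Cilk_for (int i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
#else
    // Host-only fallback so the sketch compiles everywhere.
    for (int i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
    // Synchronous model: by the time control reaches this point the
    // offloaded loop has finished and out[] holds the results.
}
```

Because the offload is synchronous, a caller can read `out` immediately after `vector_add` returns, without any explicit wait call.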
The Intel® Graphics Technology runtime and the gfx_rt.h header file provide an asynchronous API for queuing offloads of user-defined kernel functions and for sharing data between the CPU and the processor graphics, with little extra programming effort. You can use this API with named or direct kernels that use _Cilk_for as the parallel loop in the kernel entry point.
The API includes the following functions:
Name | Description |
---|---|
GfxTaskId _GFX_offload | Putting the task into the in-order offload queue |
_GFX_wait | Waiting for task completion |
_GFX_share, _GFX_unshare | Managing shared linear data |
GfxImage2D (C++ interface, class constructor), GfxSharedImage2D (C++ interface, class constructor), GfxResourceHandle _GFX_create_image_2d (C interface), _GFX_close_resource_handle (C interface) | Creating and destroying 2D images for processor graphics operations |
GfxImage2D::write (C++ interface), GfxSharedImage2D::write (C++ interface), GfxImage2D::read (C++ interface), GfxSharedImage2D::read (C++ interface), _GFX_read_image_2d (C interface), _GFX_write_image_2d (C interface) | Synchronizing the content of 2D images between CPU and GPU |
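As a rough sketch of how the linear-data functions in the table might fit together, the example below shares one buffer, enqueues two kernels that both operate on it, and then waits for them. Several details are assumptions and should be checked against gfx_rt.h: the argument lists shown for `_GFX_share`, `_GFX_offload`, `_GFX_wait`, and `_GFX_unshare`, and the `__declspec(target(gfx_kernel))` spelling for the kernel entry point. The names `scale`, `scale_twice`, `GFX_KERNEL`, and `PARALLEL_FOR` are hypothetical, and the host-only branch exists so the sketch builds without the Intel compiler.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__INTEL_COMPILER)
#include <gfx_rt.h>
// Assumed attribute spelling for a kernel entry point.
#define GFX_KERNEL __declspec(target(gfx_kernel))
#define PARALLEL_FOR _Cilk_for
#else
// Host-only fallback: plain function, serial loop.
#define GFX_KERNEL
#define PARALLEL_FOR for
#endif

// Hypothetical kernel entry point: scales every element in place.
GFX_KERNEL void scale(float* data, float factor, int n) {
    PARALLEL_FOR (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Hypothetical helper: runs two kernels on the same shared buffer.
void scale_twice(float* data, int n) {
    assert(n >= 0);
#if defined(__INTEL_COMPILER)
    // Share the buffer once; both kernels then use the same data.
    // (Assumed signature: pointer plus size in bytes.)
    _GFX_share(data, sizeof(float) * static_cast<std::size_t>(n));

    // Enqueue both kernels; the CPU does not block here.
    // (Assumed signature: kernel function followed by its arguments.)
    GfxTaskId first = _GFX_offload(&scale, data, 2.0f, n);
    GfxTaskId second = _GFX_offload(&scale, data, 3.0f, n);

    // Other CPU work could run here while the kernels execute.

    // Block until each task has completed, then stop sharing.
    _GFX_wait(first);
    _GFX_wait(second);
    _GFX_unshare(data);
#else
    // Fallback: run the same two kernels directly on the CPU.
    scale(data, 2.0f, n);
    scale(data, 3.0f, n);
#endif
}
```

Keeping the share and unshare calls outside the individual offloads is what lets the second kernel reuse the data the first one modified, which reflects the separation of data sharing and kernel enqueueing in the asynchronous model.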