Intel® C++ Compiler 16.0 User and Reference Guide

Overview: Intel® Graphics Technology

This topic only applies to Intel® 64 and IA-32 architectures targeting Intel® Graphics Technology.

Many models of Intel processors that include Intel® Graphics Technology can execute a reasonable amount of parallelizable work on the processor graphics. In many cases, you can enable this offloading by adding a minimal amount of code. The Intel® C++ Compiler facilitates offloading existing scalar or parallel C/C++ code written for the CPU to the processor graphics.

Architecture and OS support for offloading to processor graphics is shown in the following table:

Build Architecture and OS

Executing Architecture and OS, for processors with Intel® Graphics Technology

Intel® 64, Linux*

Intel® 64, Linux

Intel® 64, Windows*

Intel® 64, Windows

IA-32, Windows

Note

In addition to a supported processor, running an application that offloads computing to the processor graphics requires the Intel® HD Graphics Driver to be installed to provide the necessary runtime support.

The compiler supports separate compilation and linking of target code that runs on Intel® Graphics Technology. The open source binutils package provides the linker support for linking the target kernel. See the Release Notes for more information.

The compiler generates sections of code to run natively on the CPU, that is, the host, and to offload to the processor graphics, that is, the target. The offload runtime refers to the runtime libraries used to organize offload operations to the target.

The compiler and the offload runtime enable the following:

Additionally, the compiler and the offload runtime facilitate the following:

Programming for Intel® Graphics Technology

When you compile a source file that contains offload extensions for Intel® Graphics Technology, the resulting object file contains the target object embedded within it. This object file is called a fat object. The name of the target object section is .gfxobj. When you link fat objects, the target executable is:

You can extract the target object or executable from a fat object or fat executable with the offload_extract tool.

The compiler provides the following language extensions to facilitate programming for Intel® Graphics Technology:

| Name | Description |
|------|-------------|
| offload pragma, offload_attribute pragma | Pragmas to control the data transfer between the CPU and the processor graphics. |
| __GFX__ macro, __INTEL_OFFLOAD macro | Predefined macros you can use when programming for Intel® Graphics Technology. |
| target(gfx) attribute, target(gfx_kernel) attribute | Attributes to place variables and functions on the target. |
| Intrinsics and built-in functions | Built-in functions specifically supporting heterogeneous programming for Intel® Graphics Technology, as well as support for many existing CPU intrinsics. |
| API for asynchronous offloading | Functions to facilitate the organization of queued offloading of user-defined kernel functions and data sharing between the CPU and processor graphics. |
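As an illustration of the predefined macros and the target(gfx) attribute, the following is a minimal sketch rather than a definitive usage pattern. It assumes that __INTEL_OFFLOAD is defined when the compiler processes offload constructs and that __GFX__ is defined while code is compiled for the target; confirm both in the macro reference. The names GFX_TARGET, clamp_to_byte, build_variant, and check_clamp are hypothetical and do not come from the compiler.

```c
#include <assert.h>

/* GFX_TARGET is a hypothetical convenience macro. It expands to the
 * target(gfx) attribute only when the compiler advertises offload support
 * (assumption: __INTEL_OFFLOAD is defined in that case), so the same source
 * still builds with compilers that know nothing about offloading. */
#if defined(__INTEL_OFFLOAD)
#define GFX_TARGET __declspec(target(gfx))
#else
#define GFX_TARGET
#endif

/* A small function that is also made available to target code. */
GFX_TARGET int clamp_to_byte(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Reports which variant of the function was built.
 * Assumption: __GFX__ is defined only during compilation for the target. */
GFX_TARGET int build_variant(void)
{
#if defined(__GFX__)
    return 1; /* compiled for the processor graphics */
#else
    return 0; /* compiled for the CPU */
#endif
}

/* Host-side sanity check; plain C that is never offloaded. */
void check_clamp(void)
{
    assert(clamp_to_byte(-5) == 0);
    assert(clamp_to_byte(42) == 42);
    assert(clamp_to_byte(300) == 255);
}
```

Guarding the attribute behind a macro keeps one source file usable for both offload and non-offload builds, at the cost of hiding the attribute from readers who search for it directly.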

Additionally, to offload a parallel loop to Intel® Graphics Technology, use Intel® Cilk™ Plus _Cilk_for loops or array notation.
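The following sketch shows what offloading a parallel loop might look like. It assumes the offload pragma accepts a pin clause of the form shown, and it falls back to an ordinary serial loop when __INTEL_OFFLOAD is not defined; the function name vec_add is hypothetical. Check the pragma reference for the exact clauses supported by your compiler version.

```c
#include <assert.h>

/* Element-wise vector addition: c[i] = a[i] + b[i] for 0 <= i < n.
 * vec_add is a hypothetical name used only for this illustration. */
void vec_add(float *a, float *b, float *c, int n)
{
    assert(n >= 0);
#ifdef __INTEL_OFFLOAD
    /* Offload path (assumed syntax): the pragma marks the following
     * _Cilk_for loop for execution on the processor graphics, and the
     * pin clause describes the arrays shared with the target. */
    #pragma offload target(gfx) pin(a, b, c : length(n))
    _Cilk_for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
#else
    /* Fallback for compilers without offload support: a serial loop. */
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
#endif
}
```

Because the loop body is identical on both paths, the result does not depend on whether the work runs on the CPU or on the processor graphics.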

There are two modes for sharing memory between the CPU and the processor graphics:

The section in this document on Shared Virtual Memory explains some differences in programming necessary to use SVM mode.

Building for Intel® Graphics Technology

The compiler provides the following compiler options and environment variables that you can use when building a binary for Intel® Graphics Technology:

| Compiler Option | Description |
|-----------------|-------------|
| qoffload, Qoffload | Specifies the mode for offloading. The negative form of this option ignores language constructs for offloading. |
| qoffload-attribute-target, Qoffload-attribute-target | Flags every global routine and global data object in the source file with the offload attribute target(gfx). |
| qoffload-option, Qoffload-option | Specifies options to be used for the specified target and tool. |
| qoffload-arch, Qoffload-arch | Specifies the target architecture to use when offloading code. |
| mgpu-asm-dump, Qgpu-asm-dump | Generates a native assembly listing for the processor graphics code to be offloaded. |
| mgpu-arch, Qgpu-arch | Builds the offload code for graphics to run on a particular graphics processor as specified by the option value. |

The following environment variables are only a few of the available environment variables for Intel® Graphics Technology:

| Environment Variable | Description |
|----------------------|-------------|
| GFX_CPU_BACKUP | Controls whether heterogeneous code is executed on the host when the target is not available. |
| GFX_MAX_THREAD_COUNT | Controls the maximum number of target threads to parallelize loop nests. |
| GFX_OFFLOAD_TIMEOUT | Controls execution time of offload tasks. |
| GFX_SHOW_TIME | Controls printing of timing information at the end of execution. |
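When diagnosing offload behavior, it can help to log which of these variables are set in the process environment before any offload runs. The sketch below uses only the standard C getenv function and the variable names listed above. It does not interpret the values, because their accepted settings are not described here. The helper names env_or and report_gfx_env are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value of the environment variable `name`, or `fallback`
 * when the variable is unset or empty. Hypothetical helper. */
const char *env_or(const char *name, const char *fallback)
{
    assert(name != NULL);
    const char *value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}

/* Prints the current setting of each variable to `out` and returns how
 * many of them are set to a non-empty value. Hypothetical helper. */
int report_gfx_env(FILE *out)
{
    static const char *const names[] = {
        "GFX_CPU_BACKUP",
        "GFX_MAX_THREAD_COUNT",
        "GFX_OFFLOAD_TIMEOUT",
        "GFX_SHOW_TIME"
    };
    int set_count = 0;

    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        const char *value = getenv(names[i]);
        if (value != NULL && value[0] != '\0')
            set_count++;
        fprintf(out, "%s=%s\n", names[i], env_or(names[i], "(not set)"));
    }
    return set_count;
}
```

Calling report_gfx_env(stderr) once at program start records the configuration alongside any other diagnostic output.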
