Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help

Configuring GPU Analysis from Command Line

Use the -knob option to configure Intel® VTune™ Amplifier to profile applications that use a Graphics Processing Unit (GPU) for rendering, video processing, and computations. GPU Analysis monitors overall GPU activity (graphics, media, and compute), collects Intel® HD Graphics and Intel® Iris™ Graphics hardware metrics, and shows this data correlated with CPU processes and threads.

The following knobs are supported for GPU Analysis:

Knob Name

Supported Analysis Types

Description

enable-gpu-usage=true | false

runss, runsa

Analyze frame rate and usage of Processor Graphics engines.

gpu-counters-mode=none | overview | global-local-accesses | compute-extended | full-compute

gpu-hotspots, runss, runsa

Analyze performance data from Processor Graphics based on the GPU Metrics Reference.

  • overview - track general GPU memory accesses such as Memory Read/Write Bandwidth, GPU L3 Misses, Sampler Busy, Sampler Is Bottleneck, and GPU Memory Texture Read Bandwidth. These metrics can be useful for both graphics and compute-intensive applications.

  • global-local-accesses - include metrics that distinguish access to different types of data on a GPU: Untyped Memory Read/Write Bandwidth, Typed Memory Read/Write Transactions, SLM Read/Write Bandwidth, Render/GPGPU Command Streamer Loaded, and GPU EU Array Usage. These metrics are useful for compute-intensive workloads on the GPU.

  • compute-extended - analyze GPU activity on Intel processors code named Broadwell. This metric set is disabled on other systems.

  • full-compute (preview) - collect both overview and compute-basic metrics with the allow-multiple-runs option enabled to analyze all types of EU array stalled/idle issues in the same view.

This option is available only for supported platforms with the Intel Graphics Driver installed.

gpu-sampling-interval=<value in us>

gpu-hotspots, runss, runsa

Set the interval between GPU samples, from 10 to 1000 microseconds. The default is 1000 us. An interval of less than 100 us is not recommended.

enable-gpu-runtimes=true | false

gpu-hotspots, runss, runsa

Capture the execution time of OpenCL™ kernels and Intel Media SDK programs on a GPU, identify performance-critical GPU computing tasks, and analyze the performance per GPU hardware metrics.

Note

OpenCL kernel analysis is currently supported for Windows and Linux target systems with Intel HD Graphics and Intel Iris Graphics. Intel® Media SDK program analysis is supported for Linux targets only and should be started with root privileges.
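The value ranges listed above can be checked before a collection starts. The following is a minimal sketch, not part of VTune Amplifier: gpu_hotspots_cmd is a hypothetical helper name, and it only prints the command line it would use instead of running amplxe-cl. It assumes the knob values are exactly those documented in this topic.

```shell
# Hypothetical helper, not part of VTune Amplifier: validates two knob values
# against the values documented in this topic and prints the resulting
# command line instead of running it.
gpu_hotspots_cmd() {
    mode=$1
    interval=$2
    shift 2

    # Accept only the documented gpu-counters-mode values.
    case "$mode" in
        none|overview|global-local-accesses|compute-extended|full-compute) ;;
        *) echo "unsupported gpu-counters-mode: $mode" >&2; return 1 ;;
    esac

    # Accept only whole numbers of microseconds from 10 to 1000.
    case "$interval" in
        ''|*[!0-9]*) echo "interval must be a whole number of microseconds" >&2; return 1 ;;
    esac
    if [ "$interval" -lt 10 ] || [ "$interval" -gt 1000 ]; then
        echo "gpu-sampling-interval must be between 10 and 1000 us" >&2
        return 1
    fi
    if [ "$interval" -lt 100 ]; then
        echo "warning: an interval below 100 us is not recommended" >&2
    fi

    printf '%s\n' "amplxe-cl -collect gpu-hotspots -knob gpu-counters-mode=$mode -knob gpu-sampling-interval=$interval -- $*"
}

# Dry run: print the command for the overview counter set sampled every 500 us.
gpu_hotspots_cmd overview 500 BitonicSort -g
```

The helper does not add any other options, such as allow-multiple-runs for the full-compute counter set. Copy the printed line into a terminal once the values look right.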

Examples

Example 1: Running Analysis for an Intel Media SDK Application

This example starts amplxe-cl as root and launches the GPU Hotspots analysis for an Intel Media SDK application:

$ amplxe-cl -collect gpu-hotspots -knob enable-gpu-runtimes=true -r quadrant_r001 -- BitonicSort
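Because this example expects the collection to be started as root, a wrapper script may want to check the user ID first. The following is a minimal sketch under that assumption. check_root_for_media_sdk is a hypothetical function name, and its optional argument exists only so the check can be tried with an arbitrary user ID.

```shell
# Hypothetical guard, not part of VTune Amplifier: succeeds only for user ID 0.
# The optional argument overrides the detected user ID.
check_root_for_media_sdk() {
    uid=${1:-$(id -u)}
    if [ "$uid" -ne 0 ]; then
        echo "Intel Media SDK analysis should be started with root privileges (current uid: $uid)" >&2
        return 1
    fi
    return 0
}

# Report whether the current user can start the collection.
if check_root_for_media_sdk; then
    echo "running as root: OK to start the collection"
fi
```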

Example 2: Running Analysis with OpenCL Kernels Tracing

Run the GPU Hotspots analysis, or a custom analysis with the enable-gpu-usage knob enabled, to analyze GPU usage of the processor graphics engines. Select the overview counter set with the gpu-counters-mode knob. This counter set is available only on a supported platform with an Intel Graphics Driver installed. To trace OpenCL kernel execution, enable the enable-gpu-runtimes knob.

For example, to run the GPU Hotspots analysis, collect GPU hardware metrics, and trace OpenCL kernels for the BitonicSort application (-g is an option of the application, not of amplxe-cl), enter:

$ amplxe-cl -collect gpu-hotspots -knob gpu-counters-mode=overview -knob enable-gpu-runtimes=true -- BitonicSort -g
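The custom analysis variant mentioned in this example can be sketched in the same way. The helper below is hypothetical and only prints a command line. It assumes that the -collect-with option selects the runss or runsa collector in your version of amplxe-cl, so verify this against the command-line help before relying on it.

```shell
# Hypothetical helper, not part of VTune Amplifier: prints (does not run)
# a custom analysis command line with GPU usage and OpenCL tracing enabled.
# Assumption: -collect-with selects the runss or runsa collector.
custom_gpu_cmd() {
    collector=$1
    shift

    # enable-gpu-usage is listed in this topic for runss and runsa only.
    case "$collector" in
        runss|runsa) ;;
        *) echo "enable-gpu-usage is listed for runss and runsa only" >&2; return 1 ;;
    esac

    printf '%s\n' "amplxe-cl -collect-with $collector -knob enable-gpu-usage=true -knob gpu-counters-mode=overview -knob enable-gpu-runtimes=true -- $*"
}

# Dry run: print the command for the runsa collector.
custom_gpu_cmd runsa BitonicSort -g
```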

GPU Analysis on Android* System

Note

This analysis is supported with the VTune Amplifier for Systems only.

You can enable GPU Analysis for algorithm analysis types on Android systems with Intel HD Graphics and Intel Iris Graphics by using knobs such as enable-gpu-usage and gpu-counters-mode, as the following example shows.

Example

This example runs the GPU Hotspots analysis and monitors GPU usage.

host>./amplxe-cl -collect gpu-hotspots -target-system=android -r quadrant_r001 -target-process com.intel.fluid -knob enable-gpu-usage=true -knob gpu-counters-mode=overview
