Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help
GPU Hotspots analysis is intended for applications that use a Graphics Processing Unit (GPU) for rendering, video processing, and computations with explicit support of Intel® Media SDK and OpenCL™ software technology.
Use this analysis type to identify GPU tasks with high GPU utilization and estimate the effectiveness of this utilization. The tool infrastructure automatically aligns clocks across all cores in the entire system so that you can analyze some CPU-based workloads together with GPU-based workloads within a unified time domain.
Prerequisites: For Linux* targets, to analyze Intel HD and Intel Iris Graphics (further: Intel Graphics) hardware events on a GPU, make sure to install the Intel Media Server Studio (starting with version 2015 R5) and build the kernel driver as described in the Getting Started Guide.
Use the GPU Hotspots analysis to:
Identify how effectively your application uses OpenCL kernels (for Linux* and Windows* targets only)
Analyze execution of Intel Media SDK tasks over time (for Linux targets only)
Explore GPU usage and analyze a software queue for GPU engines at each moment of time
Explore the performance of your application per selected GPU metrics over time
To run the GPU Hotspots analysis, explore:
Configuration options (knobs)
To view configuration options for the GPU Hotspots analysis:
Click the New Analysis toolbar button.
The Analysis Type window opens.
From the left pane, select Platform Analysis > GPU Hotspots.
The GPU Hotspots configuration pane opens on the right displaying editable and predefined collection options for this analysis.
The GPU Hotspots analysis is pre-configured to collect GPU usage data, analyze GPU task scheduling and identify whether your application is CPU or GPU bound.
Configure the following GPU analysis options:
Option | Description | Supported Target System | Supported Graphics |
---|---|---|---|
GPU sampling interval, ms field | Specify an interval between GPU samples. | All | All |
Analyze Processor Graphics hardware events | Monitor the Render and GPGPU engine usage, identify which parts of the engine are loaded, and correlate GPU and CPU data. VTune Amplifier provides platform-specific presets of the hardware metrics. All presets collect data about execution unit (EU) activity: EU Array Active, EU Array Stalled, EU Array Idle, Computing Threads Started, and Core Frequency. | Windows*, Linux* (see the prerequisites above), and Android* | Intel® HD Graphics and Intel® Iris™ Graphics only (further: Intel Graphics) |
Trace OpenCL and Intel Media SDK programs | Explore execution time for runtimes, monitor performance of each program per GPU metrics, and identify hotspots. | OpenCL kernels analysis: Windows and Linux; Intel Media SDK program analysis: Linux | Intel Graphics only |
Select the Collect stacks option to analyze performance and parallelism per execution path.
To run the GPU Hotspots analysis from the command line, enter:
$ amplxe-cl -collect gpu-hotspots [-knob <knob_name=knob_option>] -- <target> [target_options]
You may generate the command line for this configuration using the Command Line... button at the bottom.
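As an illustration of the command-line syntax above, the following minimal sketch composes the amplxe-cl invocation and prints it instead of running it, so you can inspect the result before starting a collection. The helper name build_gpu_hotspots_cmd, the target ./my_app, and the knob example-knob=value are hypothetical placeholders, not actual VTune Amplifier names. To find the knobs your version actually supports, use the Command Line... button or, assuming your installation provides it, amplxe-cl -help collect gpu-hotspots.

```shell
#!/bin/sh
# Dry-run helper: builds the GPU Hotspots command line and prints it.
# Usage: build_gpu_hotspots_cmd <target> [knob_name=knob_option ...]
# The helper name and all arguments used below are hypothetical placeholders.
build_gpu_hotspots_cmd() {
    target="$1"
    shift
    cmd="amplxe-cl -collect gpu-hotspots"
    # Append one -knob option per remaining argument.
    for knob in "$@"; do
        cmd="$cmd -knob $knob"
    done
    # The "--" separator marks the start of the target command.
    printf '%s -- %s\n' "$cmd" "$target"
}

# Hypothetical target and knob; substitute real values before running.
build_gpu_hotspots_cmd ./my_app example-knob=value
# prints: amplxe-cl -collect gpu-hotspots -knob example-knob=value -- ./my_app
```

Printing the command rather than executing it keeps the sketch usable on machines without VTune Amplifier installed. Once the command looks right, paste the printed line into a shell to run it.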
VTune Amplifier runs the analysis and opens the data in the GPU Hotspots viewpoint providing various platform data in the following windows:
Summary window displays overall and per-engine GPU usage, percentage of time the EUs were stalled or idle with potential reasons for this, and the hottest GPU computing tasks.
Graphics window displays CPU and GPU usage data per thread and provides an extended list of GPU hardware metrics that help analyze accesses to different types of GPU memory.
Platform window displays data over time: GPU usage on a software queue, CPU time usage, OpenCL kernels data, and GPU performance per the selected group of GPU hardware metrics, DRAM Bandwidth, and Core Frequency.
Bottom-up window displays hotspot GPU computing tasks in the bottom-up tree, GPU metrics, and, if collected, call stacks.