Intel® VTune™ Amplifier XE can be installed on Windows*, macOS*, and Linux* platforms and used for analysis of local and remote target systems. Use this tool to analyze algorithm choices, find serial and parallel code bottlenecks, understand where and how your application can benefit from available hardware resources, and speed up execution.

VTune Amplifier XE is available as a standalone product as well as part of the following suites:

PREVIEW: Intel Performance Snapshot gives you three quick ways to discover untapped performance:

Visit the VTune Amplifier training page for videos, webinars, and more to help you get started.

Prerequisites

For system requirements, see the product Release Notes.

Step 1: Start the VTune Amplifier

  1. Set up the environment variables:

    • csh/tcsh users: source <install_dir>/amplxe-vars.csh

    • bash users: source <install_dir>/amplxe-vars.sh

    By default, the <install_dir> is:
    • For root users: /opt/intel/vtune_amplifier_xe_2017

    • For non-root users: $HOME/intel/vtune_amplifier_xe_2017

  2. Launch the VTune Amplifier:
    • For the standalone GUI, run the amplxe-gui command.

    • For the command line interface, run the amplxe-cl command.
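The setup in this step can also be scripted. The following is a minimal sketch for bash users, assuming the default install locations listed above. The helper name vtune_install_dir is hypothetical and is not part of the product. The script sources amplxe-vars.sh only if the file is actually present.

```shell
# Sketch only: vtune_install_dir is a hypothetical helper, not a VTune command.
# It returns the default <install_dir> from this guide for a given user id.
vtune_install_dir() {
    # $1 is the numeric user id; 0 means root.
    if [ "$1" -eq 0 ]; then
        echo "/opt/intel/vtune_amplifier_xe_2017"
    else
        echo "$HOME/intel/vtune_amplifier_xe_2017"
    fi
}

install_dir=$(vtune_install_dir "$(id -u)")

# Source the environment script only when it exists, so the sketch
# does not fail on a machine without the product installed.
if [ -f "$install_dir/amplxe-vars.sh" ]; then
    . "$install_dir/amplxe-vars.sh"
    echo "Environment set up from $install_dir"
else
    echo "amplxe-vars.sh not found under $install_dir; adjust install_dir"
fi
```

If the product is installed in a non-default location, set install_dir by hand instead of calling the helper.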

Step 2: Set Up the Analysis Target

  1. Build your target application in the Release mode with all optimizations enabled.

  2. Create a VTune Amplifier project:

    1. Click the menu button in the right corner and select New > Project...

    2. Specify the project name and location in the Create Project dialog box.

  3. In the Analysis Target tab, select a target system from the left pane and select an analysis target type from the right pane.

  4. Configure your target: application location, parameters, and search directories (if required).

Analysis Target tab

NEW: Select Arbitrary Targets to analyze a target that is not currently accessible from this host system. You can select a hardware platform and operating system from the list, create a command line analysis configuration, save it to the buffer, and run it later on the intended host. For more information, see Analysis Target Setup.

Step 3: Configure Analysis

  1. Switch to the Analysis Type tab.

  2. From the left pane, select an analysis type applicable to your platform and configure analysis options in the right pane.

  3. Click the Start button on the right to launch the analysis.

Analysis Type tab
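A collection can also be started from the command line with amplxe-cl. The following is a dry-run sketch: it prints the command line instead of running it, so it works without the product installed. The target ./my_app, its parameters, the result directory name r000hs, and the helper collect_cmd are hypothetical. The -collect and -result-dir options and the hotspots analysis type are assumptions to confirm with amplxe-cl -help collect for your version.

```shell
# Dry-run sketch: collect_cmd is a hypothetical helper that only prints
# the amplxe-cl command line it would run.
collect_cmd() {
    # $1 = analysis type, $2 = result directory,
    # remaining arguments = target application and its parameters
    analysis=$1
    result_dir=$2
    shift 2
    echo "amplxe-cl -collect $analysis -result-dir $result_dir -- $*"
}

# Hypothetical target and parameters.
collect_cmd hotspots r000hs ./my_app --size 1000
```

To actually run the collection, execute the printed command in a shell where the environment variables from Step 1 are set.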

Step 4: View and Analyze Performance Data

When data collection completes, the VTune Amplifier opens the result in the default viewpoint, which is a preset configuration of windows for an analysis result. You may switch between different viewpoints to analyze the data from different perspectives using different sets of performance metrics.

Start your analysis with the Summary window to get an overview of the application performance, and then switch to other windows to explore performance in more depth at the granularity of functions, source lines, and so on.

Hotspots by CPU Usage viewpoint
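A collected result can also be inspected from the command line. The following dry-run sketch prints report commands for a summary overview followed by a per-function hotspots report, mirroring the Summary-first workflow described above. The result directory r000hs and the helper report_cmd are hypothetical. The summary and hotspots report types are assumptions to confirm with amplxe-cl -help report.

```shell
# Dry-run sketch: report_cmd is a hypothetical helper that only prints
# the amplxe-cl report command line.
report_cmd() {
    # $1 = report type, $2 = result directory
    echo "amplxe-cl -report $1 -result-dir $2"
}

# Overview first, then per-function detail.
report_cmd summary r000hs
report_cmd hotspots r000hs
```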

Key Features

ALGORITHM ANALYSIS

MICROARCHITECTURE ANALYSIS

  • Run General Exploration analysis to triage hardware issues in your application. This type collects a complete list of events for analyzing a typical client application.

    See the tutorial for C++ sample code.

  • Use Memory Access analysis to identify memory-related issues, such as NUMA problems and bandwidth-limited accesses, and to attribute performance events to memory objects (data structures). This attribution relies on instrumentation of memory allocations/de-allocations and on static/global variables obtained from symbol information.

    See the tutorial for C++ sample code.

  • For systems with the Intel® Software Guard Extensions (Intel SGX) feature enabled, run SGX Hotspots analysis to identify performance-critical program units inside security enclaves. This analysis type uses the INST_RETIRED.PREC_DIST hardware event, which emulates precise clockticks and is mandatory for analysis on systems with Intel SGX enabled.

  • For Intel processors that support Intel® Transactional Synchronization Extensions (Intel TSX), run the TSX Exploration and TSX Hotspots analysis types to measure transactional success and analyze causes of transactional aborts.
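The analysis types in this list can also be selected on the command line. The following sketch maps the names used in this section to the identifiers that amplxe-cl -collect is assumed to accept. The identifiers are assumptions, so check them with amplxe-cl -help collect before relying on them. The helper collect_name and the target ./my_app are hypothetical.

```shell
# Sketch: collect_name is a hypothetical helper. The identifiers on the
# right are assumed -collect values; verify with: amplxe-cl -help collect
collect_name() {
    case "$1" in
        "General Exploration") echo "general-exploration" ;;
        "Memory Access")       echo "memory-access" ;;
        "SGX Hotspots")        echo "sgx-hotspots" ;;
        "TSX Exploration")     echo "tsx-exploration" ;;
        "TSX Hotspots")        echo "tsx-hotspots" ;;
        *)
            echo "unknown analysis type: $1" >&2
            return 1
            ;;
    esac
}

# Print (not run) a command line for one of the analysis types.
echo "amplxe-cl -collect $(collect_name "Memory Access") -- ./my_app"
```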

PLATFORM ANALYSIS

  • Run CPU/GPU Concurrency analysis to identify code regions where your application is CPU or GPU bound.

  • Use GPU Hotspots analysis to identify GPU tasks with high GPU utilization and estimate the effectiveness of this utilization.

  • Analyze hot Intel® Media SDK programs and OpenCL™ kernels running on a GPU. For OpenCL application analysis, use the Architecture Diagram to explore GPU hardware metrics per GPU architecture blocks.

  • Run Disk Input and Output analysis to monitor utilization of the disk subsystem, CPU, and processor buses. This analysis type provides a consistent view of the storage subsystem combined with hardware events and an easy-to-use method to match user-level source code with I/O packets executed by the hardware.

    See the tutorial for C++ sample code.

COMPUTE-INTENSIVE APPLICATION ANALYSIS

  • Run HPC Performance Characterization analysis to identify how effectively your application uses CPU, memory, and floating-point operation hardware resources. The HPC Performance Characterization analysis type can be used as a starting point for understanding the performance aspects of your application. Additional scalability metrics are available for applications that use Intel OpenMP* or Intel MPI runtime libraries.

  • Run an Algorithm analysis type with the Analyze OpenMP regions option enabled to collect OpenMP or MPI data for applications using OpenMP or MPI runtime libraries. Note that HPC Performance Characterization analysis has the option enabled by default.

  • Start with the OpenMP Analysis section of the Summary window to identify inefficiencies in parallelization of your OpenMP application.

  • Analyze the Potential Gain metric values per OpenMP region to understand the maximum time that could be saved if the OpenMP region were optimized to have no load imbalance, assuming no runtime overhead.

  • For hybrid OpenMP and MPI applications, explore OpenMP efficiency metrics by MPI processes lying on the critical path.

    See the tutorial for Linux Host - OpenMP and MPI hybrid sample code.
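The OpenMP and MPI collections described in this list might look as follows on the command line. This is a dry-run sketch that only prints the commands. It assumes that the Analyze OpenMP regions option corresponds to a knob named analyze-openmp, that advanced-hotspots and hpc-performance are valid -collect values, and that mpirun is your MPI launcher. Confirm these with amplxe-cl -help collect and your MPI documentation. The helper names, targets, result directories, and rank count are hypothetical.

```shell
# Dry-run sketch: both helpers are hypothetical and only print commands.

# OpenMP data with an Algorithm analysis type; the knob name is an assumption.
openmp_collect_cmd() {
    # $1 = analysis type, $2 = result directory, $3 = target
    echo "amplxe-cl -collect $1 -knob analyze-openmp=true -result-dir $2 -- $3"
}

# Hybrid OpenMP and MPI run; HPC Performance Characterization has the
# OpenMP option enabled by default, so no knob is passed here.
hybrid_collect_cmd() {
    # $1 = number of MPI ranks, $2 = result directory, $3 = target
    echo "mpirun -n $1 amplxe-cl -collect hpc-performance -result-dir $2 -- $3"
}

openmp_collect_cmd advanced-hotspots r001ah ./my_omp_app
hybrid_collect_cmd 4 r002hpc ./my_hybrid_app
```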

SOURCE ANALYSIS

  • Double-click a hotspot function to drill down to the source code and analyze performance per source line or assembler instruction. By default, the hottest line is highlighted.

  • For help on an assembly instruction, right-click the instruction in the Assembly pane and select Instruction Reference from the context menu.

MANAGED CODE ANALYSIS

Configure target options for managed code analysis in the native, managed, or mixed mode:

  • Event-based sampling (EBS) or user-mode sampling and tracing analysis for Java* applications running in the Launch Application or Attach mode;

  • Basic Hotspots analysis for Python* applications running in the Launch Application and Attach to Process modes.

CUSTOM ANALYSIS

  • Select the Custom Analysis branch in the analysis tree to create your own analysis configurations using any of the available VTune Amplifier data collectors.

  • Run your own custom collector from the VTune Amplifier to get the aggregated performance data from both your custom collection and the VTune Amplifier analysis in the same result.

  • Import performance data collected by your own or a third-party collector into a VTune Amplifier result that was collected in parallel with your external collection. Use the Import from CSV button to integrate the external data into the result.

  • Collect data from a remote virtual machine by configuring KVM guest OS profiling, which makes use of the Linux Perf KVM feature. Select Analyze KVM guest OS from the Advanced options.

Training and Documentation

Document

Description

Online Training

The online training site is an excellent resource for learning VTune Amplifier basics with Getting Started guides, videos, tutorials, webinars and technical articles.

Intel VTune Amplifier Tutorials

Tutorials guide a new user through basic VTune Amplifier features and walkthrough operations with a short sample. They provide an excellent foundation before you read the VTune Amplifier help.

The default installation location for the VTune Amplifier tutorials is <install-dir>/documentation/<locale>/tutorials.

Sample code is typically installed to <install-dir>/samples/<locale>/<programming_language>.

VTune Amplifier sample code and corresponding tutorials are also available at https://software.intel.com/en-us/product-code-samples

Release Notes

The Release Notes document contains the most up-to-date information about the product, including a product description, technical support, and known limitations and issues.

This document also contains system requirements for installing the product. Before installation, the Release Notes document is located at the root level (same level as the installation script/executable) of the installation download package.

This document is installed at: <install-dir>/documentation/<locale>/release_notes_amplifier_linux.pdf

Installation Guide

The Installation Guide contains basic installation instructions for VTune Amplifier and post-installation configuration instructions for the various drivers and collectors.

The latest Installation Guide can be found on the Intel® Developer Zone (Intel® DZ) website.

Intel VTune Amplifier Help

The help is the primary documentation for the VTune Amplifier. To view VTune Amplifier help, do the following:

  • From the product interface: Choose Intel VTune Amplifier XE 2017 Help from the Help menu, or click the Help button on the toolbar.
  • Outside the product interface: Open the index.htm file, which is installed at <install-dir>/documentation/<locale>/help.

Intel Processor Event Reference

This help provides reference information for Intel processor events used by the VTune Amplifier for hardware event-based sampling analysis. Most of this information is drawn from Intel processor information sources on the web. To access the Event Reference, choose Intel Processor Event Reference from the Help menu, or go to Intel VTune Amplifier XE 2017 Help > Reference > Intel Processor Events.

Command Line Help

You can access general help for the VTune Amplifier command line interface by entering one of the following commands:

  • amplxe-cl -help for help on basic action options

  • amplxe-cl -help <action-option> for help on a particular action option and its knobs

Web Resources

Legal Information

Intel, the Intel logo, VTune and Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.

* Other names and brands may be claimed as the property of others.

Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation in the United States and/or other countries.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission from Khronos.

© 2017 Intel Corporation