Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help

Advanced Hotspots Analysis

Advanced Hotspots analysis is a fast and easy way to identify performance-critical code sections (hotspots). The periodic instruction pointer sampling performed by Intel® VTune™ Amplifier identifies code locations where an application spends more time than elsewhere. A function may consume a lot of time either because its code is slow or because it is called frequently. In either case, speeding up such functions should have a larger impact on overall application performance than optimizing code that runs rarely.
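The idea behind sampling-based hotspot detection can be illustrated with a minimal sketch. It is not VTune Amplifier code: the function `rank_hotspots` and the sample stream are hypothetical, and it assumes each sample has already been resolved to a function name and that each sample represents roughly one sampling interval of CPU time.

```python
from collections import Counter


def rank_hotspots(samples, interval_ms=1.0):
    """Rank functions by estimated CPU time.

    samples: iterable of function names, one entry per collected sample
             (hypothetical input, already resolved from instruction pointers).
    interval_ms: time between samples, in milliseconds.
    Returns (function, estimated_ms) pairs, hottest first.
    """
    counts = Counter(samples)
    # Each sample is assumed to stand for one sampling interval of CPU time.
    ranked = [(name, n * interval_ms) for name, n in counts.items()]
    # Sort by estimated time (descending), then by name for stable output.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical sample stream: 6 samples in compute, 3 in parse, 1 in log.
samples = ["compute"] * 6 + ["parse"] * 3 + ["log"]
for name, ms in rank_hotspots(samples, interval_ms=1.0):
    print(f"{name}: {ms:.1f} ms")
```

A function shows up at the top of such a list whether it is slow per call or simply called often; the sample count alone does not distinguish the two cases.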

VTune Amplifier creates a list of functions in your application ordered by the amount of time spent in each function. By default, Advanced Hotspots analysis does not capture function call stacks as the hotspots are collected, but it can sample all processes on the system. This type of analysis uses event-based sampling collection and analyzes all the processes running on your system during collection, providing CPU time data on whole-system performance.

You can still analyze stacks for your application modules by selecting a collection level that includes stack analysis in the Advanced Hotspots pane. For example, selecting the Hotspots, call counts and stacks collection level extends the Advanced Hotspots analysis with performance, parallelism, and power consumption data attributed to execution paths.

To use the Advanced Hotspots analysis, explore the following sections: Configuration Options, Viewpoints, and What's Next.

Configuration Options

To configure options for the Advanced Hotspots analysis:

  1. Click the New Analysis button on the Intel® VTune™ Amplifier toolbar.

    The New Amplifier Result tab opens with the Analysis Type window active.

  2. From the analysis tree on the left pane, select Algorithm Analysis > Advanced Hotspots.

    The analysis configuration pane opens on the right.

  3. Configure the following options:

    CPU sampling interval, ms field

    Specify an interval (in milliseconds) between CPU samples.

    Possible values: 0.01-1000.

    The default value is 1.

    Collection Level options

    Select the level of detail provided by event-based sampling collection. More detailed collection levels cause higher overhead.

    • Hotspots, call counts and stacks
    • Hotspots
    • Hotspots and stacks
    • Hotspots, call counts, loop trip counts and stacks

    The default value is Hotspots.

    Event mode drop-down menu

    Limit event-based sampling collection to OS or USER mode.

    • All
    • OS
    • USER

    The default value is All.

    Analyze user tasks, events, and counters check box

    Analyze the tasks, events, and counters specified in your code via the ITT API. This option causes a higher overhead and increases the result size.

    The default value is false.

    Analyze OpenMP regions check box

    Instrument and analyze OpenMP regions to detect inefficiencies such as imbalance, lock contention, or overhead from scheduling, reduction, and atomic operations.

    The default value is false.

    Details button

    Expand or collapse a section listing the default non-editable settings used for this analysis type. To modify these settings, create a custom configuration by right-clicking the analysis entry in the analysis tree and selecting Copy from Current from the context menu. VTune Amplifier creates an editable copy of this analysis type configuration and places it under the Custom Analysis branch in the analysis tree.

    Note

    You may generate the command line for this configuration using the Command Line... button at the bottom.
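The options above and their documented defaults can be summarized in a small sketch. This is an illustration only, not a VTune Amplifier API: the class `AdvancedHotspotsConfig`, its field names, and the `samples_per_second` helper are hypothetical. Only the value ranges, choices, and defaults come from the option descriptions above.

```python
from dataclasses import dataclass

# Choices as listed in the option descriptions above.
COLLECTION_LEVELS = (
    "Hotspots",
    "Hotspots and stacks",
    "Hotspots, call counts and stacks",
    "Hotspots, call counts, loop trip counts and stacks",
)
EVENT_MODES = ("All", "OS", "USER")


@dataclass
class AdvancedHotspotsConfig:
    """Hypothetical container mirroring the documented defaults."""
    sampling_interval_ms: float = 1.0
    collection_level: str = "Hotspots"
    event_mode: str = "All"
    analyze_user_tasks: bool = False
    analyze_openmp_regions: bool = False

    def __post_init__(self):
        # Reject values outside the documented ranges and choices.
        if not 0.01 <= self.sampling_interval_ms <= 1000:
            raise ValueError("sampling interval must be within 0.01-1000 ms")
        if self.collection_level not in COLLECTION_LEVELS:
            raise ValueError(f"unknown collection level: {self.collection_level}")
        if self.event_mode not in EVENT_MODES:
            raise ValueError(f"unknown event mode: {self.event_mode}")

    def samples_per_second(self):
        """Approximate sample rate implied by the interval (per busy CPU)."""
        return 1000.0 / self.sampling_interval_ms


config = AdvancedHotspotsConfig()
print(config.collection_level, config.event_mode, config.samples_per_second())
```

The sample-rate helper makes the trade-off visible: a shorter interval yields more samples per second, which gives finer resolution but also more data to collect and store.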

Viewpoints

You can choose to view Advanced Hotspots analysis results in any of the following viewpoints:

Viewpoint

Description

Hardware Events

Displays statistics of monitored hardware events: estimated count and/or the number of samples collected. Use this view to identify code regions (modules, functions, code lines, and so on) with the highest activity for an event of interest.

Hardware Issues

Helps identify where the application is not making the best use of available hardware resources. This viewpoint displays metrics derived from hardware performance counters. Hover over the highlighted metrics values in the grid to read why the extreme value might represent a performance problem.

Hotspots

Helps identify hotspots, that is, code regions in the application that consume a lot of CPU time.

Memory Usage

Helps you understand how effectively your application uses memory resources and identify potential memory access issues, such as excessive access to remote memory on NUMA platforms or reaching the DRAM or Intel® QuickPath Interconnect (Intel QPI) bandwidth limit. It provides various performance metrics for both the application code and memory objects arrays.
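The relationship between the two statistics named for the Hardware Events viewpoint (number of samples and estimated count) can be sketched as follows. This assumes the usual event-based sampling model, in which one sample is taken every fixed number of event occurrences (often called the sample-after value). The function names, region names, and numbers below are hypothetical and are not taken from VTune Amplifier output.

```python
def estimate_event_count(sample_count, sample_after_value):
    """Estimate event occurrences from the number of samples collected.

    Assumes one sample is recorded per `sample_after_value` occurrences.
    """
    return sample_count * sample_after_value


def rank_regions(samples_by_region, sample_after_value):
    """Return (region, estimated_count) pairs, most active region first."""
    estimates = [
        (region, estimate_event_count(n, sample_after_value))
        for region, n in samples_by_region.items()
    ]
    estimates.sort(key=lambda item: (-item[1], item[0]))
    return estimates


# Hypothetical per-function sample counts for one monitored event.
samples_by_region = {"solve": 150, "load_input": 40, "write_output": 10}
for region, count in rank_regions(samples_by_region, sample_after_value=2_000_000):
    print(f"{region}: ~{count:,} events")
```

Because the estimate is a product of a sample count and a fixed period, regions with very few samples carry a large relative uncertainty and are best treated as rough indications.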

These viewpoints may include the following windows:

What's Next

You can go from the hotspots to the source code. View the source code containing the hotspots and modify your code to remove bottlenecks and improve the performance of your application.

Information provided by Advanced Hotspots analysis is important for tuning serial applications and remains useful for tuning the serial sections of parallel applications. For algorithm tuning, you may also run the Basic Hotspots analysis to analyze the call flow of the application, or run the Concurrency analysis to estimate the effectiveness of the parallel algorithms you use. For Intel® Xeon Phi™ coprocessor analysis, you may run the General Exploration analysis with additional metrics that help triage hardware issues in programs running on the coprocessor.
