Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help
Use the Summary window as the starting point for performance analysis with the Intel® VTune™ Amplifier. To access this window, select the General Exploration viewpoint and click the Summary sub-tab in the result tab.
Depending on the analysis type, the Summary window provides the following application-level statistics in the General Exploration viewpoint:
You may click the Copy to Clipboard button to copy the content of the selected summary section to the clipboard.
The first section displays summary statistics on the overall application execution per hardware-related metric, measured in Pipeline Slots or Clockticks. Metrics are organized by execution categories. To view a metric description, hover over the metric's help icon. For example, hovering over the L1 Bound metric displays the metric description in a tooltip.
A flagged metric value signals a performance issue for the whole application execution. Hover over the flagged value to read the issue description.
You may use the performance issues identified by the VTune Amplifier as a baseline for comparison of versions before and after optimization. Your primary performance indicator is the Elapsed time value.
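As a rough illustration of using Elapsed time as the primary indicator, the sketch below compares two runs. The function names and the sample timings are hypothetical and are not part of the VTune Amplifier; the Elapsed time values would be read manually from the Summary window of each result.

```python
def speedup(elapsed_before_s: float, elapsed_after_s: float) -> float:
    """Return how many times faster the optimized run is.

    Both arguments are Elapsed time values in seconds, taken from the
    Summary window of the baseline and the optimized result (assumption:
    both runs used the same workload and the same machine).
    """
    if elapsed_before_s <= 0 or elapsed_after_s <= 0:
        raise ValueError("Elapsed time values must be positive")
    return elapsed_before_s / elapsed_after_s


def percent_reduction(elapsed_before_s: float, elapsed_after_s: float) -> float:
    """Return the reduction in Elapsed time as a percentage of the baseline."""
    if elapsed_before_s <= 0 or elapsed_after_s <= 0:
        raise ValueError("Elapsed time values must be positive")
    return (elapsed_before_s - elapsed_after_s) / elapsed_before_s * 100.0


# Hypothetical timings: 12.0 s before optimization, 8.0 s after.
ratio = speedup(12.0, 8.0)
saved = percent_reduction(12.0, 8.0)
```

A ratio above 1.0 means the optimized version finished sooner. Flagged metrics that disappear between the two results help explain why the Elapsed time changed.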
Grayed out metric values indicate that the data collected for this metric is unreliable. This may happen, for example, if the number of samples collected for PMU events is too low. In this case, when you hover over such an unreliable metric value, the VTune Amplifier displays a message:
You may either ignore this data, or rerun the collection with the data collection time, sampling interval, or workload increased.
All metric names are hyperlinks. Clicking such a hyperlink opens the Bottom-up window and sorts the data in the grid by the selected metric.
Explore the CPU Usage Histogram to see what percentage of the wall time a given number of CPUs were running simultaneously.
Use This | To Do This |
---|---|
Vertical bars | Hover over a bar to identify the amount of Elapsed time the application spent using the specified number of logical CPUs. |
Target Utilization | Identify the target CPU usage. This number is equal to the number of logical CPUs. Consider this number as your optimization goal. |
Average CPU Usage | Identify the average number of CPUs used over the entire run. It is calculated as CPU time / Elapsed time. CPU usage at any point in time cannot surpass the available number of logical CPUs. Even when the system is oversubscribed and there are more threads running than CPUs, the CPU usage is the same as the number of CPUs. Use this number as a baseline for your performance measurements. The closer this number is to the number of logical CPUs, the better, except when the CPU time is spent spinning. |
Utilization Indicator bar | Analyze how the various utilization levels map to the number of simultaneously utilized logical CPUs. Note: In the CPU Usage histogram, the VTune Amplifier treats the Spin and Overhead time as Idle CPU usage. Different analysis types may recognize Spin and Overhead time differently depending on the availability of call stack information. This may result in a difference in the graphical representation of CPU Usage per analysis type. |
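The histogram and the Average CPU Usage value described above can be approximated with a minimal sketch. It assumes a hypothetical input: a list with the number of logical CPUs busy during each fixed-length sampling interval. The function names are illustrative only. This is not how the VTune Amplifier computes its data internally, and it ignores the Spin and Overhead handling mentioned in the note.

```python
from collections import Counter


def cpu_usage_histogram(busy_cpus_per_interval, interval_s):
    """Map each simultaneous-CPU count to the Elapsed time spent at it.

    busy_cpus_per_interval: hypothetical samples, one per interval, each
    giving how many logical CPUs were running during that interval.
    interval_s: length of one interval in seconds.
    """
    counts = Counter(busy_cpus_per_interval)
    return {n: counts[n] * interval_s for n in sorted(counts)}


def average_cpu_usage(cpu_time_s, elapsed_s, logical_cpus):
    """Average CPU Usage = CPU time / Elapsed time.

    The result is capped at the number of logical CPUs, since usage
    cannot exceed the CPUs available even when oversubscribed.
    """
    if elapsed_s <= 0:
        raise ValueError("Elapsed time must be positive")
    return min(cpu_time_s / elapsed_s, logical_cpus)


# Hypothetical run on a 4-CPU machine: eight 0.5 s intervals.
samples = [1, 4, 4, 2, 4, 0, 4, 4]
interval = 0.5
histogram = cpu_usage_histogram(samples, interval)  # {0: 0.5, 1: 0.5, 2: 0.5, 4: 2.5}
elapsed = len(samples) * interval                   # 4.0 s
cpu_time = sum(samples) * interval                  # 11.5 s
average = average_cpu_usage(cpu_time, elapsed, 4)   # 2.875
```

In this made-up run the average of 2.875 CPUs falls short of the Target Utilization of 4, which corresponds to the time the histogram shows at 0, 1, and 2 CPUs.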
This section provides the following data:
Field | Description |
---|---|
Application Command Line | Path to the target application. |
Operating System | Operating system used for the collection. |
Computer Name | Name of the computer used for the collection. |
Result Size | Size of the result collected by the VTune Amplifier. |
Collection start time | Start time (in UTC format) of the external collection. Explore the Timeline pane to track the performance statistics provided by the custom collector over time. |
Collection stop time | Stop time (in UTC format) of the external collection. Explore the Timeline pane to track the performance statistics provided by the custom collector over time. |
CPU Information | |
Name | Name of the processor used for the collection. |
Frequency | Frequency of the processor used for the collection. |
Logical CPU Count | Logical CPU count for the machine used for the collection. |
Physical Core Count | Number of physical cores on the system. |
User Name | User launching the data collection. This field is available if you enabled the per-user event-based sampling collection mode during the product installation. |
GPU Information | |
Name | Name of the Graphics installed on the system. |
Vendor | GPU vendor. |
Driver | Version of the graphics driver installed on the system. |
Stepping | Microprocessor version. |
EU Count | Number of execution units (EUs) in the Render and GPGPU engine. This data is Intel® HD Graphics and Intel® Iris™ Graphics (further: Intel Graphics) specific. |
Max EU Thread Count | Maximum number of threads per execution unit. This data is Intel Graphics specific. |
Max Core Frequency | Maximum frequency of the Graphics processor. This data is Intel Graphics specific. |
Graphics Performance Analysis | GPU metrics collection is enabled on the hardware level. This data is Intel Graphics specific. Note: Some systems disable collection of extended metrics such as L3 misses, memory accesses, sampler busyness, SLM accesses, and others in the BIOS. On some systems you can set a BIOS option to enable this collection. The presence or absence of the option and its name are BIOS vendor specific. Look for the Intel® Graphics Performance Analyzers option (or similar) in your BIOS and set it to Enabled. |