Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help
To run an analysis from the command line, do the following:
For a list of available predefined analysis types, enter:
amplxe-cl -help collect
Intel® VTune™ Amplifier displays all collection options and lists the available predefined analysis types:
Analysis Type | Description |
---|---|
hotspots | Analyze application flow and identify sections of code that take a long time to execute (hotspots). |
advanced-hotspots | Extend the hotspots analysis by collecting call stacks, context switch and statistical call count data as well as analyzing the CPI (Cycles Per Instruction) metric. |
concurrency | Collect data on how an application is using available logical CPU cores, discover where parallelism is incurring synchronization overhead, and identify potential candidates for parallelization. |
locksandwaits | Identify where an application is waiting on synchronization objects or I/O operations, and discover how these waits affect the application performance. |
hpc-performance | Identify opportunities to optimize CPU, memory, and FPU utilization for compute-intensive or throughput applications. The HPC Performance Characterization analysis type is a starting point for understanding the performance landscape of your application. Use this analysis type to improve application performance by increasing the number of floating-point operations per second (GFLOPS) and reducing the overall application run time. The analysis collects data related to CPU, memory, and FPU utilization. Additional scalability metrics are available for applications that use OpenMP* or MPI runtime libraries. |
general-exploration | Collect hardware events for analyzing a typical client application. This analysis calculates a set of predefined ratios used for the metrics and facilitates identifying hardware-level performance problems. |
memory-access | Identify memory-related issues, such as NUMA problems and bandwidth-limited accesses, and attribute performance events to memory objects (data structures). This attribution relies on instrumenting memory allocations and deallocations and on reading static and global variables from symbol information. |
sgx-hotspots | Analyze hotspots inside security enclaves for systems with the Intel Software Guard Extensions (Intel SGX) feature enabled. This analysis type uses the INST_RETIRED.PREC_DIST hardware event that emulates precise clockticks and helps identify performance-critical program units inside enclaves. |
tsx-exploration | Collect events that help understand Intel Transactional Synchronization Extensions (Intel TSX) behavior and causes of transactional aborts. |
tsx-hotspots | Monitor the UOPS_RETIRED.ALL_PS hardware event that emulates precise clockticks and identify performance-critical program units inside transactions. |
cpugpu-concurrency | Explore code execution on the various CPU and GPU cores in your system, correlate CPU and GPU activity and identify whether your application is GPU or CPU bound. |
gpu-hotspots | Identify Graphics Processing Unit (GPU) tasks with high GPU utilization and estimate the effectiveness of this utilization. This analysis type is intended for analysis of applications that use a GPU for rendering, video processing, and computations with explicit support of Intel® Media SDK and OpenCL™ software technology. |
disk-io | Monitor utilization of the disk subsystem, CPU and processor buses. This analysis type uses the hardware event-based sampling collection and system-wide Ftrace* collection to provide a consistent view of the storage sub-system combined with hardware events and an easy-to-use method to match user-level source code with I/O packets executed by the hardware. |
system-overview | Evaluate general behavior of Linux* or Android* target systems and correlate power and performance metrics with IRQ handling. Note: The System Overview analysis is supported only with the Intel® VTune™ Amplifier for Systems. |
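When scripting a collection, it can help to check a chosen analysis type against the names in the table above before invoking the tool. The following is a minimal sketch: `is_known_type` is a hypothetical helper, not part of `amplxe-cl`, and the list is copied from this table, so it may not match the analysis types available in your installation (use `amplxe-cl -help collect` for the authoritative list).

```shell
# Analysis type names copied from the table above; your installation may differ.
known_types="hotspots advanced-hotspots concurrency locksandwaits hpc-performance general-exploration memory-access sgx-hotspots tsx-exploration tsx-hotspots cpugpu-concurrency gpu-hotspots disk-io system-overview"

# Hypothetical helper: succeeds if $1 is one of the names listed above.
is_known_type() {
    for t in $known_types; do
        if [ "$t" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

if is_known_type "memory-access"; then
    echo "memory-access is a documented analysis type"
fi
```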
For a list of available custom analysis types, enter:
amplxe-cl -help collect-with
Intel® VTune™ Amplifier displays all collection options and provides a list of available collection types that can be used for custom analysis:
Collector | Description |
---|---|
runsa | Profile your application using the counter overflow feature of the Performance Monitoring Unit (PMU). |
runss | Profile the application execution and take snapshots of how that application utilizes the processors in the system. The collector interrupts a process, collects the value of all active instruction addresses and captures a calling sequence for each of these samples. |
To run a predefined performance analysis from the command line, enter:
amplxe-cl -collect <analysis_type> [-knob <knobName=knobValue>] [--] <target>
where
<analysis_type> is the type of analysis to run. To see the list of available analysis types, enter:
amplxe-cl -help collect
-knob is a configuration option that modifies the analysis
<knobName=knobValue> is the name of the specified knob and its value
<target> is the path and name of the application to analyze
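As an illustration of the syntax above, the sketch below assembles a hotspots collection command and runs it only if `amplxe-cl` is found on the PATH. The target `./my_app` is a hypothetical application, and the `sampling-interval` knob with a value of 5 is an assumption; check the knobs reported by the help for your analysis type before relying on it.

```shell
analysis_type="hotspots"
target="./my_app"            # hypothetical application path

# Assemble the command following the documented syntax:
#   amplxe-cl -collect <analysis_type> [-knob <knobName=knobValue>] [--] <target>
# The knob name and value are assumptions for illustration.
cmd="amplxe-cl -collect $analysis_type -knob sampling-interval=5 -- $target"
echo "$cmd"

# Run the collection only when the tool is installed.
if command -v amplxe-cl >/dev/null 2>&1; then
    $cmd
fi
```

Building the command in a variable first makes it easy to log or review the exact invocation before any data is collected.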
To run a custom analysis from the command line, enter:
amplxe-cl -collect-with <collection_type> [-knob <knobName=knobValue>] [--] <target>
where
<collection_type> is the collector to use for the custom analysis. To see the list of available collection types, enter:
amplxe-cl -help collect-with
-knob is an option that configures the analysis
<knobName=knobValue> is the name of the specified knob and its value
<target> is the path and name of the application to analyze
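A custom analysis follows the same pattern. The sketch below uses the `runsa` collector with an `event-config` knob listing two hardware events. The target `./my_app` is hypothetical, and the knob name and event names are assumptions: available events depend on the processor, so verify them against the help output for your system.

```shell
collection_type="runsa"
target="./my_app"            # hypothetical application path

# Hardware events to sample; names are assumptions and are processor-specific.
events="CPU_CLK_UNHALTED.THREAD,INST_RETIRED.ANY"

# Assemble the command following the documented syntax:
#   amplxe-cl -collect-with <collection_type> [-knob <knobName=knobValue>] [--] <target>
cmd="amplxe-cl -collect-with $collection_type -knob event-config=$events -- $target"
echo "$cmd"

# Run the collection only when the tool is installed.
if command -v amplxe-cl >/dev/null 2>&1; then
    $cmd
fi
```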
After collecting performance results for your target, you can view the results in the GUI or generate a formatted analysis report.
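For the report path, a collection and a follow-up report might be chained as in the sketch below. It assumes the `-result-dir` and `-report` options behave as described in the `amplxe-cl` help; the result directory name `r001hs` and the target `./my_app` are hypothetical.

```shell
result_dir="r001hs"          # hypothetical result directory name
target="./my_app"            # hypothetical application path

# Collect hotspots data into a named result directory, then summarize it.
collect_cmd="amplxe-cl -collect hotspots -result-dir $result_dir -- $target"
report_cmd="amplxe-cl -report summary -result-dir $result_dir"

echo "$collect_cmd"
echo "$report_cmd"

# Generate the report only if the tool is installed and the collection succeeds.
if command -v amplxe-cl >/dev/null 2>&1; then
    $collect_cmd && $report_cmd
fi
```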