Intel® VTune™ Amplifier 2017
After building the sample application and collecting baseline performance data for it, rerun it under Intel® VTune™ Amplifier to discover which parts of the code the application spends the most time in. Advanced Hotspots analysis collects event and instruction pointer (IP) information to reveal evidence of a basic set of hardware issues, induced by the application code, that may be affecting its performance.
Prerequisites: You have created a project and specified your sample application as an Intel Xeon Phi coprocessor (native) target in the Analysis Target tab.
To run the analysis:
1. Click the Choose Analysis button on the right, or switch to the Analysis Type tab.
   VTune Amplifier automatically detects your target system configuration and displays the analysis types applicable to the Intel® Xeon Phi™ coprocessor.
2. From the analysis tree on the left, select the Algorithm Analysis > Advanced Hotspots analysis type.
   The Advanced Hotspots predefined configuration opens on the right.
3. Click the Start button on the right to run the analysis.
VTune Amplifier starts the matrix.mic application on the Intel Xeon Phi coprocessor card over an SSH connection. The application calculates a large matrix multiplication before exiting. Collection completes when the application exits or after a predefined interval, depending on how the collection run was configured. VTune Amplifier then enters its finalization process, in which data are coalesced, symbols are reconnected to their addresses, and certain data are cached to speed the display of results.
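To make the idea of reconnecting symbols to addresses more concrete, the following is a minimal sketch of how sampled instruction pointers could be attributed to functions and ranked as hotspots. It is an illustration only, not VTune Amplifier's actual implementation; the symbol table, addresses, and function names (main, multiply, init_arr) are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table: (start_address, size_in_bytes, name),
# sorted by start address. A real tool would read this from the binary.
SYMBOLS = [
    (0x1000, 0x200, "main"),
    (0x1200, 0x800, "multiply"),
    (0x1A00, 0x100, "init_arr"),
]
STARTS = [start for start, _, _ in SYMBOLS]


def resolve(ip):
    """Return the name of the function containing ip, or '[unknown]'."""
    # Find the last symbol whose start address is <= ip.
    i = bisect_right(STARTS, ip) - 1
    if i >= 0:
        start, size, name = SYMBOLS[i]
        if ip < start + size:
            return name
    return "[unknown]"


def hotspots(samples):
    """Count samples per function and return them most-sampled first."""
    return Counter(resolve(ip) for ip in samples).most_common()


if __name__ == "__main__":
    # Hypothetical instruction pointer samples gathered during a run.
    samples = [0x1204, 0x1350, 0x1010, 0x19FF, 0x1A10, 0x2000, 0x1300]
    for name, count in hotspots(samples):
        print(f"{name:12s} {count}")
```

Functions that collect the most samples are the ones the processor was executing most often when samples were taken, which is the basis for treating them as hotspots.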
To make sure the performance of the application is repeatable, go through the entire tuning process on the same system with as little other software running as possible.
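One way to check repeatability before tuning is to time several identical runs and look at how much the durations vary. The sketch below assumes the application can be launched from a command line on the measuring system; the placeholder command, the run count, and the helper names time_runs and variation are illustrative choices, not part of VTune Amplifier.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def variation(durations):
    """Coefficient of variation: sample standard deviation divided by mean."""
    return statistics.stdev(durations) / statistics.mean(durations)


if __name__ == "__main__":
    # Placeholder workload (an assumption); substitute the command that
    # launches your own application.
    cmd = [sys.executable, "-c", "sum(range(10**6))"]
    durations = time_runs(cmd)
    print("durations (s):", [round(d, 3) for d in durations])
    print("coefficient of variation:", round(variation(durations), 3))
```

A small coefficient of variation suggests the baseline is stable enough to compare against later runs; a large one suggests that other activity on the system is disturbing the measurements.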