Intel® VTune™ Amplifier 2017
Before you start identifying hotspots in your native Intel® Xeon Phi™ coprocessor application, do the following:
You need the following tools to try these tutorial steps yourself using the matrix sample application:
Intel® VTune™ Amplifier, including sample applications
Sampling driver, set up during the VTune Amplifier installation
If the VTune Amplifier could not install the driver, you cannot run the analysis and will see a warning message. See the online help for instructions on how to install the driver manually.
Intel® Manycore Platform Software Stack (Intel® MPSS). See Release Notes for more information.
tar file extraction utility
Intel® C++ Compiler installed on the host. See Release Notes for more information.
Acquire Intel VTune Amplifier
If you do not already have access to the VTune Amplifier, you can download an evaluation copy from http://software.intel.com/en-us/articles/intel-software-evaluation-center/.
Install and Set Up VTune Amplifier Sample Applications
Copy the matrix_vtune_amp_xe.tgz file from the <install_dir>/samples/<locale>/C++ directory to a writable directory or share on your system.
The default installation path for the VTune Amplifier XE is /opt/intel/vtune_amplifier_xe_version. For the VTune Amplifier for Systems, the default <install_dir> is:
Extract the sample from the .tgz file.
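The copy and extract steps above can be sketched as a small shell helper. This is a minimal sketch, not part of the product: the function name extract_sample and the destination directory are assumptions made for illustration, and the archive path must be filled in from your own installation.

```shell
# Hypothetical helper (not shipped with VTune Amplifier): copy the sample
# archive to a writable directory and unpack it there.
#   $1 = path to matrix_vtune_amp_xe.tgz
#   $2 = writable destination directory
extract_sample() {
    archive=$1
    dest=$2
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    cp "$archive" "$dest/" || return 1
    # -x extract, -z gunzip, -f archive file, -C unpack inside the destination
    tar -xzf "$dest/$(basename "$archive")" -C "$dest"
}

# Example call (placeholder paths, substitute your own):
# extract_sample "<install_dir>/samples/<locale>/C++/matrix_vtune_amp_xe.tgz" "$HOME/sample"
```

The helper fails early when the archive is missing, so a wrong <install_dir> is reported before anything is copied.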
Build the target on the host with full optimizations, which is recommended for performance analysis.
Browse to the linux directory inside the location where you extracted the sample code (this example assumes that location is /home/sample/matrix/linux). Make sure this directory contains the Makefile.
Set up the environment for Intel C++ Compiler:
source <path_to_compiler_bin>/compilervars.sh intel64
Build the code using the make command:
$ make mic
The matrix application is built as matrix.mic and stored in the matrix/linux directory.
This application is built with the OpenMP* library. To run the sample on the Intel Xeon Phi coprocessor, make sure to copy the OpenMP library to the card and set up the default path.
To communicate with the Intel Xeon Phi coprocessor cards, you may use any of the following mechanisms:
Ensure that the binary to analyze is copied to the Intel Xeon Phi coprocessor. You can do this by using scp, for example:
scp matrix.mic mic0:/tmp
You may add this command to your build scripts to copy the binary automatically after each recompilation. In this tutorial's scenario, the scp command is added to the Makefile, so the matrix application is built and automatically copied to the Intel Xeon Phi coprocessor.
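If you prefer a standalone script over a Makefile rule, the copy of the binary and of the OpenMP library could look like the sketch below. The function names run and deploy_to_card are hypothetical, and the OpenMP runtime file name (assumed here to be the coprocessor build of libiomp5.so from your compiler installation) is an assumption, so check your installation for the actual file. Setting DRY_RUN=1 prints the commands instead of running them, which lets you review them before touching the card.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: copy the binary and the OpenMP runtime to the card.
#   $1 = path to matrix.mic
#   $2 = path to the coprocessor build of the OpenMP runtime
#        (assumption: libiomp5.so from the compiler installation)
#   $3 = card host name (default: mic0)
#   $4 = remote directory (default: /tmp)
deploy_to_card() {
    binary=$1
    omp_lib=$2
    card=${3:-mic0}
    remote_dir=${4:-/tmp}
    run scp "$binary" "$card:$remote_dir" || return 1
    run scp "$omp_lib" "$card:$remote_dir" || return 1
}

# Example call (review the commands first, then unset DRY_RUN to execute):
# DRY_RUN=1; deploy_to_card matrix.mic <path_to_openmp_lib>/libiomp5.so
```

After the copy, the library search path on the card still has to include the remote directory, for example through LD_LIBRARY_PATH in the shell that launches the application.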
Run the application on the coprocessor using ssh and record the results to establish a performance baseline:
Note the execution time displayed at the bottom. For the matrix.mic executable in the figure above, the execution time is 30.466 seconds. Use this metric as a baseline against which you will compare subsequent runs of the application.
Run the application several times, noting the execution time for each run, and use the average time. This helps to minimize skewed results due to transient system activity.
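The averaging step can be done with a one-line awk computation. The sketch below assumes you have noted the execution time of each run by hand; the function name average_times is hypothetical.

```shell
# Print the arithmetic mean of the execution times (in seconds) passed as
# arguments, rounded to three decimal places.
average_times() {
    [ "$#" -gt 0 ] || { echo "usage: average_times T1 [T2 ...]" >&2; return 1; }
    printf '%s\n' "$@" | LC_ALL=C awk '{ sum += $1 } END { printf "%.3f\n", sum / NR }'
}

# Example call with the times noted from three runs:
# average_times <time_run_1> <time_run_2> <time_run_3>
```

For instance, average_times 30 31 32 prints 31.000. Use the resulting value, rather than a single run, as the baseline for later comparisons.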
If you experience a problem with permissions to run the commands, use sudo or root access.
Alternatively, you may create an ssh script to copy and launch your application on a card or use the micnativeloadex utility. For details, see the Preparing an Intel® Xeon Phi™ Coprocessor System for Analysis online help topic.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804