Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help

Intel® Xeon Phi™ Coprocessor Analysis Workflow

Note

This type of analysis is supported by the Intel® VTune™ Amplifier XE only.

The following steps outline the basic workflow for analyzing an application running on Intel® Xeon Phi™ coprocessors based on the Intel® Many Integrated Core Architecture (Intel® MIC Architecture), or for performing a system-wide analysis. Depending on the coprocessor, you can run one of the predefined analysis types, such as Advanced Hotspots, Memory Access, General Exploration, or HPC Performance Characterization, or create a custom analysis type.

Prerequisites: Build the target on the host with full optimizations, which is recommended for performance analysis.

1. Prepare your coprocessor system for analysis

  • Install the sampling server and driver on an Intel Xeon Phi coprocessor card to be sampled.

  • For native application analysis, copy the binary to the Intel Xeon Phi coprocessor. For offload applications, no copying is required.

  • To communicate with the Intel Xeon Phi coprocessor cards, you may use any of the following mechanisms:

    • Mount an NFS share. See the NFS Mounting a Host Export topic in the Intel Manycore Platform Software Stack (MPSS) help for details.

    • Use existing SSH tools. Make sure to configure SSH to work in a password-less mode.
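The SSH-based variant of the preparation steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not part of VTune Amplifier: the card host name `mic0`, the remote directory `/tmp`, and all helper names are hypothetical and chosen for illustration. It relies only on the standard OpenSSH `ssh` and `scp` clients. The `BatchMode=yes` option makes `ssh` fail instead of prompting, which is a quick way to confirm that password-less mode is configured.

```python
import subprocess


def ssh_check_command(card="mic0"):
    """Build an ssh command that fails instead of prompting for a password."""
    # "mic0" is an assumed host name for the coprocessor card.
    return ["ssh", "-o", "BatchMode=yes", card, "true"]


def copy_command(local_binary, card="mic0", remote_dir="/tmp"):
    """Build an scp command that copies a native binary to the card."""
    # remote_dir is a hypothetical destination; adjust to your setup.
    return ["scp", local_binary, f"{card}:{remote_dir}/"]


def prepare_native_target(local_binary, card="mic0", remote_dir="/tmp",
                          run=subprocess.run):
    """Verify password-less SSH, then copy the binary to the card.

    With the default runner, a failing step raises
    subprocess.CalledProcessError because check=True is passed.
    """
    run(ssh_check_command(card), check=True)
    run(copy_command(local_binary, card, remote_dir), check=True)
```

For offload applications only the SSH check is relevant, because no binary needs to be copied to the card.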

3. Specify and configure your analysis target from the host system

4. Configure and run an analysis type

  • From the performance analysis tree in the Analysis Type window, choose and configure an analysis type. If you selected an Intel Xeon Phi coprocessor (native) or Intel Xeon Phi coprocessor (host launch) target system in the Analysis Target window, VTune Amplifier updates the analysis tree in the Analysis Type window to display only the analysis types supported for the Intel Xeon Phi coprocessor.

    Note

    For the Intel Xeon Phi coprocessor code-named Knights Corner, call stack data collection is supported only for the Intel Xeon Phi coprocessor (native) target type.

  • Click Start to run the analysis.
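The target and analysis type choices in this step can also be written down as a command line for scripted runs. The sketch below only assembles an argument list and does not launch anything. It assumes the `amplxe-cl` command-line tool with a `-target-system=mic-native:<card>` or `-target-system=mic-host-launch:<card>` option, and analysis type names such as `advanced-hotspots`. Treat these spellings as assumptions and verify them against `amplxe-cl -help collect` for your product version. The function name and its parameters are hypothetical.

```python
def collect_command(analysis, app_path, card=0, native=True, result_dir=None):
    """Assemble an assumed amplxe-cl collection command as an argument list.

    analysis   -- analysis type name, e.g. "advanced-hotspots" (assumed spelling)
    app_path   -- application path (on the card for a native target)
    card       -- coprocessor card index
    native     -- True for a native target, False for host launch
    result_dir -- optional directory for the collection result
    """
    target = "mic-native" if native else "mic-host-launch"
    cmd = ["amplxe-cl", "-collect", analysis,
           f"-target-system={target}:{card}"]
    if result_dir:
        cmd += ["-result-dir", result_dir]
    # Everything after "--" is the application to launch.
    return cmd + ["--", app_path]
```

For example, `collect_command("advanced-hotspots", "/tmp/a.out")` yields the arguments for a native run on card 0. Keeping the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.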

5. Open and interpret analysis results

Intel® VTune™ Amplifier generates a data collection result and opens it in the default viewpoint. Switch between the available viewpoints to identify the code regions that consumed the most CPU time and those that experienced potentially significant architectural bottlenecks.
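To illustrate what identifying the regions that took most of the CPU time means in practice, the following sketch ranks functions by CPU time and reports each one's share of the total. It is an illustration only and assumes the result data has been exported to CSV text with columns named `Function` and `CPU Time`. Those column names, the sample data, and the helper name are hypothetical.

```python
import csv
import io


def top_hotspots(csv_text, n=3, name_col="Function", time_col="CPU Time"):
    """Return the n rows with the largest CPU time.

    Each entry is (name, cpu_time, share_of_total). Column names are
    assumptions about the exported data, not a documented format.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    times = [(row[name_col], float(row[time_col])) for row in rows]
    total = sum(t for _, t in times)
    ranked = sorted(times, key=lambda item: item[1], reverse=True)[:n]
    return [(name, t, t / total if total else 0.0) for name, t in ranked]


# Hypothetical sample data for illustration only.
sample = "Function,CPU Time\ncompute,6.0\nio_wait,3.0\ninit,1.0\n"
for name, seconds, share in top_hotspots(sample, n=2):
    print(f"{name}: {seconds:.1f}s ({share:.0%} of CPU time)")
```

A function that accounts for a large share of the total is usually the first candidate to examine further.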
