Intel® Advisor provides two tools to help ensure your Fortran, C, and C++ applications realize their full performance potential on modern processors, such as Intel® Xeon Phi™ processors:
  • Vectorization Advisor is a vectorization optimization tool that lets you identify loops that will benefit most from vectorization, identify what is blocking effective vectorization, explore the benefit of alternative data reorganizations, and increase the confidence that vectorization is safe.

  • Threading Advisor is a threading design and prototyping tool that lets you analyze, design, tune, and check threading design options without disrupting your normal development.

Intel® Advisor is available as part of the:
  • Intel® Parallel Studio XE Professional Edition

  • Intel® Parallel Studio XE Cluster Edition

If you do not already have access to Intel® Advisor, download an evaluation copy from http://software.intel.com/en-us/articles/intel-software-evaluation-center/. (To get more benefit from the Vectorization Advisor Survey Report, use version 15.0 or higher of an Intel compiler; an evaluation copy is available from the same location.)

Vectorization Advisor

Key Features

Key Vectorization Advisor features include the following:
  • Survey Report - Offers integrated compiler report data and performance data all in one place. Use the Survey Report to help identify:
    • Where vectorization will pay off the most

    • If vectorized loops are providing benefit, and if not, why not

    • Un-vectorized and under-vectorized loops, and the estimated expected performance gain of vectorization or better vectorization

    • How data accessed by vectorized loops is organized and the estimated expected performance gain of reorganization

    The Survey Report also provides:
    • Friendly, code-specific recommendations for how to fix vectorization issues

    • Quick visibility into source code and assembly code

  • Trip Counts analysis - Dynamically identifies the number of times loops are invoked and execute (sometimes called call count/loop count and iteration count respectively). Use this added information in the Survey Report to make better decisions about your vectorization strategy for particular loops, as well as optimize already-parallel loops.

  • Dependencies Report - For safety purposes, the compiler is often conservative when assuming data dependencies. Use a Dependencies-focused Refinement Report to check for real data dependencies in loops the compiler did not vectorize because of assumed dependencies. If real dependencies are detected, the analysis can provide additional details to help resolve the dependencies. Your objective: Identify and better characterize real data dependencies that could make forced vectorization unsafe.

  • Memory Access Patterns (MAP) Report - Use a MAP-focused Refinement Report to check for various memory issues, such as non-contiguous memory accesses and unit stride vs. non-unit stride accesses. Your objective: Eliminate issues that could lead to significant vector code execution slowdown or block automatic vectorization by the compiler.

Prerequisites

To build applications that produce the most accurate and complete Vectorization Advisor analysis results, build an optimized binary of your application in release mode using these settings:

To Do This: Optimal C/C++ Settings
  • Request full debug information (compiler and linker): -g

  • Request moderate optimization: -O2 or higher

  • Produce compiler diagnostics (necessary for version 15.0 of the Intel compiler; unnecessary for version 16.0 and higher): -qopt-report=5

  • Enable vectorization: -vec

  • Enable SIMD directives: -simd

  • Enable generation of multi-threaded code based on OpenMP* directives: -qopenmp

To Do This: Optimal Fortran Settings
  • Request full debug information (compiler and linker): -g

  • Request moderate optimization: -O2 or higher

  • Produce compiler diagnostics (necessary for version 15.0 of the Intel compiler; unnecessary for version 16.0 and higher): -qopt-report=5

  • Enable vectorization: -vec

  • Enable SIMD directives: -simd

  • Enable generation of multi-threaded code based on OpenMP* directives: -qopenmp

In addition:
  • Verify your application runs before trying to analyze it with the Intel Advisor.

  • Make sure you run the Intel Advisor in the same environment as your application.

Set Up Environment

Do one of the following to set up your environment.
  • Run one of the following source commands:
    • For csh/tcsh users: source <advisor-install-dir>/advixe-vars.csh

    • For bash users: source <advisor-install-dir>/advixe-vars.sh

    The default installation path, <advisor-install-dir>, is below:
    • /opt/intel/ for root users

    • $HOME/intel/ for non-root users

  • Add <advisor-install-dir>/bin32 or <advisor-install-dir>/bin64 to your path.

  • Run the <parallel-studio-install-dir>/psxevars.csh or <parallel-studio-install-dir>/psxevars.sh command. The default installation path, <parallel-studio-install-dir>, is below:
    • /opt/intel/ for root users

    • $HOME/intel/ for non-root users

Get Started

Follow these steps (white blocks are optional) to get started using the Vectorization Advisor in the Intel Advisor.
[Figure: Vectorization Advisor workflow]

Launch the Intel Advisor

Run the advixe-gui command.

Manage Project

  1. Choose File > New > Project… (or click New Project… in the Welcome page) to open the Create a Project dialog box.

  2. Supply a name and location for your project, then click the Create Project button to open the Project Properties dialog box.

  3. On the left side of the Analysis Target tab, ensure the Survey Hotspots Analysis type is selected.

  4. Set the appropriate parameters. (Setting the binary/symbol search and source search directories is optional for the Vectorization Advisor.)

  5. Click the OK button to close the Project Properties dialog box.

Tip:
  • If you plan to run other vectorization Analysis Types, set parameters for them now, if possible.

  • If possible, use the Inherit settings from Survey Hotspots Analysis Type checkbox for other Analysis Types.

  • The Survey Trip Counts Analysis type has similar parameters to the Survey Hotspots Analysis type.

  • The Dependencies Analysis and Memory Access Patterns Analysis types consume more resources than the Survey Hotspots Analysis type. If these Refinement analyses take too long, consider decreasing the workload.

  • Select Track stack variables in the Dependencies Analysis type to detect all possible dependencies.

  • When necessary, click the tab at the top of the Workflow pane to switch between the Vectorization Workflow and Threading Workflow.

Run Survey Analysis

Under Survey Target in the Vectorization Workflow, click the Run analysis control to collect Survey data while your application executes.

After the Intel Advisor collects the data, it displays a Survey Report similar to the following:


[Figure: Intel Advisor Survey Report]

There are many controls available to help you focus on the data most important to you, including the following:

  1. Click the various Filter controls (buttons and drop-down lists) to temporarily limit displayed data based on your criteria.

  2. Click the Search control to search for specific data.

  3. Click the Expand/Collapse controls to show/hide sets of columns.

  4. Click a loop data row in the top of the Survey Report to display more data specific to that loop in the bottom of the Survey Report. Double-click a loop data row to display a Survey Source window.

  5. Click a checkbox to mark a loop for deeper analysis.

  6. If present, click the Recommendations control to display code-specific how-can-I-fix-this-issue? information in the Recommendations pane.

  7. If present, click the Compiler diagnostic details control to display code-specific how-can-I-fix-this-issue? information in the Compiler Diagnostic Details pane.

  8. Click the control to show/hide the Workflow pane.

  9. Click the control to save a result snapshot you can view any time.

Run Trip Counts Analysis

This step is optional.

Before running a Trip Counts analysis, make sure you set the appropriate Project Properties for the Survey Trip Counts Analysis type. (Use the same application, but a smaller input data set if possible.)

Under Find Trip Counts in the Vectorization Workflow, click the Run analysis control to collect Trip Counts data while your application executes.

After the Intel Advisor collects the data, it adds a Trip Counts column set to the Survey Report. Median data is shown by default. Min, Max, Call Count, and Iteration Duration data are shown when the column set is expanded.

Tip:
Use Trip Counts data to:
  • Detect loops with too-small trip counts and trip counts that are not a multiple of vector length.

  • Analyze parallelism granularity more deeply.
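To see why a trip count that is not a multiple of the vector length matters, the following C sketch splits a trip count into full vector iterations and leftover remainder iterations. The `trip_split` type and `split_trip_count` function are hypothetical illustrations, not part of Intel Advisor, and the sketch ignores any peeling the compiler may add for alignment.

```c
#include <assert.h>

/* Hypothetical helper (not an Intel Advisor API): shows how a trip count
   divides into full vector iterations and leftover remainder iterations. */
typedef struct {
    long vector_iterations;    /* iterations that process a full vector */
    long remainder_iterations; /* elements left over for the remainder loop */
} trip_split;

trip_split split_trip_count(long trip_count, long vector_length)
{
    assert(trip_count >= 0 && vector_length > 0);
    trip_split s;
    s.vector_iterations = trip_count / vector_length;
    s.remainder_iterations = trip_count % vector_length;
    return s;
}
```

For example, a median trip count of 10 with a vector length of 8 gives one full vector iteration and two remainder elements, so a fifth of the elements are processed outside the vector body.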

Investigate Loops

The Survey Report provides a wealth of information, including:
  • Key information from the Intel compiler vectorization and optimization reports

  • Source and assembly code for the data row selected at the top of the report

  • Code-specific how-can-I-fix-this-issue? Recommendations and Compiler Diagnostic Details for the data row selected at the top of the report

Tip:
  • Pay particular attention to the hottest loops in terms of Self Time and Total Time. Optimizing these loops provides the most benefit. Innermost loops and loops near innermost loops are often good candidates for vectorization. Outermost loops with significant Total Time are often good candidates for parallelization with threads.

  • Check if the best possible Vector ISA is used by your application, or if there are heavy operations required for vectorization that might be a problem, such as masking or gather operations.

  • Compare the modeled Gain Estimate with the gain expected from the Vector Instruction Set to ensure you are likely to get the optimal speed-up. For example: AVX2 processing of 32-bit integers should give an 8x performance gain. If the Gain Estimate is much lower than the expected gain for the Vector ISA, consider optimizing an already vectorized loop by eliminating heavy vector operations, aligning data, or rewriting the loop to remove control-flow clauses.

  • A vectorized loop may not achieve the best performance when the compiler peels a source loop into peeled and remainder loops. If the peeled or remainder loop takes a significant portion of loop execution time, aligning data or changing the number of loop iterations may help.
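The 8x figure in the Gain Estimate tip follows from dividing the vector register width by the element width. A minimal C sketch of that arithmetic is shown here; the function names are hypothetical, and the ideal figure assumes every vector lane does useful work, which real loops rarely achieve.

```c
#include <assert.h>

/* Ideal number of elements processed per vector instruction:
   register width in bits divided by element width in bits. */
int ideal_vector_lanes(int register_bits, int element_bits)
{
    assert(register_bits > 0 && element_bits > 0);
    return register_bits / element_bits;
}

/* Fraction of the ideal gain that a reported Gain Estimate reaches. */
double gain_efficiency(double gain_estimate, int lanes)
{
    assert(lanes > 0);
    return gain_estimate / (double)lanes;
}
```

With 256-bit AVX2 registers and 32-bit integers, `ideal_vector_lanes(256, 32)` is 8. A Gain Estimate of 4x would then be half of the ideal, which suggests the loop is worth a closer look.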

After you investigate the data in the Survey Report, you have several choices:

If your investigation shows that all loops are vectorizing properly and performance is satisfactory:

You are done! Congratulations!

If your investigation shows that one or more loops are not vectorizing properly and performance is unsatisfactory:
  1. Improve application performance using Recommendations and Compiler Diagnostic Details information to guide your efforts.

  2. Rebuild your modified code.

  3. Run another Survey analysis to verify all loops are vectorizing properly and performance is satisfactory.

If you need more information (because, for example, there is an assumed dependency compiler diagnostic, or there are expensive memory instructions like gathers, inserts, or shuffles), continue your investigation by:
  1. Marking one or more loops for deeper analysis

  2. Defining the appropriate Project Properties for the Refinement analysis you plan to run

  3. Running one or more Refinement analyses

If this further investigation shows there is room for improvement:
  1. Make the improvements.

  2. Rebuild your modified code.

  3. Run another Survey analysis to verify your application still runs correctly and all test cases pass, all loops are vectorizing properly, and performance is satisfactory.

Otherwise, you are done!

Run Dependencies Analysis

This step is optional.

Before running a Dependencies analysis, make sure you:
  • Set the appropriate Project Properties for the Dependencies Analysis type. (Use the same application, but a smaller input data set if possible. And select Track stack variables to detect all possible dependencies.)

  • Mark one or more un-vectorized loops for deeper analysis in the Survey Report.

Under Check Dependencies in the Vectorization Workflow, click the Run analysis control to collect Dependencies data while your application executes.

After the Intel Advisor collects the data, it displays a Dependencies-focused Refinement Report similar to the following:


[Figure: Intel Advisor Dependencies Report]

There are many controls available to help you focus on the data most important to you, including the following:

  1. To display more information in the Dependencies Report about a loop you selected for deeper analysis, click the associated data row. To choose a loop of interest to display in the Dependencies Source window, double-click the associated data row.

  2. To display instruction addresses and code snippets for associated code locations in the Code Locations pane, click a data row. To choose a problem of interest to display in the Dependencies Source window, right-click a data row, then choose View Source. To open your default editor in another tab/window, right-click a data row, then choose Edit Source.

  3. To choose a code location of interest to display in the Dependencies Source window, do one of the following:
    • Click a data row.

    • Right-click a data row, then choose View Source.

    To open your default editor in another tab/window, right-click a data row, then choose Edit Source.

  4. Use the Filter pane to:
    • Temporarily limit the items displayed in the Problems and Messages pane by clicking filter criteria in one or more filter categories.

    • Deselect filter criteria in one filter category, or deselect filter criteria in all filter categories.

    • Sort all filter criteria by name in ascending alphabetical order or by count in descending numerical order. (You cannot change the order in which filter categories are presented.)

  5. To populate these columns and the Memory Access Patterns Report with data, run a Memory Access Patterns analysis.

  6. Click the control to show/hide the Workflow pane.

Depending on what the Dependencies Report shows, do one or more of the following:
  • If there is no real dependency in the loop for the given workload, use one of the following to tell the compiler it is safe to vectorize:
    • #pragma simd ICL/ICC/ICPC directive, or #pragma omp simd OpenMP* 4.0 standard, or !DIR$ SIMD or !$OMP SIMD IFORT directive to ignore all dependencies in the loop

    • #pragma ivdep ICL/ICC/ICPC directive or !DIR$ IVDEP IFORT directive to ignore only vector dependencies (which is safest, but less powerful in certain cases)

    • restrict keyword

  • If there is an anti-dependency (often called a write-after-read dependency or WAR), enable vectorization using the #pragma simd vectorlength(k) ICL/ICC/ICPC directive or !DIR$ SIMD VECTORLENGTH(k) IFORT directive, where k is smaller than the distance between dependent items in the anti-dependency.

  • If there is a reduction in the loop, enable vectorization using the #pragma omp simd reduction(operator:list) ICL/ICC/ICPC directive or !$OMP SIMD REDUCTION(operator:list) IFORT directive.

  • Rewrite code to remove dependencies.
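As an illustration of the directive-based options above, the following C sketch uses the `restrict` keyword and the OpenMP* `#pragma omp simd` forms. It is a sketch under stated assumptions, not code from Intel Advisor: the function names `scale_add` and `sum_array` are hypothetical, and a compiler that is not asked to honor OpenMP SIMD directives typically ignores the pragmas and compiles the loops as ordinary scalar code.

```c
#include <assert.h>
#include <stddef.h>

/* restrict promises the compiler that x and y do not overlap, and
   omp simd asserts that the loop is safe to vectorize. Use them only
   after a Dependencies analysis (or a code review) shows there is no
   real dependency. */
void scale_add(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert((x != NULL && y != NULL) || n == 0);
    #pragma omp simd
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* A sum is a reduction: every iteration updates the same variable.
   The reduction clause lets the compiler keep per-lane partial sums
   and combine them after the loop. */
float sum_array(size_t n, const float *x)
{
    assert(x != NULL || n == 0);
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++) {
        sum += x[i];
    }
    return sum;
}
```

Because these directives override the compiler's conservative assumptions, a wrong promise (for example, passing overlapping arrays to `scale_add`) can silently produce incorrect results.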

After you finish making improvements:
  1. Run a MAP analysis if desired.

  2. Rebuild your modified code.

  3. Run another Survey analysis to verify your application still runs correctly and all test cases pass, all loops are vectorizing properly, and performance is satisfactory.

Run a Memory Access Patterns (MAP) Analysis

This step is optional.

Before running a MAP analysis, make sure you:
  • Set the appropriate Project Properties for the Memory Access Patterns Analysis type. (Use the same application, but a smaller input data set if possible.)

  • Mark one or more loops for deeper analysis in the Survey Report.

Under Check Memory Access Patterns in the Vectorization Workflow, click the Run analysis control to collect MAP data while your application executes.

After the Intel Advisor collects the data, it displays a MAP-focused Refinement Report similar to the following:

[Figure: Intel Advisor Memory Access Patterns (MAP) Report]

After you finish making improvements:
  1. Rebuild your modified code.

  2. Run another Survey analysis to verify your application still runs correctly and all test cases pass, all loops are vectorizing properly, and performance is satisfactory.

Tip:

Double-click source lines at the bottom of the report to get a more detailed source and assembly access pattern report where stride information is provided at the instruction level.
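The difference between unit stride and non-unit stride accesses that a MAP analysis checks for can be illustrated with two data layouts. In this C sketch the type and function names are hypothetical; it only shows the access patterns and makes no claim about how Intel Advisor reports them.

```c
#include <assert.h>
#include <stddef.h>

/* Array of structures: consecutive x values are sizeof(point_aos) bytes
   apart, so a loop over x reads memory with a non-unit stride. */
typedef struct { float x, y, z; } point_aos;

float sum_x_aos(const point_aos *p, size_t n)
{
    assert(p != NULL || n == 0);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += p[i].x;   /* stride of 3 floats between accesses */
    }
    return sum;
}

/* Structure of arrays: all x values are contiguous, so the same loop
   reads memory with unit stride. */
typedef struct { const float *x, *y, *z; } points_soa;

float sum_x_soa(const points_soa *p, size_t n)
{
    assert(p != NULL && (p->x != NULL || n == 0));
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += p->x[i];  /* stride of 1 float between accesses */
    }
    return sum;
}
```

Both functions compute the same result; reorganizing data from the first layout to the second is one example of the data reorganizations the Survey Report estimates.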

Troubleshooting/FAQ

Also, see https://software.intel.com/en-us/intel-advisor-xe-support/faq.

To Do This: Optimal C/C++ Settings
  • Retrieve better compiler diagnostics: Disable Interprocedural Optimization (IPO) with -no-ipo

  • Address any issues with source line matching: Do one of the following:
    • Raise the debug level: -debug inline-debug-info

    • Temporarily disable inlining: -ip-no-inlining

  • Experiment with generating code for different instruction sets if your application does not appear to use the latest vector instructions: -xHost, -xSSE4.2, -xAVX, -axAVX, -xCORE-AVX2, -axCORE-AVX2, -xCOMMON-AVX512, -xMIC-AVX512, -axMIC-AVX512, -xCORE-AVX512

To Do This: Optimal Fortran Settings
  • Retrieve better compiler diagnostics: Disable Interprocedural Optimization (IPO) with -no-ipo

  • Address any issues with source line matching: Do one of the following:
    • Raise the debug level: -debug inline-debug-info

    • Temporarily disable inlining: -ip-no-inlining

  • Experiment with generating code for different instruction sets if your application does not appear to use the latest vector instructions: -xHost, -xSSE4.2, -xAVX, -axAVX, -xCORE-AVX2, -axCORE-AVX2, -xCOMMON-AVX512, -xMIC-AVX512, -axMIC-AVX512, -xCORE-AVX512

Threading Advisor

Key Features

Key Threading Advisor features include the following:
  • Survey Report - Shows the loops and functions where your application spends the most time. Use this information to discover candidates for parallelization with threads.

  • Trip Counts analysis - Shows the minimum, maximum, and median number of times a loop body will execute, as well as the number of times a loop is invoked. Use this information to make better decisions about your threading strategy for particular loops.

  • Annotations - Insert to mark places in your application that are good candidates for later replacement with parallel framework code that enables threading parallel execution. Annotations are subroutine calls or macros (depending on the programming language) that can be processed by your current compiler but do not change the computations of your application.

  • Suitability Report - Predicts the maximum speed-up of your application based on the inserted annotations and a variety of what-if modeling parameters with which you can experiment. Use this information to choose the best candidates for parallelization with threads.

  • Dependencies Report - Predicts parallel data sharing problems based on the inserted annotations. Use this information to fix the data sharing problems if the predicted maximum speed-up benefit justifies the effort.

Prerequisites

To build applications that produce the most accurate and complete Threading Advisor analysis results, build an optimized binary of your application in release mode using these settings:

To Do This: Optimal C/C++ Settings
  • Search an additional directory for Intel Advisor annotation definitions: -I${ADVISOR_XE_[product_year]_DIR}/include

  • Request full debug information (compiler and linker): -g

  • Request moderate optimization: -O2 or higher

  • Search for unresolved references in multithreaded, dynamically linked libraries: -Bdynamic

  • Enable dynamic loading: -ldl

To Do This: Optimal Fortran Settings
  • Search additional directories for Intel Advisor annotation definitions:
    • -I${ADVISOR_XE_[product_year]_DIR}/include/ia32 or -I${ADVISOR_XE_[product_year]_DIR}/include/intel64

    • -L${ADVISOR_XE_[product_year]_DIR}/lib32 or -L${ADVISOR_XE_[product_year]_DIR}/lib64

    • -ladvisor

  • Request full debug information (compiler and linker): -g

  • Request moderate optimization: -O2 or higher

  • Search for unresolved references in multithreaded, dynamically linked libraries: -shared-intel

  • Enable dynamic loading: -ldl

In addition:
  • Verify your application runs before trying to analyze it with the Intel Advisor.

  • Make sure you run the Intel Advisor in the same environment as your application.

Set Up Environment

Do one of the following to set up your environment.
  • Run one of the following source commands:
    • For csh/tcsh users: source <advisor-install-dir>/advixe-vars.csh

    • For bash users: source <advisor-install-dir>/advixe-vars.sh

    The default installation path, <advisor-install-dir>, is below:
    • /opt/intel/ for root users

    • $HOME/intel/ for non-root users

  • Add <advisor-install-dir>/bin32 or <advisor-install-dir>/bin64 to your path.

  • Run the <parallel-studio-install-dir>/psxevars.csh or <parallel-studio-install-dir>/psxevars.sh command. The default installation path, <parallel-studio-install-dir>, is below:
    • /opt/intel/ for root users

    • $HOME/intel/ for non-root users

Get Started

Follow these steps (white blocks are optional) to get started using the Threading Advisor in the Intel Advisor.
[Figure: Threading Advisor workflow]

Launch the Intel Advisor

Run the advixe-gui command.

Manage Project

  1. Choose File > New > Project… (or click New Project… in the Welcome page) to open the Create a Project dialog box.

  2. Supply a name and location for your project, then click the Create Project button to open the Project Properties dialog box.

  3. On the left side of the Analysis Target tab, ensure the Survey Hotspots Analysis type is selected.

  4. Set the appropriate parameters, and binary/symbol search and source search directories.

  5. Click the OK button to close the Project Properties dialog box.

Tip:
  • If you plan to run other threading Analysis Types, set parameters for them now, if possible.

  • If possible, use the Inherit settings from Survey Hotspots Analysis Type checkbox for other Analysis Types.

  • The Survey Trip Counts Analysis type has similar parameters to the Survey Hotspots Analysis type.

  • The Dependencies Analysis type consumes more resources than the Survey Hotspots Analysis type. If a Dependencies analysis takes too long, consider decreasing the workload.

  • When necessary, click the tab at the top of the Workflow pane to switch between the Vectorization Workflow and Threading Workflow.

Run Survey Analysis

Under Survey Target in the Threading Workflow, click the Run analysis control to collect Survey data while your application executes. Use the resulting information to discover candidates for parallelization with threads.

Run Trip Counts Analysis

This step is optional.

Before running a Trip Counts analysis, make sure you set the appropriate Project Properties for the Survey Trip Counts Analysis type.

Under Find Trip Counts in the Threading Workflow, click the Run analysis control to collect Trip Counts data while your application executes. Use the resulting information to make better decisions about your threading strategy for particular loops.

Investigate Loops

Pay particular attention to the hottest loops in terms of Self Time and Total Time. Optimizing these loops provides the most benefit. Outermost loops with significant Total Time are often good candidates for parallelization with threads. Innermost loops and loops near innermost loops are often good candidates for vectorization.

Annotate Sources

Insert annotations to mark places in parts of your application that are good candidates for later replacement with parallel framework code that enables parallel execution. After inserting annotations, rebuild your application in release mode.

The main types of Intel Advisor annotations mark the location of:
  • A parallel site. A parallel site is a region of code that contains one or more tasks that may execute in one or more parallel threads to distribute work. An effective parallel site typically contains a hotspot that consumes application execution time. To distribute these frequently executed instructions to different tasks that can run at the same time, the best parallel site is not usually located at the hotspot, but higher in the call tree.

  • One or more parallel tasks within a parallel site. A task is a portion of time-consuming code with data that can be executed in one or more parallel threads to distribute work.

  • Locking synchronization, where mutual exclusion of data access must occur in the parallel application.

Intel Advisor provides example annotated source code for you (accessible in the Assistance tab of the Survey Report and in the Survey Source windows) that you can copy directly into your editor:

Annotation Code Snippet: Purpose
  • Iteration Loop, Single Task: Create a simple loop structure, where the task code includes the entire loop body. This common task structure is useful when only a single task is needed within a parallel site.

  • Loop, One or More Tasks: Create loops where the task code does not include all of the loop body, or complex loops or code that requires specific task begin-end boundaries, including multiple task end annotations. This structure is also useful when multiple tasks are needed within a parallel site.

  • Function, One or More Tasks: Create code that calls multiple tasks within a parallel site.

  • Pause/Resume Collection: Temporarily pause data collection and later resume it, so you can skip uninteresting parts of application execution to minimize collected data and speed up analysis of large applications. Add these annotations outside a parallel site.

  • Build Settings: Set build (compiler and linker) settings specific to the language in use.

Tip:

Choosing where to add task annotations may require some experimentation. If your parallel site has nested loops and the computation time used by the innermost loop is small, consider adding task annotations around the next outermost loop.
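A parallel site with a single iteration task might look like the following C sketch. The `sum_of_squares` function is hypothetical, and `USE_ADVISOR_ANNOTATIONS` is a made-up build macro for this illustration: when it is not defined, the annotation macros expand to nothing so the code builds without the Intel Advisor header. Treat the macro names as an assumption and verify them against the snippets Intel Advisor provides before relying on them.

```c
#include <assert.h>
#include <stddef.h>

/* USE_ADVISOR_ANNOTATIONS is a hypothetical build macro for this sketch.
   Define it (and add the annotation include directory to the build) to
   use the real annotation header; otherwise the macros do nothing. */
#ifdef USE_ADVISOR_ANNOTATIONS
#include "advisor-annotate.h"
#else
#define ANNOTATE_SITE_BEGIN(name)
#define ANNOTATE_ITERATION_TASK(name)
#define ANNOTATE_SITE_END(...)
#endif

/* Hypothetical hot loop marked as a parallel site whose loop body is a
   single task. The annotations do not change the computation. */
double sum_of_squares(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    double total = 0.0;
    ANNOTATE_SITE_BEGIN(sum_site);
    for (size_t i = 0; i < n; i++) {
        ANNOTATE_ITERATION_TASK(sum_task);
        total += v[i] * v[i];
    }
    ANNOTATE_SITE_END();
    return total;
}
```

Every iteration updates the shared variable `total`, which is the kind of data sharing a later Dependencies analysis is meant to surface before you add real threading code.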

Run Suitability Analysis

Before running a Suitability analysis, make sure you set the appropriate Project Properties for the Suitability Analysis type.

Under Check Suitability in the Threading Workflow, click the Run analysis control to collect Suitability data while your application executes.

The Suitability Report predicts maximum speed-up data based on the inserted annotations and what-if modeling parameters with which you can experiment, such as:
  • Different hardware configurations and parallel frameworks

  • Different trip counts and instance durations

  • Any plans to address parallel overhead, lock contention, or task chunking when you implement your parallel framework code

Use the resulting information to choose the best candidates for parallelization with threads.
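For intuition about why a predicted maximum speed-up is bounded, Amdahl's law gives a rough upper limit from the fraction of run time that can be parallelized. The C sketch below is only that textbook formula with a hypothetical function name; it is not the model Intel Advisor uses, which also accounts for factors such as overhead and lock contention.

```c
#include <assert.h>

/* Amdahl's law: upper bound on speed-up when parallel_fraction of the
   serial run time (between 0 and 1) is spread perfectly over threads. */
double amdahl_speedup(double parallel_fraction, int threads)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(threads >= 1);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads);
}
```

Even with many threads, a site that covers half of the run time cannot make the application more than twice as fast, which is why hot sites higher in the call tree tend to be better candidates.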

Run Dependencies Analysis

Before running a Dependencies analysis, make sure you set the appropriate Project Properties for the Dependencies Analysis type. (Use the same application, but a smaller input data set if possible.)

Under Check Dependencies in the Threading Workflow, click the Run analysis control to collect Dependencies data while your application executes. Use the resulting information to fix the data sharing problems if the predicted maximum speed-up benefit justifies the effort.

Improve App Performance

This step is optional.

If you decide the predicted maximum speed-up benefit is worth the effort to add threading parallelism to your application:
  1. Complete developer/architect design and code reviews about the proposed parallel changes.

  2. Choose one parallel programming framework (threading model) for your application, such as Intel® Threading Building Blocks (Intel® TBB), OpenMP*, Intel® Cilk™ Plus, or some other parallel framework.

  3. Add the parallel framework to your build environment.

  4. Add parallel framework code to synchronize access to the shared data resources, such as Intel TBB or OpenMP* locks or Intel Cilk Plus reducers.

  5. Add parallel framework code to create parallel tasks.

As you add the appropriate parallel code from the chosen parallel framework during steps 4 and 5, you can keep, comment out, or replace the Intel Advisor annotations.
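Steps 4 and 5 can be sketched with OpenMP*, one of the frameworks named above. In this C illustration the `histogram` function is hypothetical: the `parallel for` directive creates the parallel tasks from loop iterations, and the `atomic` directive synchronizes updates to the shared array. If the compiler is not asked to enable OpenMP, it typically ignores the pragmas and the loop runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many values fall into each of bins equal-width buckets
   over [0, 1). Values outside the range are clamped to the end buckets. */
void histogram(const double *v, size_t n, long *counts, int bins)
{
    assert(bins > 0 && counts != NULL && (v != NULL || n == 0));
    long i;
    /* Step 5: loop iterations become the parallel tasks. */
    #pragma omp parallel for
    for (i = 0; i < (long)n; i++) {
        int b = (int)(v[i] * bins);
        if (b < 0) b = 0;
        if (b >= bins) b = bins - 1;
        /* Step 4: synchronize access to the shared counts array. */
        #pragma omp atomic
        counts[b] += 1;
    }
}
```

An atomic update is used here because each task touches a single shared counter; a lock or a per-thread partial histogram may be a better fit when the shared update is larger or more frequent.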

Intel Advisor

Command Line

You can use the Intel Advisor command line interface, advixe-cl, to run analyses and reports. This makes it possible to automate many tasks as well as analyze an application running on remote hosts. You can then view results using the Intel Advisor GUI or command line reports.

Below are command line examples of typical Intel Advisor tasks.

To Do This

Use This Command Line Model

View a full list of command line options.

(Applies to Vectorization Advisor & Threading Advisor.)

advixe-cl -help

Note:

You can also check the Intel Advisor Help document.

Run a Survey analysis.

(Applies to Vectorization Advisor & Threading Advisor.)

advixe-cl -collect survey -project-dir ./myAdvisorProj -- myTargetApplication

Run a Trip Counts analysis.

(Applies to Vectorization Advisor & Threading Advisor.)

advixe-cl -collect tripcounts -project-dir ./myAdvisorProj -- myTargetApplication

Print a Survey Report to identify loop IDs for Refinement analyses.

(Applies to Vectorization Advisor.)

advixe-cl -report survey -project-dir ./myAdvisorProj

Run a Refinement analysis.

(Applies to Vectorization Advisor.)

advixe-cl -collect [dependencies | map] -mark-up-list=[loopID],[loopID] -project-dir ./myAdvisorProj -- myTargetApplication

Run a Dependencies analysis.

(Applies to Threading Advisor.)

advixe-cl -collect dependencies -project-dir ./myAdvisorProj -- myTargetApplication

Report a top-down functions list instead of a loop list.

(Applies to Vectorization Advisor & Threading Advisor.)

advixe-cl -report survey -top-down -display-callstack

Report all compiler opt-report and vec-report metrics.

(Applies to Vectorization Advisor.)

advixe-cl -report survey -show-all-columns

Report the top five self-time hotspots that were not vectorized because of a "not inner loop" message.

(Applies to Vectorization Advisor.)

advixe-cl -report survey -limit 5 -filter "Vectorization Message(s)"="loop was not vectorized: not inner loop"

Tip:

Click the appropriate Get command line control in the Workflow to generate the corresponding collection command line.

MPI

You can perform an MPI analysis only through the Intel Advisor command line interface; however, there are several ways to view an Intel Advisor result:
  • If you have an Intel Advisor GUI in your cluster environment, open a result in the GUI.

  • If you do not have an Intel Advisor GUI on your cluster node, copy the result directory to another machine with the Intel Advisor GUI and open the result there.

  • Use the Intel Advisor command line reports to browse results on a cluster node.

Use mpirun, mpiexec, or your preferred MPI batch job manager with the advixe-cl command to start an analysis. You may also use the -gtool option of mpirun. See the Intel® MPI Library Reference Manual (available in the Intel® Software Documentation Library) for more information.

Below are command line examples of typical Intel Advisor MPI tasks.

To Do This

Use This Command Line Model

Run 10 MPI ranks (processes), and start an Intel Advisor analysis on each rank.

$ mpirun -n 10 advixe-cl -collect survey --project-dir ./my_proj ./your_app

Intel Advisor creates a number of result directories in the current directory, named as rank.0, rank.1, ... rank.n, where n is the MPI process rank.

Intel Advisor does not combine results from different ranks, so you must explore each rank result independently.

Run 10 MPI ranks, and start an Intel Advisor analysis only on rank #1.

$ mpirun -n 1 advixe-cl -collect survey --project-dir ./my_proj ./your_app : -np 9 ./your_app

Training and Documentation

Note:
The default installation path, <advisor-install-dir>, is below:
  • /opt/intel/ for root users

  • $HOME/intel/ for non-root users

Document/Resource

Description

Online Training

Online training is an excellent resource for novice, intermediate, and advanced users. It includes links to videos, guides, featured topics, event recaps and archived webinars, upcoming events and webinars, and more.

Intel Advisor Release Notes

Contain up-to-date information about the Intel Advisor, including a description, technical support, and known limitations. This document also contains system requirements, installation instructions, and instructions for setting up the command-line environment.

This document is installed at <advisor-install-dir>/documentation/<locale>/<release_notes>.pdf.

Check Intel Advisor Release Notes online for updates.

Intel Advisor Samples, ReadMe's, and Tutorials

Sample applications can help you learn to use the Intel Advisor. Sample applications are installed as individual compressed files under <advisor-install-dir>/samples/en/. After you copy a sample application compressed file to a writable directory, use a suitable tool to extract the contents. Extracted contents include a short README that describes how to build the sample and fix issues.

Vectorization Advisor tutorials show you how to use C++ sample applications to:

  • Identify loops that will benefit most from vectorization.

  • Identify what is blocking effective vectorization.

  • Increase the confidence that vectorization is safe.

  • Explore the benefit of alternative data reorganizations.

A list of available tutorials is installed at <advisor-install-dir>/documentation/<locale>/tutorials/index.htm.

Check Samples, ReadMe's, and Tutorials online for updates.

Intel Advisor Help

The Help is the primary documentation for the Intel Advisor. It is also accessible from the product Help menu.

This document is installed at <advisor-install-dir>/documentation/<locale>/help/index.htm.

Check Intel Advisor Help online for updates.

Note:

If successive calls to Intel Advisor Help result in multiple browser windows appearing instead of new tabs in the existing browser window, try changing your default browser.

More Local Resources

One of the key Vectorization Advisor features is a Survey Report that offers integrated compiler reports and performance data all in one place, including GUI-embedded advice on how to fix vectorization issues specific to your code.

To help you quickly locate information that augments that GUI-embedded advice, the Intel Advisor provides Intel compiler mini-guides.

You can also find complete C++ Recommendations, Fortran Recommendations, and C++/Fortran Compiler Diagnostic Details advice libraries in the same location as the mini-guides. Each issue and recommendation in these HTML files is collapsible/expandable.

These documents are installed below <advisor-install-dir>/documentation/<locale>/advice/.

Web Resources

Intel Advisor

Vectorization Advisor Glossary

Vectorization Resources for Intel® Advisor Users

Intel® Learning Lab (white papers, articles and more)

Intel® Software Documentation Library