Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help

Callstacks Report

Intel® VTune™ Amplifier collects call stack information during User-Mode Sampling and Tracing Collection, or during Hardware Event-based Sampling Collection when stack collection is enabled. Use the callstacks report to see how the hot functions are called. This report type focuses on call sequences, beginning with the functions that take the most CPU time.

You can use the -column option to filter the callstacks report and focus on a specific metric, for example:

$ amplxe-cl -report callstacks -r r001ah -column="CPI Rate"

Note

To display a list of columns available for the callstacks report, enter: amplxe-cl -report callstacks -r <result_dir> -column=?

Examples

Example 1: Callstacks Report with Limited Items

The following example generates a callstacks report for the most recent analysis result and limits the number of functions and function stacks to 5 items.

$ amplxe-cl -report callstacks -limit 5

Function              Function Stack     CPU Time  Module                 Function (Full)           Source File        Start Address
--------------------  -----------------  --------  ---------------------  ------------------------  -----------------  -------------
initialize_2D_buffer                      22.746s  tachyon_find_hotspots  initialize_2D_buffer      find_hotspots.cpp       0x4018f0
                      render_one_pixel    22.746s  tachyon_find_hotspots  render_one_pixel          find_hotspots.cpp       0x401950
                      draw_trace               0s  tachyon_find_hotspots  draw_trace(void)          find_hotspots.cpp       0x401d70
                      thread_trace             0s  tachyon_find_hotspots  thread_trace(thr_parms*)  find_hotspots.cpp       0x401ef0
                      trace_shm                0s  tachyon_find_hotspots  trace_shm                    trace_rest.cpp       0x410a20
                      trace_region             0s  tachyon_find_hotspots  trace_region                 trace_rest.cpp       0x410aa0
                      rt_renderscene           0s  tachyon_find_hotspots  rt_renderscene(void*)               api.cpp       0x402360
                      tachyon_video            0s  tachyon_find_hotspots  tachyon_video                     video.cpp       0x402240
                      main                     0s  tachyon_find_hotspots  main                              video.cpp       0x4013e0
                      __libc_start_main        0s  libc.so.6              __libc_start_main              libc-start.c        0x21dd0
                      _start                   0s  tachyon_find_hotspots  _start                            [Unknown]       0x40149c
                                                                                                                                                                
grid_intersect                             7.282s  tachyon_find_hotspots  grid_intersect                     grid.cpp       0x408930
                      intersect_objects    2.756s  tachyon_find_hotspots  intersect_objects(ray*)       intersect.cpp       0x40a400
                      shader                   0s  tachyon_find_hotspots  shader(ray*)                      shade.cpp       0x40eae0
...
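
If you post-process the text report in a script, the layout shown above can be read back into a simple structure. The following is a minimal sketch, not part of the product: the parse_callstacks helper is a hypothetical name, and it assumes that columns are separated by at least two spaces, that a row starting in the first column names a hot function, and that indented rows are the frames of that function's stack. The sample text is a shortened copy of the output from Example 1.

```python
import re

# Shortened copy of the Example 1 output, used here as sample input.
SAMPLE = """\
Function              Function Stack     CPU Time  Module
--------------------  -----------------  --------  ---------------------
initialize_2D_buffer                      22.746s  tachyon_find_hotspots
                      render_one_pixel    22.746s  tachyon_find_hotspots
                      draw_trace               0s  tachyon_find_hotspots

grid_intersect                             7.282s  tachyon_find_hotspots
                      intersect_objects    2.756s  tachyon_find_hotspots
...
"""


def parse_callstacks(report_text):
    """Group report rows into hot functions and their call stack frames.

    Assumes (hypothetically) that columns are separated by two or more
    spaces and that stack frame rows are indented.
    """
    lines = report_text.splitlines()
    # Data rows start after the dashed rule that follows the header row.
    start = 0
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped and set(stripped) <= {"-", " "}:
            start = index + 1
            break

    entries = []
    for line in lines[start:]:
        stripped = line.strip()
        if not stripped or stripped == "...":
            continue  # blank separator line or truncation marker
        fields = re.split(r"\s{2,}", stripped)
        if len(fields) < 2:
            continue  # not a data row in the assumed layout
        name, cpu_time = fields[0], fields[1]
        if line[0].isspace():
            # Indented row: a frame in the stack of the current hot function.
            if entries:
                entries[-1]["stack"].append((name, cpu_time))
        else:
            entries.append({"function": name, "cpu_time": cpu_time, "stack": []})
    return entries


if __name__ == "__main__":
    for entry in parse_callstacks(SAMPLE):
        callers = " <- ".join(name for name, _ in entry["stack"])
        print(entry["function"], entry["cpu_time"], callers)
```

Splitting on runs of two or more spaces, rather than on fixed column offsets, keeps the sketch tolerant of reports whose column widths differ between results.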

Example 2: Callstacks Report with Callstack Grouping

This example generates a callstacks report for the r001lw result, grouped by function call stack.

$ amplxe-cl -report callstacks -r r001lw -group-by callstack

Function/Function Stack          Wait Time  Module                 Function (Full)
-------------------------------  ---------  ---------------------  -----------------------------------------------------------
draw_task::operator()              98.698s  tachyon_analyze_locks  draw_task::operator()(tbb::blocked_range<int> const&) const
tbb::interface6::internal               0s  tachyon_analyze_locks  tbb::interface6::internal
execute<tbb::interface6::internal       0s  tachyon_analyze_locks  execute::interface6::internal
[TBB parallel_for on draw_task]         0s  tachyon_analyze_locks  tbb::interface6::internal::execute(void)
[TBB Dispatch Loop]                     0s  libtbb.so.2            tbb::internal::local_wait_for_all(tbb::task&, tbb::task*)
...
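
When the report is generated from a script, the command line can be assembled from the options used on this page (-r, -column, -limit, and -group-by). The sketch below is illustrative only: build_callstacks_command is a hypothetical helper, it only builds the argument list and does not run the tool, and it assumes that these options may be combined in a single invocation, which this page does not confirm.

```python
import shlex


def build_callstacks_command(result_dir=None, column=None, limit=None, group_by=None):
    """Build an amplxe-cl argument list for the callstacks report.

    Only options that appear on this page are used. Combining them in one
    invocation is an assumption of this sketch.
    """
    argv = ["amplxe-cl", "-report", "callstacks"]
    if result_dir is not None:
        argv += ["-r", result_dir]  # omit to use the most recent result
    if column is not None:
        argv.append(f"-column={column}")  # for example "CPI Rate"
    if limit is not None:
        argv += ["-limit", str(limit)]
    if group_by is not None:
        argv += ["-group-by", group_by]  # for example "callstack"
    return argv


if __name__ == "__main__":
    # Print a shell-quoted form of the command from Example 2.
    print(shlex.join(build_callstacks_command("r001lw", group_by="callstack")))
```

Passing the resulting list to subprocess.run, instead of a single command string, avoids shell quoting problems with column names that contain spaces.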
