Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help
Intel® VTune™ Amplifier collects call stack information during User-Mode Sampling and Tracing Collection, or during Hardware Event-based Sampling Collection with stack collection enabled. Use the callstacks report to see how the hot functions are called. This report type focuses on call sequences, beginning with the functions that take the most CPU time.
You can use the -column option to filter the callstacks report and focus on a specific metric, for example:
$ amplxe-cl -report callstacks -r r001ah -column="CPI Rate"
To display a list of columns available for the callstacks report, enter:
$ amplxe-cl -report callstacks -r <result_dir> -column=?
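The two steps above (list the available columns, then filter the report on one of them) can be wrapped in small shell helpers. This is a minimal sketch, not part of the product: the function names list_callstack_columns and callstacks_by_column and the AMPLXE_CL override variable are hypothetical, and only options shown in this topic are used. The -column=? argument is quoted so that the shell does not treat the question mark as a file name pattern.

```shell
# Hypothetical helpers; only options shown in this topic are used.
# AMPLXE_CL is an assumed override variable so the tool can be swapped for a dry run.
AMPLXE_CL=${AMPLXE_CL:-amplxe-cl}

# List the columns available for the callstacks report of a result directory.
# $1: result directory, for example r001ah
list_callstack_columns() {
    "$AMPLXE_CL" -report callstacks -r "$1" '-column=?'
}

# Generate a callstacks report filtered on a single metric column.
# $1: result directory, $2: column name, for example "CPI Rate"
callstacks_by_column() {
    "$AMPLXE_CL" -report callstacks -r "$1" -column="$2"
}

# Usage (requires an existing analysis result):
#   list_callstack_columns r001ah
#   callstacks_by_column r001ah "CPI Rate"
```

Passing the result directory and column name as arguments keeps names that contain spaces, such as "CPI Rate", intact as a single option value.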
Example 1: Callstacks Report with Limited Items
The following example generates a callstacks report for the most recent analysis result and limits the number of functions and function stacks to 5 items.
$ amplxe-cl -report callstacks -limit 5
Function              Function Stack     CPU Time  Module                 Function (Full)           Source File        Start Address
--------------------  -----------------  --------  ---------------------  ------------------------  -----------------  -------------
initialize_2D_buffer                     22.746s   tachyon_find_hotspots  initialize_2D_buffer      find_hotspots.cpp  0x4018f0
                      render_one_pixel   22.746s   tachyon_find_hotspots  render_one_pixel          find_hotspots.cpp  0x401950
                      draw_trace         0s        tachyon_find_hotspots  draw_trace(void)          find_hotspots.cpp  0x401d70
                      thread_trace       0s        tachyon_find_hotspots  thread_trace(thr_parms*)  find_hotspots.cpp  0x401ef0
                      trace_shm          0s        tachyon_find_hotspots  trace_shm                 trace_rest.cpp     0x410a20
                      trace_region       0s        tachyon_find_hotspots  trace_region              trace_rest.cpp     0x410aa0
                      rt_renderscene     0s        tachyon_find_hotspots  rt_renderscene(void*)     api.cpp            0x402360
                      tachyon_video      0s        tachyon_find_hotspots  tachyon_video             video.cpp          0x402240
                      main               0s        tachyon_find_hotspots  main                      video.cpp          0x4013e0
                      __libc_start_main  0s        libc.so.6              __libc_start_main         libc-start.c       0x21dd0
                      _start             0s        tachyon_find_hotspots  _start                    [Unknown]          0x40149c
grid_intersect                           7.282s    tachyon_find_hotspots  grid_intersect            grid.cpp           0x408930
                      intersect_objects  2.756s    tachyon_find_hotspots  intersect_objects(ray*)   intersect.cpp      0x40a400
                      shader             0s        tachyon_find_hotspots  shader(ray*)              shade.cpp          0x40eae0
...
Example 2: Callstacks Report with Callstack Grouping
This example generates a callstacks report for the r001lw result that is grouped by function call stacks.
$ amplxe-cl -report callstacks -r r001lw -group-by callstack
Function/Function Stack            Wait Time  Module                 Function (Full)
---------------------------------  ---------  ---------------------  -----------------------------------------------------------
draw_task::operator()              98.698s    tachyon_analyze_locks  draw_task::operator()(tbb::blocked_range<int> const&) const
tbb::interface6::internal          0s         tachyon_analyze_locks  tbb::interface6::internal
execute<tbb::interface6::internal  0s         tachyon_analyze_locks  execute::interface6::internal
[TBB parallel_for on draw_task]    0s         tachyon_analyze_locks  tbb::interface6::internal::execute(void)
[TBB Dispatch Loop]                0s         libtbb.so.2            tbb::internal::local_wait_for_all(tbb::task&, tbb::task*)
...
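To compare call stacks across several results, the grouped report from Example 2 can be generated for each result directory and redirected to a text file. The following is a sketch under stated assumptions: the function name save_grouped_callstacks, the AMPLXE_CL override variable, the output file naming, and the second result name r002lw are hypothetical. The redirection is ordinary shell behavior, not an amplxe-cl option.

```shell
# Hypothetical helper; only the -report, -r, and -group-by options from this topic are used.
# AMPLXE_CL is an assumed override variable so the tool can be swapped for a dry run.
AMPLXE_CL=${AMPLXE_CL:-amplxe-cl}

# Write a callstack-grouped report for each result directory given as an
# argument to <result>_callstacks.txt. Stops at the first failing report.
save_grouped_callstacks() {
    for result in "$@"; do
        "$AMPLXE_CL" -report callstacks -r "$result" -group-by callstack \
            > "${result}_callstacks.txt" || return 1
    done
}

# Usage (r002lw is a hypothetical second result):
#   save_grouped_callstacks r001lw r002lw
```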