Intel® VTune™ Amplifier XE and Intel® VTune™ Amplifier for Systems Help
Disk Input and Output analysis is a platform-wide analysis that monitors utilization of the disk subsystem, CPU and processor buses.
This analysis type combines hardware event-based sampling collection with system-wide Ftrace* collection. Together they provide a consistent view of the storage subsystem alongside hardware events, and an easy-to-use method to match user-level source code with the I/O packets executed by the hardware.
The analysis relies on the data produced by the kernel block driver subsystem. If your platform uses a non-standard block driver subsystem (for example, user-space storage drivers), disk metrics are not available in this analysis type.
Use the Disk Input and Output analysis to identify:
Imbalance between I/O and compute operations (HPC applications)
Long latency of I/O requests (transactional workloads)
Hardware utilization (streaming)
VTune Amplifier uses the following system-wide metrics for the Disk I/O analysis:
I/O Wait system-wide metric (Linux* targets only) shows the time during which system cores are idle while one or more threads are switched out, waiting for an I/O access to complete.
I/O Queue Depth metric shows the number of I/O requests submitted to the storage device. Zero requests in the queue means that no requests are scheduled and the disk is not in use.
I/O Data Transfer metric shows the number of bytes read from or written to the storage.
Page Faults metric shows the number of page faults that occurred on the system. This metric is useful when analyzing access to memory-mapped files.
CPU Activity metric shows the portion of time the system spent in the following states:
Idle state - the CPU core is idle.
Active state - the CPU core is executing a thread.
I/O Wait (Linux targets only) - the CPU core is idle but there is a thread, blocked by an access to the disk, that could be potentially executed on this core.
PCIe Bandwidth metric shows the amount of data transferred over the PCIe bus per second. This metric is collected only on server platforms based on Intel microarchitecture code name Sandy Bridge EP and later.
Prerequisites:
Run the Intel® VTune™ Amplifier with root privileges. For Disk Input/Output analysis on Linux* targets, the VTune Amplifier automatically sets perf_event_paranoid to 0.
Create a VTune Amplifier project and specify your analysis target (application, process, or system). Note that irrespective of the target type you select, the VTune Amplifier automatically enables the Analyze system-wide target option to collect system-wide metrics for the Disk Input and Output analysis.
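If a collection fails to start on a Linux target, it can help to confirm the two conditions above by hand. The following is a minimal sketch, assuming the standard Linux location of the perf_event_paranoid setting. The helper names are hypothetical, and VTune Amplifier does not require you to run this check.

```python
import os

# Standard Linux location of the setting mentioned in the prerequisites.
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def read_perf_event_paranoid(path=PARANOID_PATH):
    """Return the current perf_event_paranoid level, or None if the
    file is missing or unreadable (for example, on a non-Linux host)."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def check_prerequisites(path=PARANOID_PATH):
    """Report whether the process runs as root and what the current
    perf_event_paranoid level is."""
    is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    return {"root": is_root, "perf_event_paranoid": read_perf_event_paranoid(path)}
```

Calling check_prerequisites() before starting the collection shows whether the process has root privileges. Reading the setting again during the collection should report 0, the value that VTune Amplifier sets.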
To run the Disk Input and Output analysis:
Click the New Analysis button on the Intel® VTune™ Amplifier toolbar.
The New Amplifier Result tab opens with the Analysis Type window active.
From the analysis tree on the left pane, select Platform Analysis > Disk Input and Output.
The analysis configuration pane opens on the right.
This predefined analysis type does not provide additional options (knobs) to configure.
Click the Start button on the right to run the analysis.
To run the Disk Input/Output analysis from the command line, enter:
$ amplxe-cl -collect disk-io -- <target> [target_options]
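When the collection is launched from a script, the same command line can be assembled programmatically. The sketch below is one possible way to do it. The helper name build_disk_io_command is hypothetical, and the optional -result-dir argument is an addition to the command shown above that assumes you want to choose the result directory yourself.

```python
def build_disk_io_command(target, target_options=(), result_dir=None):
    """Build the argument list for a Disk Input and Output collection.

    target         -- path to the application to profile
    target_options -- options passed to the target, not to amplxe-cl
    result_dir     -- optional result directory (an assumption; omit it
                      to let the tool pick the default result name)
    """
    command = ["amplxe-cl", "-collect", "disk-io"]
    if result_dir is not None:
        command += ["-result-dir", result_dir]
    # Everything after "--" belongs to the target application.
    command += ["--", target, *target_options]
    return command
```

The resulting list can be passed to subprocess.run(command, check=True). Passing a list rather than a single string avoids shell quoting problems when target options contain spaces.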
VTune Amplifier collects the data, generates a rxxxdiskio result, and opens it in the default Disk Input and Output viewpoint. This viewpoint displays statistics on I/O waits (Linux targets only), I/O operations, and I/O data transfers, distributed over time and correlated with the application execution data.
Start with the Disk Input and Output Histogram sections of the Summary window. Identify slow I/O operations and switch to the grid view for further analysis.
If you identify an imbalance between I/O and compute operations, consider modifying your code to make I/O operations asynchronous.
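As one illustration of asynchronous I/O, the sketch below reads the next input file on a worker thread while the current one is being processed. It is a minimal example under stated assumptions: the checksum function is a stand-in for real compute work, all names are hypothetical, and other approaches (native asynchronous I/O APIs, for example) may fit your application better.

```python
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    """Blocking read of a whole file."""
    with open(path, "rb") as handle:
        return handle.read()


def checksum(data):
    """Stand-in for the compute phase of the application."""
    return sum(data) % 65521


def process_sequential(paths):
    """Baseline: every read blocks the compute that follows it."""
    return [checksum(read_file(path)) for path in paths]


def process_overlapped(paths):
    """Read file i+1 on a worker thread while file i is processed."""
    results = []
    if not paths:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(read_file, paths[0])
        for next_path in paths[1:]:
            data = pending.result()                       # wait for the current file
            pending = pool.submit(read_file, next_path)   # start the next read
            results.append(checksum(data))                # compute while it loads
        results.append(checksum(pending.result()))
    return results
```

Both functions return the same results. The overlapped version hides read latency behind computation, which should appear in the analysis as less time in the I/O Wait state.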
For I/O requests with long latency, check whether your data can be pre-loaded or written incrementally, or consider upgrading your storage device (for example, to an SSD).
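Writing incrementally can be sketched as follows: instead of accumulating all output and issuing one large write at the end, records are handed to the storage in bounded batches as they are produced. The class name IncrementalWriter and the flush_bytes threshold are hypothetical, and a suitable batch size depends on your device and workload.

```python
class IncrementalWriter:
    """Write records in bounded batches instead of one large final write."""

    def __init__(self, path, flush_bytes=1 << 20):
        self._file = open(path, "wb")
        self._flush_bytes = flush_bytes   # batch size threshold (assumption)
        self._buffer = bytearray()
        self.flushes = 0                  # number of batches written so far

    def write(self, record):
        """Queue a record and write the batch once it is large enough."""
        self._buffer += record
        if len(self._buffer) >= self._flush_bytes:
            self._flush()

    def _flush(self):
        if self._buffer:
            self._file.write(self._buffer)
            self._buffer.clear()
            self.flushes += 1

    def close(self):
        """Write any remaining data and close the file."""
        self._flush()
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
```

Used as a context manager (with IncrementalWriter(path) as writer: ...), this spreads the write traffic over the run, so no single request has to move the entire data set.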