This document is based on "Performance Tuning: A Continuing Series -- The iostat Tool", by Barry Saad, from the January/February 1994 issue of AIXTRA: IBM'S MAGAZINE FOR AIX PROFESSIONALS.
This article discusses iostat and how it can identify I/O-subsystem and CPU bottlenecks. iostat works by sampling the kernel's address space and extracting data from various counters that are updated every clock tick (1 clock tick = 10 milliseconds). The results -- covering TTY, CPU, and I/O subsystem activity -- are reported as per-second rates or as absolute values for the specified interval. This document applies to AIX Versions 4.x.
Normally, iostat is issued with both an interval and a count specified, with the report sent to standard output or redirected to a file. The command syntax appears below:
iostat [-t] [-d] [Drives] [Interval [Count]]
The -d flag causes iostat to provide only disk statistics for all drives. The -t flag causes iostat to provide only system-wide TTY and CPU statistics.
NOTE: The -t and -d options are mutually exclusive.
If you specify one or more drives, the output is limited to those drives; separate multiple drive names with spaces.
You can specify a time in seconds for the interval between records to be included in the reports. The initial record contains statistics for the time since system boot. Succeeding records contain data for the preceding interval. If no interval is specified, a single record is generated.
If you specify an interval, the count of the number of records to be included in the report can also be specified. If you specify an interval without a count, iostat will continue running until it is killed.
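For example (the drive names below are placeholders for whatever disks exist on your system):

iostat 2 2                      # two records at a 2-second interval (as in Figure 1 below)
iostat -d hdisk0 hdisk1 5 12    # disk statistics only, for two drives, 12 records
iostat -t 10                    # TTY and CPU statistics every 10 seconds until killed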
--------------------------------------------------------------------
tty:      tin      tout   avg-cpu:  % user  % sys  % idle  % iowait
          2.2      3.3              0.4     1.3    97.7    0.6

Disks:    % tm_act   Kbps   tps   msps   Kb_read   Kb_wrtn
hdisk0    0.4        1.1    0.3          117675    1087266
hdisk1    0.3        1.0    0.2          59230     1017734
hdisk2    0.0        0.2    0.0          180189    46832
cd0       0.0        0.0    0.0          0         0

tty:      tin      tout   avg-cpu:  % user  % sys  % idle  % iowait
          2.2      3.3              0.4     1.3    97.7    0.6

Disks:    % tm_act   Kbps   tps   msps   Kb_read   Kb_wrtn
hdisk0    0.4        1.1    0.3          117675    1087266
hdisk1    0.3        1.0    0.2          59230     1017734
hdisk2    0.0        0.2    0.0          180189    46832
cd0       0.0        0.0    0.0          0         0
--------------------------------------------------------------------
Figure 1. Sample Output from iostat 2 2

The following sections explain the output.
The two columns of TTY information (tin and tout) in the iostat output show the number of characters read and written by all TTY devices, including both real and pseudo TTY devices. Real TTY devices are those connected to an asynchronous port. Pseudo TTY devices include shells, telnet sessions, and aixterm windows.
Generally there are fewer input characters than output characters. For example, assume you run the following:
iostat -t 1 30
cd /usr/sbin
ls -l
You will see few input characters and many output characters. On the other hand, applications such as vi result in a smaller difference between the number of input and output characters. Analysts using modems for asynchronous file transfer may notice the number of input characters exceeding the number of output characters. Naturally, this depends on whether the files are being sent or received relative to the measured system.
Since the processing of input and output characters consumes CPU resource, look for a correlation between increased TTY activity and CPU utilization. If such a relationship exists, evaluate ways to improve the performance of the TTY subsystem. Steps that could be taken include changing the application program, modifying TTY port parameters during file transfer, or perhaps upgrading to a faster or more efficient asynchronous communications adapter.
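One rough way to look for that correlation is to post-process the iostat -t output. The following is a minimal sketch, assuming each data line carries exactly the six numeric fields shown in Figure 1 (tin, tout, % user, % sys, % idle, % iowait); the thresholds are arbitrary illustrations, not recommended values:

# Flag intervals where TTY output and CPU usage are both high.
# Assumes six numeric fields per data line; thresholds are examples only.
iostat -t 5 60 | awk 'NF == 6 && $1 ~ /^[0-9.]+$/ {
    if ($2 > 500 && ($3 + $4) > 50)
        print "busy TTY and CPU:", $0
}'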
The CPU statistics columns (% user, % sys, % idle, and % iowait) provide a breakdown of CPU usage. This information is also reported in the output of the vmstat command in the columns labeled us, sy, id, and wa.
The % user column shows the percentage of CPU resource spent in user mode. A UNIX process can execute in user or system mode. When in user mode, a process executes within its own code and does not require kernel resources.
The % sys column shows the percentage of CPU resource spent in system mode. This includes CPU resource consumed by kernel processes (kprocs) and others that need access to kernel resources. For example, the reading or writing of a file requires kernel resources to open the file, seek a specific location, and read or write data. A UNIX process accesses kernel resources by issuing system calls.
Typically, the CPU is pacing (the system is CPU bound) if the sum of user and system time exceeds 90 percent of CPU resource on a single-user system or 80 percent on a multi-user system. This condition means that the CPU is the limiting factor in system performance.
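As a quick check against Figure 1: user plus system time there is 0.4 + 1.3 = 1.7 percent, far below either threshold, so that system was nowhere near CPU bound.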
The ratio of user to system mode is determined by workload and is more important when tuning an application than when evaluating performance.
A key factor when evaluating CPU performance is the size of the run queue (provided by the vmstat command). In general, as the run queue increases, users will notice degradation (an increase) in response time.
The % idle column shows the percentage of CPU time spent idle, or waiting, without pending local disk I/O. If there are no processes on the run queue, the system dispatches a special kernel process called wait. On most AIX systems, the wait process ID (PID) is 516.
The % iowait column shows the percentage of time the CPU was idle with pending local disk I/O.
The iowait state is different from the idle state in that at least one process is waiting for local disk I/O requests to complete. Unless the process is using asynchronous I/O, an I/O request to disk causes the calling process to block (or sleep) until the request is completed. Once a process's I/O request completes, it is placed on the run queue.
In general, a high iowait percentage indicates the system has a memory shortage or an inefficient I/O subsystem configuration. Understanding the I/O bottleneck and improving the efficiency of the I/O subsystem require more data than iostat can provide. Typical solutions, however, include adding real memory (when paging is driving the I/O), spreading busy logical volumes and file systems across more physical drives, and adding disk drives or adapters.
On systems running one primary application, a high iowait percentage may simply reflect the workload; in that case, there may be no way to overcome the problem. On systems with many processes, some will be running while others wait for I/O. In this case, the iowait can be small or zero because running processes "hide" the wait time. Although iowait is low, a bottleneck may still limit application performance. To understand the I/O subsystem thoroughly, you need to examine the statistics in the next section.
The disk statistics portion of the iostat output provides a breakdown of I/O usage. This information is useful in determining whether a physical disk is limiting performance.
The system maintains a history of disk activity by default. Note that history is disabled if you see the message:
Disk history since boot not available.
This message displays only in the first output record from iostat.
Disk I/O history should be enabled since the CPU resource used in maintaining it is insignificant. History keeping can be disabled or enabled in SMIT under the following path:
smit
  -> System Environments
  -> Change/Show Characteristics of Operating System
  -> Continuously maintain DISK I/O history (true | false)
Choose true to enable history keeping or false to disable it.
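The same attribute can also be queried and set from the command line. Assuming the standard iostat attribute on the sys0 device (present on most AIX 4.x systems), something like the following should work:

# Show the current disk I/O history setting
lsattr -E -l sys0 -a iostat

# Enable continuous disk I/O history keeping
chdev -l sys0 -a iostat=true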
The Disks: column shows the names of the physical volumes. They are either hdisk or cd followed by a number. (hdisk0 and cd0 refer to the first physical disk drive and the first CD-ROM drive, respectively.)
The % tm_act column shows the percentage of time the volume was active. This is the primary indicator of a bottleneck.
A drive is active during data transfer and command processing, such as seeking to a new location. The disk-use percentage is directly proportional to resource contention and inversely proportional to performance. As disk use increases, performance decreases and response time increases. In general, when a disk's use exceeds 70 percent, processes are waiting longer than necessary for I/O to complete because most UNIX processes block (or sleep) while waiting for their I/O requests to complete.
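One way to watch for that condition is to filter the disk report for drives whose %tm_act exceeds the 70 percent guideline. A minimal sketch, assuming the column layout of Figure 1 (device name in field 1, % tm_act in field 2):

# Print any interval in which a disk exceeds 70% busy.
iostat -d 5 60 | awk '$1 ~ /^hdisk/ && $2 + 0 > 70 {
    print "disk over 70% busy:", $0
}'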
Kbps shows the amount of data read from and written to the drive in KBs per second. This is the sum of Kb_read plus Kb_wrtn, divided by the number of seconds in the reporting interval.
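For example, with made-up numbers: if a 10-second interval records Kb_read = 1500 and Kb_wrtn = 500, then Kbps = (1500 + 500) / 10 = 200 KB per second.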
tps reports the number of transfers per second. A transfer is an I/O request at the device driver level.
Kb_read reports the total data (in KBs) read from the physical volume during the measured interval.
Kb_wrtn shows the amount of data (in KBs) written to the physical volume during the measured interval.
Taken alone, there is no unacceptable value for any of the preceding fields because statistics are too closely related to application characteristics, system configuration, and types of physical disk drives and adapters. Therefore, when evaluating data, you must look for patterns and relationships. The most common relationship is between disk utilization and data transfer rate.
To draw any valid conclusions from this data, you must understand the application's disk data access patterns -- sequential, random, or a combination -- and the type of physical disk drives and adapters on the system.
For example, if an application reads and writes sequentially, you should expect a high disk-transfer rate when you have a high disk-busy rate. (NOTE: Kb_read and Kb_wrtn can confirm an understanding of an application's read and write behavior, but they provide no information on the data access patterns).
Generally you do not need to be concerned about a high disk-busy rate as long as the disk-transfer rate is also high. However, if you get a high disk-busy rate and a low data-transfer rate, you may have a fragmented logical volume, file system, or individual file.
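Confirming suspected fragmentation takes tools beyond iostat; on AIX, the lslv and fileplace commands are the usual starting points. For example (the logical volume and file names are placeholders):

# Show how a logical volume's partitions are distributed across disks
lslv -l lv00

# Show how an individual file's blocks are placed in its file system
fileplace -v /home/op/bigfile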
What is a high data-transfer rate? That depends on the disk drive and the effective data-transfer rate for that drive. You should expect numbers between the effective sequential and effective random disk-transfer rates. Below are charts of effective transfer rates (in KB per second) for several common SCSI-1 and SCSI-2 disk drives.

TYPE OF ACCESS   | 400 MB DRIVE | 670 MB DRIVE | 857 MB DRIVE
-----------------+--------------+--------------+-------------
Read-Sequential  |     1589     |     1525     |     2142
Read-Random      |      241     |      172     |      262
Write-Sequential |     1185     |     1108     |     1588
Write-Random     |      327     |      275     |      367

TYPE OF ACCESS   | 1.2 GB DRIVE | 1.37 GB DRIVE | 1.2 GB S-2 DRIVE | 1.37 GB S-2 DRIVE
-----------------+--------------+---------------+------------------+------------------
Read-Sequential  |     2169     |      2667     |       2180       |       3123
Read-Random      |      292     |       299     |        385       |        288
Write-Sequential |     1464     |      2189     |       2156       |       2357
Write-Random     |      362     |       491     |        405       |        549
The transfer rates were determined during performance testing and give more accurate expectations of disk performance than the media-transfer rate, which reflects the hardware capability and does not account for operating system and application overhead.
Another use of the data is to answer the question: "Do I need another SCSI adapter?" If you've ever been asked this question, you probably provided a generic answer or just plain guessed.
You can use data captured by iostat to answer the question accurately by tracking transfer rates, finding the maximum data transfer rate for each disk. Assume that the maximum rate occurs simultaneously for all drives (the worst case). For maximum aggregate performance, the measured transfer rates for drives attached to a given adapter must be below the effective SCSI adapter throughput rating.
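A sketch of that bookkeeping, again assuming the Figure 1 column layout (device name in field 1, Kbps in field 3):

# Track the maximum Kbps observed for each disk over the run.
# The first record reports since-boot averages; it rarely affects
# the maximums, but it can be discarded if desired.
iostat -d 10 360 | awk '$1 ~ /^hdisk/ {
    if ($3 + 0 > max[$1]) max[$1] = $3 + 0
}
END { for (d in max) print d, "peak Kbps:", max[d] }'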
For planning purposes, you should use 70 percent of the adapter's rated throughput (for example, 2.8 MB per second for a SCSI-1 adapter). This percentage should provide a sufficient buffer for occasional peak rates that may occur. When adding a drive, you must assume the data-transfer rate. At least you will have the collected data and the effective transfer rates to use as a basis.
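For example, with made-up numbers: four drives on one SCSI-1 adapter peaking at 900 KB per second each give a worst-case aggregate of 3600 KB per second, which exceeds the 2.8 MB-per-second planning threshold, so a second adapter would be warranted.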
Keep in mind that the SCSI adapter may be saturated if the data transfer rates over multiple intervals approach the effective SCSI adapter throughput rating. In that case, the preceding analysis is invalid.
The primary purpose of the iostat tool is to detect I/O bottlenecks by monitoring the disk utilization (%tm_act field). iostat can also be used to identify CPU problems, assist in capacity planning, and provide insight into solving I/O problems. Armed with both vmstat and iostat, you can capture the data required to identify performance problems related to CPU, memory, and I/O subsystems.