Performance tuning is highly system- and application-dependent; however, there are some general tuning methods that work well on almost any AIX system.
This appendix contains actual test cases from AIX users. If you have a similar scenario in your own environment, use the information in these test cases to help you. Each case describes the type of system and the problem that was encountered. It goes on to explain how to test for the particular performance problem and how to resolve the problem if it is detected.
This appendix contains the following test cases:
Improve NFS client large file writing performance
Improve Tivoli Storage Manager (TSM) backup performance
Streamline security subroutines with password indexing
Writing large, sequential files over an NFS-mounted file system can cause a severe decrease in the file transfer rate to the NFS server. In this scenario, you identify whether this situation exists and, if it does, follow the steps provided to remedy the problem.
Assume the system is running an application that sequentially writes very large files (larger than the amount of physical memory on the machine) to an NFS-mounted file system. The file system is mounted using NFS V3. The NFS server and client communicate over a 100 Mb per second (Fast Ethernet) network. When sequentially writing a small file, the throughput averages around 10 MB per second. However, when writing a very large file, the throughput average drops to well under 1 MB per second.
The application's large file write is filling all of the client's memory, causing the rate of transfer to the NFS server to decrease. This happens because the client AIX system must invoke the lrud kproc (the page-replacement daemon) to release some pages in memory to accommodate the next set of pages being written by the application.
Use either of the following methods to detect if you are experiencing this problem:
nfsstat
Check the nfsstat command output. If the number of V3 commit calls is increasing nearly linearly with the number of V3 write calls, it is likely that this is the problem (an illustrative check is shown after this list).
topas -i 1
Check the topas command output while the large file is being written. Heavy processor activity by the lrud kproc indicates that the client is having to free memory pages to keep up with the writes.
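As an illustrative nfsstat check (the exact layout of the output varies by AIX level, so treat this as a sketch), sample the client-side call counts before and after the large file is written and compare how much the V3 write and commit counters grow:
# nfsstat -c          (note the Version 3 write and commit call counts)
...run the application that writes the large file...
# nfsstat -c          (if commit grew nearly as fast as write, this problem is present)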
If either of the methods listed indicates that the problem exists, the solution is to use the mount command option called combehind when mounting the NFS server file system on the client system. Do the following:
umount /mnt    (assumes /mnt is the local mount point)
mount -o combehind server_hostname:/remote_mount_point /mnt
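To confirm that the option is in effect after remounting, you can list the options recorded for the mounted file system. A quick check, assuming /mnt is the local mount point (on most AIX levels, nfsstat -m also reports the flags in use for each NFS mount):
# mount | grep /mnt
# nfsstat -m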
In this scenario, you will identify and correct a system performance problem that could be increasing your backup time when using Tivoli Storage Manager (TSM).
The scenario environment consists of two systems: one TSM client machine used as a database server, and a TSM backup server. The client machine is connected to the TSM backup server with point-to-point Gigabit Ethernet running at 1 Gb per second. The backint application, which is a multi-threaded TSM backup program for SAP database environments, runs on the client using the TSM backup APIs.
After upgrading both the TSM server and client machines from AIX 4.3.3, Maintenance Level 04 to Maintenance Level 09, and upgrading from TSM Version 3 to TSM Version 4, the time to back up the database on the client using the backint application increased from 7 hours to 15 hours.
To determine the cause of the performance degradation and improve TSM backup performance, do the following:
# ftp <hostname_of_TSM_server>
ftp> bin
ftp> put "| dd if=/dev/zero bs=32k count=10000" /dev/null
In this scenario, the ftp and dd command output revealed a network throughput of about 81 MB per second, which indicates that the network itself is not an issue. (If the network had been at fault, the netstat tool would have been the next logical step.)
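For reference, this test pushes 10000 blocks of 32 KB (about 320 MB) of zeroes from the client to /dev/null on the server, so dividing that volume by the elapsed time reported by the ftp and dd commands gives the throughput; at 81 MB per second, the transfer completes in roughly 4 seconds. A healthy result here rules out the network and turns the investigation back to the client and server systems themselves.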
In this scenario, the relevant vmtune parameter is maxpgahead (the -R flag), because the files were stored in a journaled file system (JFS) and not an enhanced journaled file system (JFS2). The default for this parameter is 8, which means that, at most, eight 4 KB blocks (32 KB total) of data are read into memory with a single I/O operation. Because the backint application asks for 128 KB on each read() call, use the following command to tell AIX to read as much as 128 KB in a single I/O when sequential read() operations are being done:
vmtune -R 32
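Because pages are 4 KB, -R 32 sets maxpgahead to 32 x 4 KB = 128 KB, matching the size of each backint read() request. Note that on these AIX levels the vmtune command is shipped in /usr/samples/kernel (part of the bos.adt.samples fileset) and its settings do not persist across a reboot, so the command is typically added to a startup mechanism such as an /etc/inittab entry if the change is to be permanent.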
iostat 1
vmstat 1
In this example, the list of available pages in AIX (the freelist, reported in the fre column of the vmstat command output) was approaching zero at times. AIX will not function smoothly if the freelist gets too small.
Tune AIX by running the following command:
vmtune -f 1440 -F 1824
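The -f flag sets minfree and the -F flag sets maxfree, the thresholds (in 4 KB pages) at which AIX starts and stops replenishing the freelist; raising them from the defaults makes page replacement begin before the freelist approaches zero. Here, 1440 pages is roughly 5.6 MB and 1824 pages roughly 7.1 MB, and the gap of 384 pages comfortably exceeds the 32-page maxpgahead value set above, following the usual guideline that maxfree minus minfree should be at least maxpgahead.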
Overall backup time has now decreased to under 7 hours.
For more information, see the iostat, netstat, ps, vmstat, and vmtune commands.
In this scenario, you will verify that a high number of processes are calling security subroutines, and then reduce the amount of processor time used for security subroutines by indexing the password file.
The scenario environment consists of one 2-way system used as a mail server. Mail is received remotely through POP3 (Post Office Protocol Version 3) and by local mail clients that log in directly on the server. Mail is sent by using the sendmail daemon. Because of the nature of a mail server, a high number of security subroutines are called for user authentication. After moving from a uniprocessor machine to the 2-way system, the load average reported by the uptime command rose to 200, compared to less than 1 on the uniprocessor machine.
To determine the cause of the performance degradation and reduce the amount of processor time used for security subroutines, do the following:
topas -i 1
The topas command output in our scenario indicated that the majority of processor time, about 90%, was spent in user mode, and the processes consuming the most processor time were sendmail and pop3d. (Had the majority of processor usage been kernel time, a kernel trace would be the appropriate tool to continue.)
tprof -ske -x "sleep 60"
The tprof command lists the names of the subroutines called out of shared libraries, sorted by the number of processor ticks spent in each subroutine. In this case, the tprof data showed that most of the user-mode processor time was spent in the libc.a system library in the security subroutines (and those subroutines called by them). (Had the tprof command shown that user-mode processor time was spent mostly in application code, then application debugging and profiling would have been necessary.)
To avoid scanning the password file sequentially on every authentication call, create an indexed version of the password file by running the following command:
mkpasswd -f
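On the AIX levels in this scenario, the -f flag builds index files alongside the user database files (typically with an .idx suffix, for example /etc/passwd.nm.idx and /etc/security/passwd.idx; check the mkpasswd documentation for your level for the exact names), so listing them confirms that the indexes were created:
ls -l /etc/passwd*.idx /etc/security/*.idx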
By using an indexed password file, the load average for this scenario was reduced from a value of 200 to 0.6.