
Performance Management Guide


Identifying CPU-Intensive Programs

Two standard tools locate the processes that dominate CPU usage: the ps command and the acctcom command. A third tool, the topas monitor, is described in Using the topas Monitor.

Using the ps Command

The ps command is a flexible tool for identifying the programs that are running on the system and the resources they are using. It displays statistics and status information about processes on the system, such as process or thread ID, I/O activity, and CPU and memory utilization. This chapter discusses only the options and output fields that are relevant to CPU use.

Three of the possible ps output columns report CPU use, each in a different way.

Column
Value Is:

C
Recently used CPU time for the process (in units of clock ticks).

TIME
Total CPU time used by the process since it started (in units of minutes and seconds).

%CPU
Total CPU time used by the process since it started, divided by the elapsed time since the process started. This is a measure of the CPU dependence of the program.

CPU Intensive

The following command pipeline:

# ps -ef | egrep -v "STIME|$LOGNAME" | sort +3 -r | head -n 15

is a tool for focusing on the user processes with the highest recent CPU use. The egrep -v filter drops the header line and every line that contains the current login name. In the following output, the header line has been reinserted for clarity:

     UID   PID  PPID   C    STIME    TTY  TIME CMD
    mary 45742 54702 120 15:19:05 pts/29  0:02 ./looper
    root 52122     1  11 15:32:33 pts/31 58:39 xhogger
    root  4250     1   3 15:32:33 pts/31 26:03 xmconsole allcon
    root 38812  4250   1 15:32:34 pts/31  8:58 xmconstats 0 3 30
    root 27036  6864   1 15:18:35      -  0:00 rlogind
    root 47418 25926   0 17:04:26      -  0:00 coelogin <d29dbms:0>
    bick 37652 43538   0 16:58:40  pts/4  0:00 /bin/ksh
    bick 43538     1   0 16:58:38      -  0:07 aixterm
     luc 60062 27036   0 15:18:35 pts/18  0:00 -ksh

Recent CPU use is the fourth column (C). The looping program's process easily heads the list. Note that the C value may understate the looping process's CPU usage, because the scheduler stops counting at 120.
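
As a hedged variant of the pipeline above, the same ranking can be sketched with the POSIX -k key syntax and an explicitly numeric sort on the C column. The function name top_by_c is hypothetical, and unlike the original pipeline this sketch removes only the header line; it does not filter out the current user's processes.

```shell
# Sketch: rank ps -ef output (read from stdin) by the C column, highest first.
# top_by_c is a hypothetical helper name, not part of ps.
top_by_c() {
    # Drop the header line, sort numerically (descending) on field 4 (C),
    # and keep the first N lines (default 15).
    awk 'NR > 1' | sort -k4,4nr | head -n "${1:-15}"
}
```

A typical invocation would be: ps -ef | top_by_c 15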

CPU Time Ratio

The ps command, run periodically, displays the CPU time under the TIME column and the ratio of CPU time to real time under the %CPU column. Look for the processes that dominate usage. The au and v options give similar information on user processes. The options aux and vg display both user and system processes.

The following example is taken from a four-way SMP system:

# ps au
USER       PID %CPU %MEM   SZ  RSS    TTY STAT    STIME TIME COMMAND
root     19048 24.6  0.0   28   44  pts/1 A    13:53:00  2:16 /tmp/cpubound
root     19388  0.0  0.0  372  460  pts/1 A      Feb 20  0:02 -ksh
root     15348  0.0  0.0  372  460  pts/4 A      Feb 20  0:01 -ksh
root     20418  0.0  0.0  368  452  pts/3 A      Feb 20  0:01 -ksh
root     16178  0.0  0.0  292  364      0 A      Feb 19  0:00 /usr/sbin/getty
root     16780  0.0  0.0  364  392  pts/2 A      Feb 19  0:00 -ksh
root     18516  0.0  0.0  360  412  pts/0 A      Feb 20  0:00 -ksh
root     15746  0.0  0.0  212  268  pts/1 A    13:55:18  0:00 ps au

The %CPU is the percentage of CPU time that has been allocated to that process since the process was started. It is calculated as follows:

(process CPU time / process duration) * 100

Imagine two processes: The first starts and runs for five seconds, but does not finish; then the second starts and runs for five seconds, but does not finish. The ps command would now show a %CPU value of 50 for the first process (five seconds of CPU time in 10 seconds of elapsed time) and 100 for the second (five seconds of CPU time in five seconds of elapsed time).

On an SMP system, this value is divided by the number of available CPUs. This is why the %CPU value for the cpubound process in the previous example never exceeds 25: the example was run on a four-way system. The cpubound process uses 100 percent of one processor, but the %CPU value is divided by the number of available CPUs.
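
The arithmetic above can be checked with a small sketch. The function name pct_cpu and its argument order are hypothetical; it only reproduces the formula and the SMP division described in the text, and is not how ps itself is implemented.

```shell
# Sketch of the %CPU arithmetic described above (hypothetical helper).
# $1 = CPU seconds used, $2 = elapsed seconds, $3 = number of CPUs (default 1)
pct_cpu() {
    awk -v cpu="$1" -v elapsed="$2" -v ncpu="${3:-1}" 'BEGIN {
        # Refuse to divide by zero or by a nonsensical CPU count.
        if (elapsed <= 0 || ncpu <= 0) exit 1
        printf "%.1f\n", (cpu / elapsed) * 100 / ncpu
    }'
}

pct_cpu 5 10      # first process in the example:  prints 50.0
pct_cpu 5 5       # second process in the example: prints 100.0
pct_cpu 60 60 4   # one fully busy CPU on a four-way system: prints 25.0
```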

The THREAD Option

The ps -mo THREAD command displays threads and the CPUs that threads or processes are bound to. The following is an example:

# ps -mo THREAD
USER PID   PPID  TID   ST CP PRI SC WCHAN F      TT    BND COMMAND
root 20918 20660 -     A  0  60  1  -     240001 pts/1 -   -ksh
-    -     -     20005 S  0  60  1  -     400    -     -   -

The TID column shows the thread ID. The BND column shows the processor that a process or thread is bound to; a hyphen (-) means that it is not bound.
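
To pick out only the bound entries from a long listing, a filter along the following lines may help. It is a sketch under one assumption: every column before COMMAND is non-blank (with - as the placeholder, as in the sample above), so that awk field positions line up with the header. The name bound_threads is hypothetical.

```shell
# Sketch: keep the header plus any line whose BND column is not "-".
# Reads ps -mo THREAD output from stdin; bound_threads is a hypothetical name.
bound_threads() {
    awk '
        NR == 1 {
            # Locate the BND column by name instead of hard-coding its position.
            for (i = 1; i <= NF; i++) if ($i == "BND") col = i
            print
            next
        }
        col && $col != "-"
    '
}
```

A typical invocation would be: ps -mo THREAD | bound_threads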

It is normal to see a process named kproc (PID of 516 in operating system version 4) using CPU time. When there are no threads that can be run during a time slice, the scheduler assigns the CPU time for that time slice to this kernel process (kproc), which is known as the idle or wait kproc. SMP systems have an idle kproc for each processor. With operating system versions later than AIX 4.3.3, the name shown in the output is wait.

For complete details about the ps command, see the AIX 5L Version 5.1 Commands Reference.

Using the acctcom Command

The acctcom command displays historical data on CPU usage if the accounting system is activated. Starting the accounting system puts a measurable overhead on the system. Therefore, activate accounting only if absolutely needed. To activate the accounting system, do the following:

  1. Create an empty accounting file:

    # touch acctfile
    
  2. Turn on accounting:

    # /usr/sbin/acct/accton acctfile
    
  3. Allow accounting to run for a while and then turn off accounting:

    # /usr/sbin/acct/accton
    
  4. Display what accounting captured, as follows:

    # /usr/sbin/acct/acctcom acctfile
    COMMAND                      START    END          REAL      CPU     MEAN
    NAME       USER     TTYNAME  TIME     TIME       (SECS)   (SECS)  SIZE(K)
    #accton    root     pts/2   19:57:18 19:57:18     0.02     0.02   184.00
    #ps        root     pts/2   19:57:19 19:57:19     0.19     0.17    35.00
    #ls        root     pts/2   19:57:20 19:57:20     0.09     0.03   109.00
    #ps        root     pts/2   19:57:22 19:57:22     0.19     0.17    34.00
    #accton    root     pts/2   20:04:17 20:04:17     0.00     0.00     0.00
    #who       root     pts/2   20:04:19 20:04:19     0.02     0.02     0.00
    

If you reuse the same file, you can see when the newer processes were started by looking for the accton process (this was the process used to turn off accounting the first time).
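
When the accounting file holds many records, it can help to total the CPU column per command. The following is a minimal sketch that assumes the eight-column layout shown in the listing above (CPU seconds in the seventh field); the name cpu_by_command is hypothetical, and command names are kept exactly as acctcom prints them.

```shell
# Sketch: sum the CPU (SECS) column per command name from acctcom output
# read on stdin, largest total first. cpu_by_command is a hypothetical name.
cpu_by_command() {
    awk '
        # Only data lines have a number in field 7; header lines are skipped.
        $7 ~ /^[0-9.]+$/ { cpu[$1] += $7 }
        END { for (c in cpu) printf "%s %.2f\n", c, cpu[c] }
    ' | sort -k2,2nr
}
```

A typical invocation would be: /usr/sbin/acct/acctcom acctfile | cpu_by_command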

