This section describes procedures that you, as the system administrator, can use to manage processes. See Monitoring and Managing Processes in the AIX 5L Version 5.2 System Management Concepts: Operating System and Devices and see also AIX 5L Version 5.2 System User's Guide: Operating System and Devices for basic information on managing your own processes; for example, restarting or stopping a process that you started or scheduling a process for a later time.
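The ps command is the primary tool for observing the processes in the system. Most of the flags of the ps command fall into one of two categories: flags that specify which processes to include in the output, and flags that specify which columns of information to display for those processes.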
The most widely useful variants of ps for system-management purposes are:
To identify the current heaviest users of CPU time, you could enter:
ps -ef | egrep -v "STIME|$LOGNAME" | sort +3 -r | head -n 15
This command lists, in descending order, the 15 most CPU-intensive processes other than those owned by you.
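The pipeline above relies on the historical sort +3 key syntax, which some sort implementations no longer accept. The following is a minimal sketch of the same idea using the -k key form. The function name top_cpu is hypothetical, and the sketch assumes that the C column is the fourth field of the ps -ef output. Unlike the original, it compares that field numerically instead of comparing the rest of the line as text.

```shell
# top_cpu USER COUNT (hypothetical helper, not part of AIX)
# Reads ps -ef style lines on stdin, drops the header and every line that
# mentions USER, then prints the COUNT lines with the largest C value.
top_cpu() {
    grep -E -v "STIME|$1" | sort -k 4,4nr | head -n "$2"
}

# Usage (not run here): ps -ef | top_cpu "$LOGNAME" 15
```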
For more specialized uses, the following two tables are intended to simplify the task of choosing ps flags by summarizing the effects of the flags.
Processes listed | -A | -a | -d | -e | -G, -g | -k | -p | -t | -U, -u | a | g | t | x |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
All processes | Y | - | - | - | - | - | - | - | - | - | Y | - | - |
Not process group leaders and not associated with a terminal | - | Y | - | - | - | - | - | - | - | - | - | - | - |
Not process group leaders | - | - | Y | - | - | - | - | - | - | - | - | - | - |
Not kernel processes | - | - | - | Y | - | - | - | - | - | - | - | - | - |
Members of specified-process groups | - | - | - | - | Y | - | - | - | - | - | - | - | - |
Kernel processes | - | - | - | - | - | Y | - | - | - | - | - | - | - |
Those specified in process number list | - | - | - | - | - | - | Y | - | - | - | - | - | - |
Those associated with tty(s) in the list | - | - | - | - | - | - | - | Y (n ttys) | - | - | - | Y (1 tty) | - |
Specified user processes | - | - | - | - | - | - | - | - | Y | - | - | - | - |
Processes with terminals | - | - | - | - | - | - | - | - | - | Y | - | - | - |
Not associated with a tty | - | - | - | - | - | - | - | - | - | - | - | - | Y |
Column | Default1 | -f | -l | -U, -u | Default2 | e | l | s | u | v |
---|---|---|---|---|---|---|---|---|---|---|
PID | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
TTY | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
TIME | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
CMD | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
USER | - | Y | - | - | - | - | - | - | Y | - |
UID | - | - | Y | Y | - | - | Y | - | - | - |
PPID | - | Y | Y | - | - | - | Y | - | - | - |
C | - | Y | Y | - | - | - | Y | - | - | - |
STIME | - | Y | - | - | - | - | - | - | Y | - |
F | - | - | Y | - | - | - | - | - | - | - |
S/STAT | - | - | Y | - | Y | Y | Y | Y | Y | Y |
PRI | - | - | Y | - | - | - | Y | - | - | - |
NI/NICE | - | - | Y | - | - | - | Y | - | - | - |
ADDR | - | - | Y | - | - | - | Y | - | - | - |
SIZE | - | - | - | - | - | - | - | - | Y | - |
SZ | - | Y | - | - | - | Y | - | Y | - | - |
WCHAN | - | - | Y | - | - | - | Y | - | - | - |
RSS | - | - | - | - | - | - | Y | - | Y | Y |
SSIZ | - | - | - | - | - | - | - | Y | - | - |
%CPU | - | - | - | - | - | - | - | - | Y | Y |
%MEM | - | - | - | - | - | - | - | - | Y | Y |
PGIN | - | - | - | - | - | - | - | - | - | Y |
LIM | - | - | - | - | - | - | - | - | - | Y |
TSIZ | - | - | - | - | - | - | - | - | - | Y |
TRS | - | - | - | - | - | - | - | - | - | Y |
Environment (following the command) | - | - | - | - | - | Y | - | - | - | - |
If ps is given with no flags or with a process-specifying flag that begins with a minus sign, the columns displayed are those shown for Default1. If the command is given with a process-specifying flag that does not begin with a minus sign, the Default2 columns are displayed. The -u or -U flag is both a process-specifying and a column-selecting flag.
The following are brief descriptions of the contents of the columns:
PID | Process ID |
TTY | Terminal or pseudo-terminal associated with the process |
TIME | Cumulative CPU time consumed, in minutes and seconds |
CMD | Command the process is running |
USER | Login name of the user to whom the process belongs |
UID | Numeric user ID of the user to whom the process belongs |
PPID | ID of the parent process of this process |
C | Recently used CPU time |
STIME | Time the process started, if it started less than 24 hours ago; otherwise, the date the process started |
F | Eight-character hexadecimal value describing the flags associated with the process (see the detailed description of the ps command) |
S/STAT | Status of the process (see the detailed description of the ps command) |
PRI | Current priority value of the process |
NI/NICE | Nice value for the process |
ADDR | Segment number of the process stack |
SIZE | (-v flag) The virtual size of the data section of the process (in kilobytes) |
SZ | (-l and l flags) The size in kilobytes of the core image of the process. |
WCHAN | Event on which the process is waiting |
RSS | Sum of the numbers of working-segment and code-segment pages in memory times 4 |
SSIZ | Size of the kernel stack |
%CPU | Percentage of time since the process started that it was using the CPU |
%MEM | Nominally, the percentage of real memory being used by the process. This measure does not correlate with any other memory statistics |
PGIN | Number of page-ins caused by page faults. Because all I/O is classified as page faults, this is primarily a measure of I/O volume |
LIM | Always xx |
TSIZ | Size of the text section of the executable file |
TRS | Number of code-segment pages times 4 |
Environment | Value of all the environment variables for the process |
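As a worked example of the RSS and TRS arithmetic, page counts can be converted to kilobytes as shown below. This sketch assumes that the factor of 4 reflects a 4 KB page size (the descriptions above give only the multiplier), and the function name pages_to_kb is hypothetical.

```shell
# pages_to_kb PAGES (hypothetical helper)
# Multiplies a page count by 4, assuming 4 KB pages.
pages_to_kb() {
    echo $(( $1 * 4 ))
}

# RSS for 150 working-segment pages plus 50 code-segment pages
pages_to_kb $(( 150 + 50 ))   # prints 800
```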
If you have identified a process that is using too much CPU time, you can reduce its effective priority by increasing its nice value with the renice command. For example:
renice +5 ProcID
The nice value of process ProcID would increase from the normal 20 of a foreground process to 25. You must have root authority to reset the nice value of process ProcID to 20. Type:
renice -5 ProcID
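A small wrapper can make the two directions of renice explicit. This is a sketch only: the function name adjust_nice is hypothetical, and it uses the -n Increment form of renice defined by POSIX instead of the +5 and -5 form shown above.

```shell
# adjust_nice PID INCREMENT (hypothetical helper)
# A positive INCREMENT lowers the effective priority. A negative one
# raises it and normally requires root authority.
adjust_nice() {
    case "$1" in
        ''|*[!0-9]*)
            echo "adjust_nice: PID must be numeric" >&2
            return 2
            ;;
    esac
    renice -n "$2" -p "$1"
}

# Usage (not run here): adjust_nice 1883 5
```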
Normally, you use the kill command to end a process. The kill command sends a signal to the designated process. Depending on the type of signal and the nature of the program that is running in the process, the process might end or might keep running. The signals you send are:
SIGTERM | (signal 15) is a request to the program to terminate. If the program has a signal handler for SIGTERM that does not actually terminate the application, this kill may have no effect. This is the default signal sent by kill. |
SIGKILL | (signal 9) is a directive to kill the process immediately. This signal cannot be caught or ignored. |
It is typically better to issue SIGTERM rather than SIGKILL. If the program has a handler for SIGTERM, it can clean up and terminate in an orderly fashion. Type:
kill -term ProcessID
(The -term could be omitted.) If the process does not respond to the SIGTERM, type:
kill -kill ProcessID
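The TERM-then-KILL sequence above can be sketched as a helper that gives the process a grace period before escalating. The function name stop_process and the default grace period of 10 seconds are assumptions made for illustration.

```shell
# stop_process PID [GRACE_SECONDS] (hypothetical helper)
# Sends SIGTERM, polls until the PID disappears or the grace period ends,
# and only then falls back to SIGKILL.
stop_process() {
    pid=$1
    grace=${2:-10}
    # Fails if there is no such process or you may not signal it.
    kill -TERM "$pid" 2>/dev/null || return 1
    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only tests that the PID still exists
        # (a zombie that has not been reaped also counts as existing).
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$(( grace - 1 ))
    done
    kill -KILL "$pid" 2>/dev/null
    return 0
}

# Usage (not run here): stop_process 1883 5
```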
You might notice occasional defunct processes, also called zombies, in your process table. These processes are no longer executing and have no system space allocated, but they still retain their PID. You can recognize a zombie process in the process table because it displays <defunct> in the CMD column. For example:
UID     PID  PPID  C    STIME    TTY  TIME CMD
. . .
lee   22392 20682  0   Jul 10      -  0:05 xclock
lee   22536 21188  0   Jul 10  pts/0  0:00 /bin/ksh
lee   22918 24334  0   Jul 10  pts/1  0:00 /bin/ksh
lee   23526 22536 22                  0:00 <defunct>
lee   24334 20682  0   Jul 10      ?  0:00 aixterm
lee   24700     1  0   Jul 16      ?  0:00 aixterm
root  25394 26792  2   Jul 16  pts/2  0:00 ksh
lee   26070 24700  0   Jul 16  pts/3  0:00 /bin/ksh
lee   26792 20082  0   Jul 10  pts/2  0:00 /bin/ksh
root  27024 25394  2 17:10:44  pts/2  0:00 ps -ef
Zombie processes continue to exist in the process table until the parent process dies or the system is shut down and restarted. In the example shown above, the parent process (PPID) is the ksh command. When the Korn shell is exited, the defunct process is removed from the process table.
Sometimes a number of these defunct processes collect in your process table because an application has forked several child processes and has not exited. If this becomes a problem, the simplest solution is to modify the application so its sigaction subroutine ignores the SIGCHLD signal. For more information, see the sigaction subroutine in AIX 5L Version 5.2 Technical Reference: Base Operating System and Extensions Volume 2.
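Before changing an application, it helps to know which parent is leaving zombies behind. The following sketch extracts the PID and PPID of every defunct entry from ps -ef style output. The function name list_zombies is hypothetical.

```shell
# list_zombies (hypothetical helper)
# Reads ps -ef style lines on stdin and prints "PID PPID" for each
# defunct entry, so the parent that has not reaped its child is visible.
list_zombies() {
    awk '/<defunct>/ { print $2, $3 }'
}

# Usage (not run here): ps -ef | list_zombies
```

Applied to the listing shown earlier, this reports process 23526 with parent 22536, the Korn shell.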
On multiprocessor systems, you can bind a process to a processor or unbind a previously bound process by using SMIT or the command line, as summarized in the following table.
You must have root user authority to bind or unbind a process you do not own.
Task | SMIT Fast Path | Command or File |
---|---|---|
Binding a Process | smit bindproc | bindprocessor -q |
Unbinding a Process | smit ubindproc | bindprocessor -u |
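On the command line, the general forms are bindprocessor -q to list the available processor numbers, bindprocessor ProcessID ProcessorNum to bind a process, and bindprocessor -u ProcessID to unbind it. The sketch below wraps the bind step. The name bind_to_cpu is hypothetical, and the guard exists only so that the function fails cleanly on systems that lack the AIX command.

```shell
# bind_to_cpu PID PROCESSOR (hypothetical helper)
# Binds PID to the given processor number, or returns 127 when the
# bindprocessor command is not installed.
bind_to_cpu() {
    if ! command -v bindprocessor >/dev/null 2>&1; then
        echo "bind_to_cpu: bindprocessor not found (AIX only)" >&2
        return 127
    fi
    bindprocessor "$1" "$2"
}

# Usage (not run here):
#   bindprocessor -q        # list the available processors
#   bind_to_cpu 1883 0      # bind process 1883 to processor 0
#   bindprocessor -u 1883   # unbind it again
```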
Stalled or unwanted processes can cause problems with your terminal. Some problems produce messages on your screen that give information about possible causes.
To perform the following procedures, you must have either a second terminal, a modem, or a network login. If you do not have any of these, fix the terminal problem by rebooting your machine.
Choose the appropriate procedure for fixing your terminal problem:
Identify and stop stalled or unwanted processes by doing the following:
ps -ef | pg
The ps command shows the process status. The -e flag writes information about all processes (except kernel processes), and the -f flag generates a full listing of processes, including the command name and parameters that were in effect when the process was created. The pg command limits output to a single page at a time, so information does not quickly scroll off the screen.
Suspicious processes include system or user processes that use up excessive amounts of a system resource such as CPU or disk space. System processes such as sendmail, routed, and lpd frequently become runaways. Use the ps -u command to check CPU usage.
who
The who command displays information about all users currently on this system, such as login name, workstation name, date, and time of login.
kill 1883
The kill command sends a signal to a running process. To stop a process, specify the process ID (PID), which is 1883 in this example. Use the ps command to determine the PID number of commands.
/u/bin1/prog1 &
The & signals that you want this process to run in the background. For a background process, the shell does not wait for the command to complete before returning the shell prompt. When a process requires more than a few seconds to complete, run the command in the background by typing an & at the end of the command line. Jobs running in the background appear in the normal output of the ps command.
renice 20 1883
The renice command alters the scheduling priority of one or more running processes. The higher the number, the lower the priority, with 20 being the lowest priority.
In the previous example, renice reschedules process number 1883 to the lowest priority. It will run when there is a small amount of unused processor time available.
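The identification step of this procedure can be partly automated. The sketch below filters a three-column listing for processes at or above a CPU threshold. The function name busy_processes and the threshold of 50 are hypothetical, and the usage line assumes a ps that accepts the POSIX -o flag with the pid, pcpu, and comm column names.

```shell
# busy_processes THRESHOLD (hypothetical helper)
# Reads lines of "PID %CPU COMMAND" on stdin, skips the header line, and
# prints the lines whose %CPU is at or above THRESHOLD.
busy_processes() {
    awk -v limit="$1" 'NR > 1 && $2 + 0 >= limit + 0 { print }'
}

# Usage (not run here): ps -e -o pid,pcpu,comm | busy_processes 50
```

Each PID that is reported can then be stopped with kill or lowered in priority with renice, as described above.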
Respond to and recover from screen messages by doing the following:
setsenv
The setsenv command displays the protected state environment as it was when you logged in.
Determine if the DISPLAY variable has been set. In the following example, the DISPLAY variable does not appear, which indicates that the DISPLAY variable is not set to a specific value.
SYSENVIRON:
NAME=casey
TTY=/dev/pts/5
LOGNAME=casey
LOGIN=casey
OR
DISPLAY=bastet:0
export DISPLAY
If not specifically set, the DISPLAY environment variable defaults to unix:0 (the console). The value of the variable is in the format name:number where name is the host name of a particular machine, and number is the X server number on the named system.
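The name:number format can be taken apart with shell parameter expansion, which is a quick way to check which host and X server a session points at. The function name parse_display is hypothetical. It falls back to unix:0 when given an empty value, matching the default described above.

```shell
# parse_display VALUE (hypothetical helper)
# Splits a name:number string into the host name and the X server number,
# falling back to unix:0 when VALUE is empty.
parse_display() {
    value=${1:-unix:0}
    echo "${value%%:*} ${value##*:}"
}

parse_display "bastet:0"   # prints: bastet 0
```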
stty sane
The stty sane command restores the "sanity" of the terminal drivers. The command outputs an appropriate terminal resetting code from the /etc/termcap file (or /usr/share/lib/terminfo if available).
^J stty sane ^J
The ^J represents the Ctrl-J key sequence.
The use of multiple run queues increases the processor affinity of threads, but there is a special situation where you might want to counteract this effect. When there is only one run queue, a thread that has been awakened (the waking thread) by another running thread (the waker thread) would normally be able to immediately use the CPU on which the waker thread was running. With multiple run queues, the waking thread might be on the run queue of another CPU, which cannot notice the waking thread until the next scheduling decision. This can result in a delay of up to 10 ms.
This is similar to scenarios in earlier releases of this operating system that might have occurred when using the bindprocessor option. If all CPUs are constantly busy, and there are a number of interdependent threads waking up, there are two options available.