Debugging Runaway Processes or High System Loads

Runaway processes, or a high system load in general, can be symptomatic
of a problem with how an application is coded. This procedure is only
intended to be used when "ps -efl" does not reveal a process with high
CPU usage.

Process Accounting - Tells you which processes are being run, how much
CPU time is being consumed, and the starting and ending time of the
commands. Process accounting is useful when processes are being created
and destroyed over a fairly short interval. The appropriate commands are

  * lastcomm

    Gives the commands with their execution time, user ID, terminal
    name and start time. The output is in reverse order: the most
    recently executed commands come first. The output can be examined
    by paging it through "more" or "pg". Commands which appear
    frequently and for a single user can indicate a problem with an
    application which that user is executing.

  * acctcom

    Provides the same information as "lastcomm", but can select records
    by command name, terminal name, user ID, and a number of other
    criteria. You may select records by starting or ending time, and
    read from the beginning of the data file going forward or from the
    end of the data file going backward.

If the "lastcomm" output suggests that a particular user is performing
a large number of operations, use "acctcom -u " followed by the user
name to view the processes which that user is creating. If it shows
that a particular command is being executed frequently or is consuming
a large amount of resource, use "acctcom -n " followed by the command
name.

Before these commands can report anything, system accounting must have
been enabled with the "/usr/lib/acct/turnacct on" command. If the file
/usr/adm/pacct exists and is increasing in size, system process
accounting is currently active.

Kernel Function Trace - This tells you about all of the system calls
and scheduler activity on the system.
Process creation and destruction are displayed using the trace events
(also known as hookwords) for fork, exec, and exit. The hookword IDs
are

  * HKWD_SYSC_EXECVE     134
  * HKWD_SYSC__EXIT      135
  * HKWD_SYSC_FORK       139

These definitions are given in /usr/include/sys/trchkid.h and all
values are in hexadecimal.

You must also include a number of "administrative" hookwords so that
the trace subsystem can properly track which process is active. Those
hookword IDs are

  * HKWD_TRACE_TRCON     001
  * HKWD_TRACE_TRCOFF    002
  * HKWD_KERN_DISPATCH   106
  * HKWD_KERN_IDLE       10C

All four of these IDs are required. In addition, even if you are not
tracing the fork() and exec() functions, you must still include IDs 134
and 139. If you are interested in other kernel functions, you may add
their hookword IDs to your trace command request.

There are two ways to run tracing, interactive and non-interactive. In
interactive mode you control tracing from within the trace command: the
"trcon" subcommand begins tracing, "trcoff" stops tracing, and "q"
exits from the interactive session. In non-interactive mode (also
called "asynchronous mode"), you use the "trcon" command to begin
logging trace events, "trcoff" to stop logging trace events, and the
"trcstop" command to cause the trace daemon to exit.

Security Audit Events - This tells you about system functions which
have security implications. Because process creation and destruction
are relevant to system security, you can use the PROC_Create,
PROC_Execute and PROC_Delete auditing events to monitor process
creation. For more information on setting up security auditing, refer
to the topic "Auditing -- Periodic and System Management" in
InfoExplorer. The section "Work with the audit system" will help you
configure the system properly.

You will want to create an "audit class" called "proc" which contains
only the events you are concerned with.
To do this, add the following line to the "classes" stanza in the
/etc/security/audit/config file, after the "kernel" line.

    proc = PROC_Create,PROC_Delete,PROC_Execute

You then add this class to each user you wish to track. This can be
done from the SMIT fastpath "smit chuser"; the field to change is
"AUDIT Classes", and "proc" must be added for every user being tracked.
You may then retrieve and examine the audit data as described in the
InfoExplorer article mentioned earlier.

-- 
John F. Haugh II      | Quality is ... being able | MaBellNet: (512) 823-8817
SneakerNet: 042/2F068 | to rely on your customer  | VNET: HAUGH at AUSVM8
[ DoF #17 ]           | for your next paycheck.   | Disc: I speak 4 me, !IBM.
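
Appended illustration (not part of the original procedure): the
"lastcomm" screening described under Process Accounting, looking for
commands which appear frequently for a single user, can be roughly
automated. The Python sketch below is only one way to do it. It assumes
the common "lastcomm" column order (command, optional flags, user,
terminal, CPU time, the word "secs", start time), which may differ on
your system. The sample records, the user names in them, and the
tally() helper are all made up for the example.

```python
from collections import Counter

# Hypothetical records in the assumed lastcomm layout:
# command, optional flags, user, terminal, CPU seconds, "secs", start time.
SAMPLE = [
    "sh        S    jdoe   pts/3   0.02 secs Tue Mar  3 14:02",
    "grep           jdoe   pts/3   0.01 secs Tue Mar  3 14:02",
    "sh        S    jdoe   pts/3   0.03 secs Tue Mar  3 14:01",
    "ls             root   pts/0   0.01 secs Tue Mar  3 14:01",
    "sh        S    jdoe   pts/3   0.02 secs Tue Mar  3 14:00",
]


def tally(lines):
    """Count runs and total CPU seconds per (user, command) pair.

    Fields are located relative to the literal "secs" token, so the
    optional flags column does not shift the user or CPU columns.
    Lines that do not match the assumed layout are skipped.
    """
    counts = Counter()
    cpu = {}
    for line in lines:
        fields = line.split()
        if "secs" not in fields:
            continue
        i = fields.index("secs")
        if i < 4:  # need command, user, terminal and CPU time before "secs"
            continue
        try:
            seconds = float(fields[i - 1])
        except ValueError:
            continue
        key = (fields[i - 3], fields[0])  # (user, command)
        counts[key] += 1
        cpu[key] = cpu.get(key, 0.0) + seconds
    return counts, cpu


if __name__ == "__main__":
    counts, cpu = tally(SAMPLE)
    # Most frequently executed (user, command) pairs first.
    for (user, command), runs in counts.most_common():
        print(f"{user:8} {command:10} runs={runs:4d} cpu={cpu[(user, command)]:.2f}s")
```

To apply this to real data, replace SAMPLE with the lines captured from
"lastcomm". A pair that dominates the list is a candidate for closer
inspection with "acctcom".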