The trace facility helps you isolate system problems by monitoring selected system events. Events that can be monitored include: entry and exit to selected subroutines, kernel routines, kernel extension routines, and interrupt handlers. When the trace facility is active, information about these events is recorded in a system trace log file. The trace facility includes commands for activating and controlling traces and generating trace reports. Applications and kernel extensions can use several subroutines to record additional events.
For more information on the trace facility, refer to the following:
The trace facility is in the bos.sysmgt.trace file set. To see if this file set is installed, use
lslpp -l | grep bos.sysmgt.trace
If a line is produced which includes bos.sysmgt.trace then the file set is installed, otherwise you must install it.
The following topics are discussed:
The AIX system trace facility records trace events which can be formatted later by the trace report command. Trace events are compiled into kernel or application code, but are only traced if tracing is active.
Tracing is activated with the trace command or the trcstart subroutine. Tracing is stopped with either the trcstop command or the trcstop subroutine. While active, tracing can be suspended or resumed with the trcoff and trcon commands, or the trcoff and trcon subroutines.
Once the trace has been stopped with trcstop, a trace report can then be generated with the trcrpt command. This command uses a template file, /etc/trcfmt, to know how to format the entries. The templates are installed with the trcupdate command. For a discussion of the templates, see the trcupdate command.
The trace command starts the tracing of system events and controls the trace buffer and log file sizes. This command is documented in the article on the trace daemon in the Command's Reference.
There are three methods of gathering trace data.
You will usually want to run the trace command asynchronously, in other words, you want to enter the trace command and then continue with other work. To run the trace asynchronously, use the -a flag. You must then stop the trace with the trcstop command.
It is usually desirable to limit the information that is traced. Use the -j events or -k events flags to specify a set of events to include (-j) or exclude (-k). Note, however, that to be able to display the program names associated with trace hooks, certain hooks must be enabled. These are given in the trace command article. These hooks may also be enabled by selecting the tidhk trace hook group from SMIT.
For example, if you want to trace the mbuf hook, 254, and show program names also, you need to run trace as follows:
trace -aj 106,10c,134,139,465,254
trcrpt -O exec=on
The hooks needed for showing the program name may change from release to release, so please check the trace command article. The -O exec=on trcrpt option shows the program names, see the trcrpt command.
It is often desirable to specify the buffer size and the maximum log file size. The trace buffers require real memory to be available so that no paging is necessary to record trace hooks. The log file will fill to the maximum size specified, and then wrap around, discarding the oldest trace data. The -T size and -L size flags specify the size of the memory buffers and the maximum size of the trace data in the log file in bytes.
Note: Because the trace facility pins the data collection buffers, making this amount of memory unavailable to the rest of the system, the trace facility can impact performance in a memory-constrained environment. If the application being monitored is not memory-constrained, or if the percentage of memory consumed by the trace routine is small compared to what is available in the system, the impact of trace "stolen" memory should be small. If you do not specify a value, trace uses the default sizes.
It should also be noted that the buffer memory is allocated from the kernel heap by default. If there isn't enough available memory, it is allocated from separate segments. On a busy system it may be desirable to allocate the buffers from separate memory segments to avoid constraining the kernel heap. This is done with the -B flag on the trace command.
Tracing may also be controlled from an application. See the trcstart, and trcstop articles.
There are two types of trace data.
See the trcrpt command article for a full description of trcrpt. This command is used to generate a readable trace report from the log file generated by the trace command. By default the command formats data from the default log file, /var/adm/ras/trcfile. The trcrpt output is written to standard output.
To generate a trace report from the default log file, and write it to /tmp/rptout, enter
To generate a trace report from the log file /tmp/tlog to /tmp/rptout, which includes program names and system call names, use
trcrpt -O exec=on,svc=on /tmp/tlog >/tmp/rptout
If trace was active when the system takes a dump, the trace can usually be retrieved with the trcdead command. To avoid overwriting the default trace log file on the current system, use the -o output-file option.
trcdead /o /tmp/tlog /var/adm/ras/vmcore.0
creates a trace log file /tmp/tlog which may then be formatted with
The following commands are part
of the trace facility:
|trace||Starts the tracing of system events. With this command, you can control the size and manage the trace log file as well as the internal trace buffers that collect trace event data.|
|trcdead||Extracts trace information from a system dump. If the system halts while the trace facilities are active, the contents of the internal trace buffers are captured. This command extracts the trace event data from the dump and writes it to the trace log file.|
|trcnm||Generates a kernel name list used by the trcrpt command. A kernel name list is composed of a symbol table and a loader symbol table of an object file. The trcrpt command ues the kernel name list file to inerpret addresses when formatting a report from a trace log file.|
|trcrpt||Formats reports of trace event data contained in the trace log file. You can specify the events to be included (or omitted) in the report, as well as determine the presentation of the output with this command. The trcrpt command uses the trace formatting templates stored in the /etc/trcfmt file to determine how to interpret the data recorded for each event.|
|trcstop||Stops the tracing of system events.|
|trcupdate||Updates the trace formatting templates stored in the /etc/trcfmt file. When you add applications or kernel extensions that record trace events, templates for these events must be added to the /etc/trcfmt file. The trcrpt command will use the trace formatting templates to determine how to interpret the data recorded for each event. Software products that record events usually run the trcupdate command as part of the installation process.|
The following calls and
subroutines are part of the trace facility:
|trcgen, trcgent||Records trace events of more than five words of data. The trcgen subroutine may be used to record an event as part of the system event trace (trace channel 0) or to record an event on a generic trace channel (channels 1 through 7). You specify the channel number in a subroutine parameter when you record the trace event. The trcgent subroutine appends a time stamp to the event data.|
|trchook, utrchook||Records trace events of up to five words of data. These subroutines may be used to record an event as part of the system event trace (trace channel 0). The utrchook subroutine uses a special FAST-SVC path to improve performance and should be used by programs at the user (application) level.|
|trcgenk, trcgenkt||Records trace events of more than five words of data. The trcgenk subroutine may be used to record an event as part of the system event trace (trace channel 0) or to record an event on a generic trace channel (channels 1 through 7). You specify the channel number in a subroutine parameter when you record the trace event. The trcgenkt subroutine appends a time stamp to the event data.|
|trcoff||Suspends the collection of trace data on either the system event trace channel (channel 0) or a generic trace channel (1 through 7). The trace channel remains active and trace data collection can be resumed by using the trcon subroutine.|
|trcon||Starts the collection of trace data on a trace channel. The channel may be either the system event trace channel (0) or a generic channel (1 through 7). The trace channel, however, must have been previously activated by using the trace command or the trcstart subroutine. You can suspend trace data collection by using the trcoff subroutine.|
|trcstart||Requests a generic trace channel. This subroutine activates a generic trace channel and returns the channel number to the calling application to use in recording trace events using the trcgen, trcgent, trcgenk, and trcgenkt subroutines.|
|trcstop||Frees and deactivates a generic trace channel.|
|/etc/trcfmt||Contains the trace formatting templates used by the trcrpt command to determine how to interpret the data recorded for each event.|
|/var/adm/ras/trcfile||Contains the default trace log file. The trace command allows you to specify a different trace log file.|
|/usr/include/sys/trchkid.h||Contains trace hook identifier definitions.|
|/usr/include/sys/trcmacros.h||Contains commonly used macros for recording trace events.|
The data recorded for each traced event consist of a word containing the trace hook identifier and the hook type followed by a variable number of words of trace data optionally followed by a time stamp. The word containing the trace hook identifier and the hook type is called the hook word. The remaining two bytes of the hook word are called hook data and are available for recording event data.
A trace hook identifier is a three-digit hexadecimal number that identifies an event being traced. You specify the trace hook identifier in the first twelve bits of the hook word. Trace hook identifiers are defined in the /usr/include/sys/trchkid.h file. The values 0x010 through 0x0FF are available for use by user applications. All other values are reserved for system use. The currently defined trace hook identifiers can be listed using the trcrpt -j command.
The hook type identifies the composition of the event data and is user-specified. The twelfth through the sixteenth bits of the hook word constitute the hook type. For more information on hook types, refer to the trcgen, trcgenk, and trchook subroutines.
The trace facility supports up to eight active trace sessions at a time. Each trace session uses a channel of the multiplexed trace special file, /dev/systrace. Channel 0 is used by the trace facility to record system events. The tracing of system events is started and stopped by the trace and trcstop commands. Channels 1 through 7 are referred to as generic trace channels and may be used by subsystems for other types of tracing such as data link tracing.
To implement tracing using the generic trace channels of the trace facility, a subsystem calls the trcstart subroutine to activate a trace channel and to determine the channel number. The subsystem modules can then record trace events using the trcgen, trcgent, trcgenk, or trcgenkt subroutine. The channel number returned by the trcstart subroutine is one of the parameters that must be passed to these subroutines. The subsystem can suspend and resume trace data collection using the trcoff and trcon subroutines and can deactivate a trace channel using the trcstop subroutine. The trace events for each channel must be written to a separate trace log file, which must be specified in the call to the trcstart subroutine. The subsystem must provide the user interface to activating and deactivating subsystem tracing.
The trace hook IDs, which are stored in the /usr/include/sys/trchkid.h file, and the trace formatting templates, which are stored in the /etc/trcfmt file, are shared by all the trace channels.
The trace daemon in AIX 5L Version 5.1 Commands Reference.
The trcdead command, trcnm command, trcrpt command, trcstop command, trcupdate command in AIX 5L Version 5.1 Commands Reference.
The trchook subroutine, trcgen subroutine, trcoff subroutine, trcon subroutine, trcstart subroutine, trcstop subroutine in AIX 5L Version 5.1 Technical Reference: Base Operating System and Extensions Volume 2.