The trace facility helps you isolate system problems by monitoring selected system events. Events that can be monitored include entry to and exit from selected subroutines, kernel routines, kernel extension routines, and interrupt handlers. When the trace facility is active, information about these events is recorded in a system trace log file. The trace facility includes commands for activating and controlling traces and generating trace reports. Applications and kernel extensions can use several subroutines to record additional events.
The trace facility is in the bos.sysmgt.trace file set. To see if this file set is installed, type the following on the command line:
lslpp -l | grep bos.sysmgt.trace
If the output contains a line that includes bos.sysmgt.trace, the file set is installed. Otherwise, you must install it.
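The same check can be wrapped for use in scripts. The following is a minimal sketch, assuming a POSIX shell; trace_fileset_installed is a hypothetical helper name, not an AIX command.

```sh
#!/bin/sh
# trace_fileset_installed is a hypothetical helper, not part of AIX.
# It succeeds (exit status 0) if the lslpp listing mentions bos.sysmgt.trace.
trace_fileset_installed() {
    # Same pipeline as the manual check; grep -q prints nothing and
    # reports the match through its exit status.
    lslpp -l 2>/dev/null | grep -q 'bos\.sysmgt\.trace'
}
```

A script might call it as: if trace_fileset_installed; then echo "installed"; else echo "install bos.sysmgt.trace"; fi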
The system trace facility records trace events which can be formatted later by the trace report command. Trace events are compiled into kernel or application code, but are only traced if tracing is active.
Tracing is activated with the trace command or the trcstart subroutine. Tracing is stopped with either the trcstop command or the trcstop subroutine. While active, tracing can be suspended or resumed with the trcoff and trcon commands, or the trcoff and trcon subroutines.
Once the trace has been stopped with trcstop, a trace report can then be generated with the trcrpt command. This command uses a template file, /etc/trcfmt, to know how to format the entries. The templates are installed with the trcupdate command. For a discussion of the templates, see the trcupdate command.
The trace command starts the tracing of system events and controls the trace buffer and log file sizes. This command is documented in the article on the trace daemon in the Commands Reference.
There are three methods of gathering trace data.
You will usually want to run the trace command asynchronously; that is, you enter the trace command and then continue with other work. To run the trace asynchronously, use the -a flag. You must then stop the trace with the trcstop command.
It is usually desirable to limit the information that is traced. Use the -j events or -k events flags to specify a set of events to include (-j) or exclude (-k).
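As an illustration of the two flags, the following sketch starts an asynchronous trace that either includes or excludes a set of hooks. It assumes that the events list is written as comma-separated hook IDs; join_events, trace_include, and trace_exclude are hypothetical helper names.

```sh
#!/bin/sh
# Hypothetical helpers, not part of AIX.
# Assumption: the events list for -j and -k is comma-separated hook IDs.

# join_events ID...: prints the IDs joined with commas.
join_events() {
    list=
    for id in "$@"; do
        # Add a comma separator only when the list is already non-empty.
        list=${list:+$list,}$id
    done
    printf '%s\n' "$list"
}

# trace_include ID...: trace only the listed hooks, asynchronously.
trace_include() {
    trace -a -j "$(join_events "$@")"
}

# trace_exclude ID...: trace everything except the listed hooks, asynchronously.
trace_exclude() {
    trace -a -k "$(join_events "$@")"
}
```

For example, trace_include 254 would run trace -a -j 254. In either case the trace must still be stopped with trcstop.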
To display the program names associated with trace hooks, certain hooks must be enabled. These are specified using the tidhk trace event group. For example, if you want to trace the mbuf hook, 254, and show program names also, you need to run trace as follows:
trace -aJ tidhk -j 254
Tracing then begins. To stop tracing and generate the report, type the following on the command line:
trcstop
trcrpt -O exec=on
The -O exec=on option of trcrpt shows the program names. See the trcrpt command for more information.
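The start, stop, and report steps can be combined around a single workload. The following is a minimal sketch, assuming a POSIX shell and the default log file; trace_command is a hypothetical wrapper name, and the hook ID and report path are supplied by the caller.

```sh
#!/bin/sh
# trace_command is a hypothetical wrapper, not part of AIX.
# Usage: trace_command HOOK_ID REPORT_FILE COMMAND [ARG...]
trace_command() {
    hook=$1
    report=$2
    shift 2

    # Start tracing asynchronously: the tidhk event group plus the requested hook.
    trace -a -J tidhk -j "$hook" || return 1

    # Run the workload while tracing is active and remember its exit status.
    "$@"
    rc=$?

    # Stop tracing, then format the default log file with program names.
    trcstop
    trcrpt -O exec=on > "$report"

    return "$rc"
}
```

For example, trace_command 254 /tmp/rptout ls -l /tmp traces the mbuf hook while ls runs and writes the report to /tmp/rptout.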
It is often desirable to specify the buffer size and the maximum log file size. The trace buffers require real memory to be available so that no paging is necessary to record trace hooks. The log file will fill to the maximum size specified, and then wrap around, discarding the oldest trace data. The -T size and -L size flags specify, in bytes, the size of the memory buffers and the maximum size of the trace data in the log file.
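Because both sizes are given in bytes, a small conversion helper can make the command line easier to read. The following is a sketch only; start_sized_trace is a hypothetical helper name, and the sizes in the usage note are illustrative values, not recommendations.

```sh
#!/bin/sh
# start_sized_trace is a hypothetical helper, not part of AIX.
# Usage: start_sized_trace BUFFER_MB LOG_MB
start_sized_trace() {
    # -T and -L take sizes in bytes, so convert from megabytes first.
    buffer_bytes=$(( $1 * 1024 * 1024 ))
    log_bytes=$(( $2 * 1024 * 1024 ))

    # Start tracing asynchronously with the requested buffer and log sizes.
    trace -a -T "$buffer_bytes" -L "$log_bytes"
}
```

For example, start_sized_trace 2 20 runs trace -a -T 2097152 -L 20971520.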
Tracing can also be controlled from an application. See the trcstart and trcstop articles.
There are two types of trace data.
See the trcrpt command article for a full description of trcrpt. This command is used to generate a readable trace report from the log file generated by the trace command. By default the command formats data from the default log file, /var/adm/ras/trcfile. The trcrpt output is written to standard output.
To generate a trace report from the default log file and write it to /tmp/rptout, enter:
trcrpt >/tmp/rptout
To generate a trace report that includes program names and system call names from the log file /tmp/tlog and write it to /tmp/rptout, enter:
trcrpt -O exec=on,svc=on /tmp/tlog >/tmp/rptout
If tracing was active when the system took a dump, the trace can usually be retrieved with the trcdead command. To avoid overwriting the default trace log file on the current system, use the -o output-file option.
For example:
trcdead -o /tmp/tlog /var/adm/ras/vmcore.0
creates a trace log file /tmp/tlog which may then be formatted with the following:
trcrpt /tmp/tlog
The following commands are part of the trace facility:
trace | Starts the tracing of system events. With this command, you can control the size and manage the trace log file as well as the internal trace buffers that collect trace event data. |
trcdead | Extracts trace information from a system dump. If the system halts while the trace facilities are active, the contents of the internal trace buffers are captured. This command extracts the trace event data from the dump and writes it to the trace log file. |
trcnm | Generates a kernel name list used by the trcrpt command. A kernel name list is composed of a symbol table and a loader symbol table of an object file. The trcrpt command uses the kernel name list file to interpret addresses when formatting a report from a trace log file. Note: It is recommended that you use the -n trace option instead of trcnm. This puts name list information into the trace log file instead of a separate file, and includes symbols from kernel extensions. |
trcrpt | Formats reports of trace event data contained in the trace log file. You can specify the events to be included (or omitted) in the report, as well as determine the presentation of the output with this command. The trcrpt command uses the trace formatting templates stored in the /etc/trcfmt file to determine how to interpret the data recorded for each event. |
trcstop | Stops the tracing of system events. |
trcupdate | Updates the trace formatting templates stored in the /etc/trcfmt file. When you add applications or kernel extensions that record trace events, templates for these events must be added to the /etc/trcfmt file. The trcrpt command will use the trace formatting templates to determine how to interpret the data recorded for each event. Software products that record events usually run the trcupdate command as part of the installation process. |
The following calls and subroutines are part of the trace facility:
trcgen, trcgent | Records trace events of more than five words of data. The trcgen subroutine can be used to record an event as part of the system event trace (trace channel 0) or to record an event on a generic trace channel (channels 1 through 7). Specify the channel number in a subroutine parameter when you record the trace event. The trcgent subroutine appends a time stamp to the event data. Use trcgenk and trcgenkt in the kernel. C programmers should always use the TRCGEN and TRCGENK macros. |
utrchook, utrchook64 | Records trace events of up to five words of data. These subroutines can be used to record an event as part of the system event trace (trace channel 0). Kernel programmers can use trchook and trchook64. C programmers should always use the TRCHKL0 - TRCHKL5 and TRCHKL0T - TRCHKL5T macros. If you are not using these macros, you need to build your own trace hook word. The format is documented with the /etc/trcfmt file. Note that the 32-bit and 64-bit traces have different hook word formats. |
trcoff | Suspends the collection of trace data on either the system event trace channel (channel 0) or a generic trace channel (1 through 7). The trace channel remains active and trace data collection can be resumed by using the trcon subroutine. |
trcon | Starts the collection of trace data on a trace channel. The channel may be either the system event trace channel (0) or a generic channel (1 through 7). The trace channel, however, must have been previously activated by using the trace command or the trcstart subroutine. You can suspend trace data collection by using the trcoff subroutine. |
trcstart | Requests a generic trace channel. This subroutine activates a generic trace channel and returns the channel number to the calling application to use in recording trace events using the trcgen, trcgent, trcgenk, and trcgenkt subroutines. |
trcstop | Frees and deactivates a generic trace channel. |
The following files are part of the trace facility:
/etc/trcfmt | Contains the trace formatting templates used by the trcrpt command to determine how to interpret the data recorded for each event. |
/var/adm/ras/trcfile | Contains the default trace log file. The trace command allows you to specify a different trace log file. |
/usr/include/sys/trchkid.h | Contains trace hook identifier definitions. |
/usr/include/sys/trcmacros.h | Contains commonly used macros for recording trace events. |
See the /etc/trcfmt file for the format of the trace event data.
A trace hook identifier is a three-digit hexadecimal number that identifies an event being traced. You specify the trace hook identifier in the first twelve bits of the hook word. Most trace hook identifiers are defined in the /usr/include/sys/trchkid.h file. The values 0x010 through 0x0FF are available for use by user applications. All other values are reserved for system use. The currently defined trace hook identifiers can be listed using the trcrpt -j command.
The hook type identifies the composition of the event data and is user-specified. The four bits that follow the hook identifier (the thirteenth through the sixteenth bits of the hook word) constitute the hook type. For more information on hook types, refer to the trcgen, trcgenk, and trchook subroutines.
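To make the layout concrete, the following sketch packs a hook identifier and a hook type into a hook word using shell arithmetic. It assumes the 32-bit trace format, with the hook identifier in the 12 high-order bits, the hook type in the next 4 bits, and the 16 low-order bits left for data; the 64-bit format differs. The helper names, the hook ID 0x0A5, and the type value 1 are hypothetical, chosen only for illustration. Check the format documented with the /etc/trcfmt file before relying on this layout.

```sh
#!/bin/sh
# Hypothetical helpers; the bit layout is an assumption for 32-bit traces:
#   bits 31-20: hook identifier, bits 19-16: hook type, bits 15-0: data.

# is_user_hook_id HOOK_ID: succeeds if HOOK_ID lies in the range 0x010-0x0FF,
# the values available to user applications.
is_user_hook_id() {
    id=$(( $1 ))
    [ "$id" -ge $(( 0x010 )) ] && [ "$id" -le $(( 0x0FF )) ]
}

# make_hookword HOOK_ID HOOK_TYPE [DATA]: prints the packed word in hexadecimal.
make_hookword() {
    id=$(( $1 ))
    htype=$(( $2 ))
    data=$(( ${3:-0} ))

    # Reject values that do not fit their fields.
    [ "$id" -ge 0 ] && [ "$id" -le $(( 0xFFF )) ] || return 1
    [ "$htype" -ge 0 ] && [ "$htype" -le $(( 0xF )) ] || return 1
    [ "$data" -ge 0 ] && [ "$data" -le $(( 0xFFFF )) ] || return 1

    # Shift each field into position and combine them.
    printf '0x%08X\n' $(( (id << 20) | (htype << 16) | data ))
}
```

For example, make_hookword 0x0A5 1 prints 0x0A510000, and is_user_hook_id 0x254 fails because 0x254 lies outside the user range. In C code, the macros in /usr/include/sys/trcmacros.h build the hook word for you.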
The trace facility supports up to eight active trace sessions at a time. Each trace session uses a channel of the multiplexed trace special file, /dev/systrace. Channel 0 is used by the trace facility to record system events. The tracing of system events is started and stopped by the trace and trcstop commands. Channels 1 through 7 are referred to as generic trace channels and may be used by subsystems for other types of tracing such as data link tracing.
To implement tracing using the generic trace channels of the trace facility, a subsystem calls the trcstart subroutine to activate a trace channel and to determine the channel number. The subsystem modules can then record trace events using the TRCGEN or TRCGENT macros or, if necessary, the trcgen, trcgent, trcgenk, or trcgenkt subroutines. The channel number returned by the trcstart subroutine is one of the parameters that must be passed to these subroutines. The subsystem can suspend and resume trace data collection using the trcoff and trcon subroutines and can deactivate a trace channel using the trcstop subroutine. The trace events for each channel must be written to a separate trace log file, which must be specified in the call to the trcstart subroutine. The subsystem must provide the user interface for activating and deactivating subsystem tracing.
The trace hook IDs, most of which are stored in the /usr/include/sys/trchkid.h file, and the trace formatting templates, which are stored in the /etc/trcfmt file, are shared by all the trace channels.