AIX Version 4.3 Kernel Extensions and Device Support Programming Concepts

System Dump

The system dump copies selected kernel structures to the dump device when an unexpected system halt occurs, when the reset button is pressed, or when one of the special system dump key sequences is entered. You can also initiate a system dump through the System Management Interface Tool (SMIT). For more information, see "Start a System Dump" in AIX Version 4.3 Problem Solving Guide and Reference.

The dump device can be dynamically configured, which means that either a tape device or a logical volume on hard disk can be used to receive the system dump. Use the sysdumpdev command to dynamically configure the dump device.

You can also define primary and secondary dump devices. A primary dump device is a dedicated dump device, while a secondary dump device is shared.

The system kernel dump contains all the vital structures of the running system, such as the process table, the kernel's global memory segment, and the data and stack segment of each process.

When you examine a dump, refer to the system header files in the /usr/include/sys directory. The name of each file indicates which structure and associated information it contains. For example, the user block is defined in sys/user.h, and the process block is defined in sys/proc.h.

When you examine system data that maps into these structures, you can gain valuable kernel information that can explain why the dump was called.

Initiating a System Dump

A system dump initiated by a kernel panic, or by pressing the reset button, is written to the primary dump device.

The special key sequences determine whether the system dump is written to the primary dump device or to the secondary dump device. To write to the primary dump device, use the sequence Ctrl-Alt-NumPad1. To write to the secondary dump device, use the sequence Ctrl-Alt-NumPad2.

To use SMIT, select Problem Determination from the main menu, then select System Dump. This presents a menu that allows you to initiate a system dump to either the primary or secondary device, and manipulate the dump devices and the system dump files.

If you prefer to initiate the system dump from the command line, use the sysdumpstart command. Use the -p flag to write to the primary device or the -s flag to write to the secondary device.

If you want your device to serve as a primary or secondary dump device, its device driver must contain a dddump routine.

When the system dump completes, the system either halts or reboots, depending on the setting of the autorestart attribute of sys0. You can show and alter this setting using SMIT by selecting System Environments, then Change / Show Characteristics of Operating System. The Automatically REBOOT system after a crash item shows and sets this value.

Including Device Driver Information in a System Dump

The system dump is table driven. The two parts of the table are:

master dump table
    Contains a pointer to a function which is provided by the device driver. The function is called by the kernel dump routine when a system dump occurs. The function must return a pointer to a component dump table.
component dump table
    Specifies memory areas to be included in a system dump.

Both the master dump table and the component dump table must reside in pinned global memory.

When a dump occurs, the kernel dump routine calls the function pointed to in the master dump table twice. On the first call, an argument of 1 indicates that the kernel dump routine is starting to dump the data specified by the component dump table.

On the second call, an argument of 2 indicates that the kernel dump routine has finished dumping the data specified by the component dump table.

The component dump table should be allocated and pinned during initialization. The entries in the component dump table can be filled in later. The function pointed to in the master dump table must not attempt to allocate memory when it is called.

The System Dump Flow figure shows the flow of a system dump.
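The two-call protocol can be sketched in user-space C as follows. This is only an illustration under stated assumptions: the demo_cdt structures are simplified stand-ins for the real definitions in sys/dump.h, and the names mydrv_init, mydrv_cdt_func, mydrv_state, and mydrv_dump_in_progress are hypothetical. The table is a static object prepared at initialization, so the function called at dump time never allocates memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the component dump table in sys/dump.h.
 * Field names and layout are assumptions made for this sketch. */
struct demo_cdt_entry {
    char  name[8];   /* name of the data area */
    int   len;       /* length of the data area in bytes */
    void *ptr;       /* address of the data area */
};

struct demo_cdt {
    char name[16];                  /* component (driver) name */
    int  len;                       /* length of this table in bytes */
    struct demo_cdt_entry entry[1]; /* one data area in this sketch */
};

/* Hypothetical driver data to be included in the dump. */
int mydrv_state[4];

/* Set up once at initialization; a real driver would also pin it. */
struct demo_cdt mydrv_cdt;
int mydrv_dump_in_progress;

void mydrv_init(void)
{
    memset(&mydrv_cdt, 0, sizeof(mydrv_cdt));
    strncpy(mydrv_cdt.name, "mydrv", sizeof(mydrv_cdt.name) - 1);
    strncpy(mydrv_cdt.entry[0].name, "state",
            sizeof(mydrv_cdt.entry[0].name) - 1);
    mydrv_cdt.len = (int)sizeof(mydrv_cdt);
}

/* Called with 1 when the dump of this component starts and with 2
 * when it has finished. It must not allocate memory. */
struct demo_cdt *mydrv_cdt_func(int arg)
{
    /* User-space sanity check only, not something kernel code would do. */
    assert(arg == 1 || arg == 2);

    if (arg == 1) {
        /* Fill in the entry from memory that already exists. */
        mydrv_cdt.entry[0].ptr = mydrv_state;
        mydrv_cdt.entry[0].len = (int)sizeof(mydrv_state);
        mydrv_dump_in_progress = 1;
    } else {
        mydrv_dump_in_progress = 0;
    }
    return &mydrv_cdt;
}
```

In a real kernel extension, a function of this kind is what would be passed to the dmp_add kernel service.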

To have your device driver data areas included in a system dump, you must register the data areas in the master dump table. Use the dmp_add kernel service to add an entry to the master dump table. Conversely, use the dmp_del kernel service to delete an entry from the master dump table. The syntax is as follows:

#include <sys/types.h>
#include <sys/errno.h>
#include <sys/dump.h>
   
int dmp_add(cdt_func)
struct cdt * ((*cdt_func) ());

int dmp_del(cdt_func)
struct cdt * ((*cdt_func) ());

The cdt structure is defined in the sys/dump.h header file. A cdt structure consists of a fixed-length header (cdt_head structure) and an array of one or more cdt_entry structures.

The cdt_head structure contains a component name field, containing the name of the device driver, and the length of the component dump table. Each cdt_entry structure describes a contiguous data area, giving a pointer to the data area, its length, a segment register, and a name for the data area. Use the name supplied for the data area to refer to it when the crash command formats the dump. The Kernel Dump Image figure illustrates a dump image.
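As a rough illustration of this layout, the following user-space sketch builds a table with a header and several entries. The demo_ structures and functions are hypothetical stand-ins, not the definitions from sys/dump.h, and the sketch assumes that the table length is the header size plus the size of the entries in use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for cdt_head, cdt_entry, and cdt. */
struct demo_cdt_head {
    char name[16];   /* component name (the device driver name) */
    int  len;        /* length of the table in bytes */
};

struct demo_cdt_entry {
    char  name[8];   /* name of the data area */
    int   len;       /* length of the data area in bytes */
    void *ptr;       /* pointer to the data area */
    int   segval;    /* segment register value */
};

#define DEMO_MAX_ENTRIES 4

struct demo_cdt {
    struct demo_cdt_head  head;
    struct demo_cdt_entry entry[DEMO_MAX_ENTRIES];
};

/* Start with an empty table: the length covers only the header. */
void demo_cdt_init(struct demo_cdt *t, const char *component)
{
    assert(t != NULL && component != NULL);
    memset(t, 0, sizeof(*t));
    strncpy(t->head.name, component, sizeof(t->head.name) - 1);
    t->head.len = (int)sizeof(struct demo_cdt_head);
}

/* Number of entries currently described by the table length. */
int demo_cdt_count(const struct demo_cdt *t)
{
    size_t used = (size_t)t->head.len - sizeof(struct demo_cdt_head);
    return (int)(used / sizeof(struct demo_cdt_entry));
}

/* Describe one contiguous data area. Returns 0, or -1 if full. */
int demo_cdt_add(struct demo_cdt *t, const char *area, void *ptr, int len)
{
    int n = demo_cdt_count(t);
    struct demo_cdt_entry *e;

    if (n >= DEMO_MAX_ENTRIES)
        return -1;
    e = &t->entry[n];
    strncpy(e->name, area, sizeof(e->name) - 1);
    e->ptr = ptr;
    e->len = len;
    e->segval = 0;   /* placeholder value in this sketch */
    t->head.len += (int)sizeof(struct demo_cdt_entry);
    return 0;
}
```

Because the table has a fixed capacity, all of its storage can be set aside once at initialization and entries can be filled in later without further allocation.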

Formatting a System Dump

Each device driver that includes data in a system dump can install a unique formatting routine in the /usr/lib/ras/dmprtns directory. A formatting routine is a command that is called by the crash command. The name of the formatting routine must match the component name field of the corresponding component dump table.

The crash command forks a child process that runs the formatting routines. If a formatting routine is not provided for a component name, the crash command runs the _default_dmp_fmt default-formatting routine, which prints out the data areas in hex.

The crash command calls the formatting routine as a command, passing the file descriptor of the open dump image file as a command line argument. The syntax for this argument is -ffile_descriptor, that is, the -f flag immediately followed by the descriptor number.
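A formatting routine therefore has to recover the descriptor from its argument list. The helper below is a minimal sketch of that step; the function name parse_dump_fd is hypothetical, and the sketch assumes the descriptor is written in decimal.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Look for an argument of the form "-f<number>" and return the number.
 * Returns -1 if no such argument exists or if it is malformed. */
int parse_dump_fd(int argc, char *const argv[])
{
    int i;

    assert(argv != NULL);
    for (i = 1; i < argc; i++) {
        const char *digits;
        char *end;
        long fd;

        if (strncmp(argv[i], "-f", 2) != 0)
            continue;

        digits = argv[i] + 2;
        errno = 0;
        fd = strtol(digits, &end, 10);

        /* Reject empty, non-numeric, negative, or out-of-range values. */
        if (errno != 0 || end == digits || *end != '\0' ||
            fd < 0 || fd > INT_MAX)
            return -1;
        return (int)fd;
    }
    return -1;
}
```

A formatter's main function could call parse_dump_fd(argc, argv) and then read from the returned descriptor, which the crash command has already positioned in the dump image file.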

The dump image file includes a copy of each component dump table used to dump memory. Before calling a formatting routine, the crash command positions the file pointer for the dump image file to the beginning of the relevant component dump table copy.

In the dump image file, the dumped memory is laid out as follows: the component dump table comes first, followed by a bitmap for the first data area, then the first data area itself. A bitmap for the next data area follows, then the next data area itself, and so on.

The bitmap for a given data area indicates which pages of the data area are actually present in the dump image and which are not. Pages that were not in memory when the dump occurred were not dumped. The least significant bit of the first byte of the bitmap is set to 1 if the first page is present. The next least significant bit indicates the presence or absence of the second page, and so on. A macro for determining the size of a bitmap is provided in sys/dump.h.
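The bit ordering described above can be sketched with a few helpers. These are hypothetical functions written for illustration, not the macro from sys/dump.h, which remains the authoritative way to size a bitmap. The sketch assumes one bit per page and that the least-significant-bit-first ordering continues in the second and later bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hold one bit per page, rounded up to a whole byte. */
size_t demo_bitmap_bytes(size_t npages)
{
    return (npages + 7) / 8;
}

/* Returns 1 if the given page (0 = first page of the data area) is
 * present in the dump image, 0 if it was not dumped. Bit 0 (least
 * significant) of byte 0 is page 0, bit 1 is page 1, and so on. */
int demo_page_present(const unsigned char *bitmap, size_t page)
{
    assert(bitmap != NULL);
    return (bitmap[page / 8] >> (page % 8)) & 1;
}

/* Count how many of the first npages pages were actually dumped. */
size_t demo_count_present(const unsigned char *bitmap, size_t npages)
{
    size_t n = 0;
    size_t p;

    for (p = 0; p < npages; p++)
        n += (size_t)demo_page_present(bitmap, p);
    return n;
}
```

For example, a first bitmap byte of 0x05 marks the first and third pages as present and the second page as absent.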

Note: A sample dump formatter is shipped with bos.sysmgt.serve_aid in the /usr/samples/dumpfmt directory.
