
Understanding the Diagnostic Subsystem for AIX

Periodic Diagnostics

Periodic testing of the disk drives and battery is enabled by default. The disk diagnostics perform error log analysis on all disks. The battery test checks the real-time clock and NV-RAM battery.

Periodic diagnostics are performed in different ways, depending on the diagnostic version. Use the Periodic Diagnostics task to change the test times or to add other resources to the list.

Processors that are dynamically removed from the system will also be removed from the periodic test list. Processors that are dynamically added are automatically added to the periodic test list.

AIX Version 3

Periodic testing of the disk drives and battery is performed by entries in the root crontab file. One entry runs disk diagnostics at 3:01 a.m. each day. Another entry tests the battery at 4:01 a.m. each day. These tests can be disabled by editing the root crontab file. The disk entry is /etc/lpp/diagnostics/bin/run_ela and the battery entry is /etc/lpp/diagnostics/bin/test_batt.
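The exact text of the crontab entries is not reproduced here, so the following is only a sketch. It assumes the entries look roughly like `1 3 * * * /etc/lpp/diagnostics/bin/run_ela` and `1 4 * * * /etc/lpp/diagnostics/bin/test_batt` (minute 1 of hours 3 and 4, every day). The function name `disable_periodic_tests` is hypothetical; it reads crontab text on standard input and comments out the two diagnostic entries.

```shell
# Hypothetical helper: comment out the periodic diagnostics entries in
# crontab text read from standard input. Lines that are already
# commented out, and all unrelated entries, pass through unchanged.
disable_periodic_tests() {
    sed -e '\|/etc/lpp/diagnostics/bin/run_ela|s/^[^#]/#&/' \
        -e '\|/etc/lpp/diagnostics/bin/test_batt|s/^[^#]/#&/'
}
```

One possible way to use it, as root: run `crontab -l | disable_periodic_tests > /tmp/root.cron`, review the file, and then install it with `crontab /tmp/root.cron`.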

Problems are reported by a message to the system console and logged in the error log. Diagnostics must be run for a service request number (SRN) to be reported.

Running diagnostics in this mode is similar to using the diag -c -e -d "device" command.

AIX Version 4

Periodic testing is controlled by the Periodic Diagnostic Service Aid, which allows error log analysis to be run on a hardware resource once a day. By default, the battery and all disk drives are enabled. Error log analysis is performed on all the disk drives at 3:00 a.m. each day.

Other devices can be added to the Periodic Diagnostic Device list and scheduled to run at other times.

Problems are reported by a message to the system console and a mail message to all users of the system group. The message contains the SRN.

Running diagnostics in this mode for planar and memory tests is similar to using the diag -c -d "device" command. For all other devices, the -e flag is added.
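The rule above can be sketched as a small shell function that prints the comparable command line for a device. The function name `periodic_diag_cmd` is hypothetical, and the device-name patterns (`sysplanar*` for planars, `mem*` for memory) and the sample names `hdisk0` and `mem0` are assumptions for illustration; check the resource names on your own system.

```shell
# Hypothetical helper: print the diag command line comparable to a
# periodic run of the given device.
# ASSUMPTION: planar and memory resources have names that start with
# "sysplanar" or "mem"; adjust the patterns to match your system.
periodic_diag_cmd() {
    dev=$1
    case $dev in
        sysplanar*|mem*) echo "diag -c -d $dev" ;;     # planar and memory: no -e
        *)               echo "diag -c -e -d $dev" ;;  # all other devices: add -e
    esac
}

periodic_diag_cmd hdisk0   # prints: diag -c -e -d hdisk0
periodic_diag_cmd mem0     # prints: diag -c -d mem0
```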

Technical Description

The diagnostic daemon, diagd, runs once the bos.diag diagnostic package is installed. diagd looks for customized entries in the CDiagAtt ODM database to determine which devices to test and at which times. (For AIX 4.1, the database is CDiagDev.) The database is built when diagnostics are run or when the Periodic Diagnostic Service Aid is used to change the run times for devices. If the database has no entries (for example, when diagnostics have never been run), default times are assigned to the ioplanar battery test and the disk drives. The following is an example of CDiagAtt entries.

CDiagAtt->attribute = p_test_time
CDiagAtt->value     = 9999   Do not test
                    = 0400   Test at 4 a.m.
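As an illustration of how the p_test_time values shown above can be read, here is a minimal sketch. It assumes that every value other than 9999 is a four-digit 24-hour HHMM time, which is inferred from the 0400 example rather than stated outright. The function name `describe_test_time` is hypothetical.

```shell
# Hypothetical helper: describe a p_test_time value in words.
# ASSUMPTION: values other than 9999 are four-digit 24-hour HHMM times.
describe_test_time() {
    case $1 in
        9999) echo "do not test"; return 0 ;;        # 9999 disables testing
        [0-9][0-9][0-9][0-9]) ;;                     # four digits: check below
        *) echo "invalid value: $1" >&2; return 1 ;;
    esac
    hh=$(echo "$1" | cut -c1-2)
    mm=$(echo "$1" | cut -c3-4)
    if [ "$hh" -gt 23 ] || [ "$mm" -gt 59 ]; then
        echo "invalid value: $1" >&2
        return 1
    fi
    echo "test at $hh:$mm"
}

describe_test_time 9999   # prints: do not test
describe_test_time 0400   # prints: test at 04:00
```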

diagd sets a timer to wake up at the next scheduled run time. When diagd wakes up, it executes the script /usr/lpp/diagnostics/bin/diagela with the -t flag.
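The internals of diagd are not described here, but the idea of choosing the next scheduled time can be sketched as follows. This is not the daemon's actual code. The function name `next_test_time` is hypothetical, and the sketch assumes HHMM values, with 9999 meaning that the device is not tested. If no time remains later today, it falls back to the earliest time, which would run tomorrow.

```shell
# Hypothetical helper: pick the next run time from a list of test times.
# Usage: next_test_time NOW TIME...   (all values are HHMM strings)
next_test_time() {
    now=$1; shift
    later=""      # earliest time that is still ahead of NOW today
    earliest=""   # earliest time overall, used when wrapping to tomorrow
    for t in "$@"; do
        [ "$t" = "9999" ] && continue      # 9999 means "do not test"
        if [ -z "$earliest" ] || [ "$t" -lt "$earliest" ]; then
            earliest=$t
        fi
        if [ "$t" -gt "$now" ]; then
            if [ -z "$later" ] || [ "$t" -lt "$later" ]; then
                later=$t
            fi
        fi
    done
    if [ -n "$later" ]; then echo "$later"; else echo "$earliest"; fi
}

next_test_time 0330 0300 0400 9999   # prints: 0400
next_test_time 0500 0300 0400 9999   # prints: 0300 (next day)
```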

diagela checks the PDiagAtt->test_mode bit for the device to determine whether that device should be tested in this mode. If the bit is not set, diagela does not test the device. If the bit is set, diagnostics are run on the device with the -e (ELA) flag set.
