Error-logging tasks and information to assist you in using the error logging facility include:
To obtain a report of all errors logged in the 24 hours prior to the failure, type:
errpt -a -s mmddhhmmyy | pg
where mmddhhmmyy represents the month, day, hour, minute, and year 24 hours prior to the failure.
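The mmddhhmmyy value can be computed rather than worked out by hand. The following is a minimal Python sketch, not part of the error-logging facility itself; the function name errpt_start_time and the failure time used in the example are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def errpt_start_time(failure_time, hours_before=24):
    """Return the mmddhhmmyy string for a point `hours_before` the failure."""
    start = failure_time - timedelta(hours=hours_before)
    # %m = month, %d = day, %H = hour (24-hour clock), %M = minute,
    # %y = two-digit year, matching the mmddhhmmyy layout described above.
    return start.strftime("%m%d%H%M%y")


if __name__ == "__main__":
    # Hypothetical failure time, used only to show the resulting command.
    failure = datetime(2024, 6, 20, 11, 28)
    print("errpt -a -s " + errpt_start_time(failure) + " | pg")
```

Because the subtraction is done on a full date, the result stays correct when the 24-hour window crosses a month or year boundary.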
An error-log report contains the following information:
Reporting can be turned off for some errors. To show which errors have reporting turned off, type:
errpt -t -F report=0 | pg
If reporting is turned off for any errors, enable reporting of all errors using the errupdate command.
Logging may also have been turned off for some errors. To show which errors have logging turned off, type:
errpt -t -F log=0 | pg
If logging is turned off for any errors, enable logging for all errors using the errupdate command. Logging all errors is useful if it becomes necessary to re-create a system error.
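The two checks above can feed the errupdate command directly. The following Python sketch collects the error IDs from errpt -t output and builds a stanza for each one. It rests on two assumptions that should be checked against the errupdate documentation for your release: that the first column of each errpt -t line is the 8-digit hexadecimal error ID, and that errupdate accepts modify stanzas of the form "=ID:" followed by "Report = True" and "Log = True". The function names are hypothetical.

```python
import re

# An error identifier is assumed to be exactly eight hexadecimal digits.
HEX_ID = re.compile(r"^[0-9A-Fa-f]{8}$")


def template_ids(errpt_t_output):
    """Collect error IDs from the first column of `errpt -t` output."""
    ids = []
    for line in errpt_t_output.splitlines():
        fields = line.split()
        # Header lines and blank lines do not start with a hex ID.
        if fields and HEX_ID.match(fields[0]):
            ids.append(fields[0].upper())
    return ids


def errupdate_stanzas(ids, report=True, log=True):
    """Build one modify stanza per ID (assumed errupdate input syntax)."""
    lines = []
    for error_id in ids:
        lines.append("=" + error_id + ":")
        lines.append("Report = " + str(report))
        lines.append("Log = " + str(log))
    return "".join(line + "\n" for line in lines)
```

The resulting text would be saved to a file and passed to errupdate. Review the file before applying it, because it changes the error-record templates.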
The following are sample error-report entries that are generated by issuing the errpt -a command.
An error-class value of H and an error-type value of PERM indicate that the system encountered a hardware problem (for example, with a SCSI adapter device driver) and could not recover from it. Diagnostic information might be associated with this type of error. If so, it displays at the end of the error listing, as illustrated in the following example of a problem encountered with a device driver:
LABEL: SCSI_ERR1
ID: 0502F666
Date/Time: Jun 19 22:29:51
Sequence Number: 95
Machine ID: 123456789012
Node ID: host1
Class: H
Type: PERM
Resource Name: scsi0
Resource Class: adapter
Resource Type: hscsi
Location: 00-08
VPD:
Device Driver Level.........00
Diagnostic Level............00
Displayable Message.........SCSI
EC Level....................C25928
FRU Number..................30F8834
Manufacturer................IBM97F
Part Number.................59F4566
Serial Number...............00002849
ROS Level and ID............24
Read/Write Register Ptr.....0120
Description
ADAPTER ERROR
Probable Causes
ADAPTER HARDWARE CABLE
CABLE TERMINATOR DEVICE
Failure Causes
ADAPTER
CABLE LOOSE OR DEFECTIVE
Recommended Actions
PERFORM PROBLEM DETERMINATION PROCEDURES
CHECK CABLE AND ITS CONNECTIONS
Detail Data
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
Diagnostic Log sequence number: 153
Resource Tested: scsi0
Resource Description: SCSI I/O Controller
Location: 00-08
SRN: 889-191
Description: Error log analysis indicates hardware failure.
Probable FRUs:
SCSI Bus FRU: n/a 00-08
Fan Assembly
SCSI2 FRU: 30F8834 00-08
SCSI I/O Controller
An error-class value of H and an error-type value of PEND indicate that a piece of hardware (the Token Ring) may become unavailable soon due to numerous errors detected by the system.
LABEL: TOK_ESERR
ID: AF1621E8
Date/Time: Jun 20 11:28:11
Sequence Number: 17262
Machine Id: 123456789012
Node Id: host1
Class: H
Type: PEND
Resource Name: TokenRing
Resource Class: tok0
Resource Type: Adapter
Location: TokenRing
Description
EXCESSIVE TOKEN-RING ERRORS
Probable Causes
TOKEN-RING FAULT DOMAIN
Failure Causes
TOKEN-RING FAULT DOMAIN
Recommended Actions
REVIEW LINK CONFIGURATION DETAIL DATA
CONTACT TOKEN-RING ADMINISTRATOR RESPONSIBLE FOR THIS LAN
Detail Data
SENSE DATA
0ACA 0032 A440 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 2080 0000 0000 0010 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 78CC 0000 0000 0005 C88F 0304 F4E0 0000 1000 5A4F 5685
1000 5A4F 5685 3030 3030 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000
An error-class value of S and an error-type value of PERM indicate that the system encountered a problem with software and could not recover from it.
LABEL: DSI_PROC
ID: 20FAED7F
Date/Time: Jun 28 23:40:14
Sequence Number: 20136
Machine Id: 123456789012
Node Id: 123456789012
Class: S
Type: PERM
Resource Name: SYSVMM
Description
Data Storage Interrupt, Processor
Probable Causes
SOFTWARE PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
IF PROBLEM PERSISTS THEN DO THE FOLLOWING
CONTACT APPROPRIATE SERVICE REPRESENTATIVE
Detail Data
Data Storage Interrupt Status Register
4000 0000
Data Storage Interrupt Address Register
0000 9112
Segment Register, SEGREG
D000 1018
EXVAL
0000 0005
An error-class value of S and an error-type value of TEMP indicate that the system encountered a problem with software. After several attempts, the system was able to recover from the problem.
LABEL: SCSI_ERR6
ID: 52DB7218
Date/Time: Jun 28 23:21:11
Sequence Number: 20114
Machine Id: 123456789012
Node Id: host1
Class: S
Type: INFO
Resource Name: scsi0
Description
SOFTWARE PROGRAM ERROR
Probable Causes
SOFTWARE PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
IF PROBLEM PERSISTS THEN DO THE FOLLOWING
CONTACT APPROPRIATE SERVICE REPRESENTATIVE
Detail Data
SENSE DATA
0000 0000 0000 0000 0000 0011 0000 0008 000E 0900 0000 0000 FFFF
FFFE 4000 1C1F 01A9 09C4 0000 000F 0000 0000 0000 0000 FFFF FFFF
0325 0018 0040 1500 0000 0000 0000 0000 0000 0000 0000 0000 0800
0000 0100 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000
An error-class value of O indicates that an informational message has been logged.
LABEL: OPMSG
ID: AA8AB241
Date/Time: Jul 16 03:02:02
Sequence Number: 26042
Machine Id: 123456789012
Node Id: host1
Class: O
Type: INFO
Resource Name: OPERATOR
Description
OPERATOR NOTIFICATION
User Causes
errlogger COMMAND
Recommended Actions
REVIEW DETAILED DATA
Detail Data
MESSAGE FROM errlogger COMMAND
hdisk1 : Error log analysis indicates a hardware failure.
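The class and type combinations described for the sample entries above can be looked up mechanically. The following Python sketch reads the "Name: value" lines at the top of one detailed entry and returns the interpretation given in this section. It assumes the entry text has the layout shown in the samples, with the header fields ending at the Description line; the function names and the wording of the returned strings are illustrative only.

```python
# Interpretations taken from the descriptions of the sample entries.
MEANINGS = {
    ("H", "PERM"): "hardware problem; the system could not recover",
    ("H", "PEND"): "hardware may become unavailable soon",
    ("S", "PERM"): "software problem; the system could not recover",
    ("S", "TEMP"): "software problem; the system recovered after several attempts",
}


def parse_header(entry_text):
    """Return the header fields of one detailed entry as a dictionary."""
    fields = {}
    for line in entry_text.splitlines():
        if line.strip() == "Description":
            break  # header fields end where the description begins
        key, sep, value = line.partition(":")
        if sep:
            # Keys are lowercased because the samples mix "Machine ID"
            # and "Machine Id".
            fields[key.strip().lower()] = value.strip()
    return fields


def describe(fields):
    """Map the class and type fields to the meaning given in the text."""
    err_class = fields.get("class", "")
    err_type = fields.get("type", "")
    if err_class == "O":
        return "informational message"
    return MEANINGS.get((err_class, err_type), "not covered by the samples")
```

Combinations that this section does not describe fall through to the default string rather than being guessed.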
The following is an example of a summary error report generated using the errpt command. One line of information is returned for each error entry.
ERROR_IDENTIFIER TIMESTAMP  T CL RESOURCE_NAME ERROR_DESCRIPTION
192AC071         0101000070 I 0  errdemon      Error logging turned off
0E017ED1         0405131090 P H  mem2          Memory failure
9DBCFDEE         0101000070 I 0  errdemon      Error logging turned on
038F2580         0405131090 U H  scdisk0       UNDETERMINED ERROR
AA8AB241         0405130990 I O  OPERATOR      OPERATOR NOTIFICATION
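Because the summary report holds one entry per line, it is straightforward to split into fields for filtering. The following Python sketch assumes the column order shown in the example above (identifier, timestamp, type, class, resource name, description) and treats any line whose second field is not a 10-digit timestamp as a header. The SummaryEntry type and parse_summary function are hypothetical names.

```python
from typing import NamedTuple


class SummaryEntry(NamedTuple):
    identifier: str
    timestamp: str    # mmddhhmmyy, as printed by errpt
    err_type: str
    err_class: str
    resource: str
    description: str


def parse_summary(report):
    """Split summary-report text into one SummaryEntry per error line."""
    entries = []
    for line in report.splitlines():
        # Split into at most six fields so that the description,
        # which may contain spaces, stays in one piece.
        fields = line.strip().split(None, 5)
        if len(fields) == 6 and len(fields[1]) == 10 and fields[1].isdigit():
            entries.append(SummaryEntry(*fields))
    return entries
```

For example, the hardware entries can then be selected with a list comprehension that tests err_class == "H".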
To create an error report of software or hardware problems, type:
errpt -a
The errpt command generates an error report from entries in the system error log.
If the error log does not contain entries, error logging has been turned off. Activate the facility by typing:
/usr/lib/errdemon
The errdemon daemon starts error logging and writes error log entries in the system error log. If the daemon is not running, errors are not logged.
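To report only the errors logged for a particular resource, such as the hdisk1 disk drive, type:
errpt -N hdisk1
To generate an error report through SMIT, type:
smit errpt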
This procedure describes how to stop the error-logging facility.
To turn off error logging, use the errstop command. You must have root user authority to use this command.
Ordinarily, you would not want to turn off the error-logging facility. Instead, you should clean the error log of old or unnecessary entries. For instructions about cleaning the error log, refer to Cleaning an Error Log.
Turn off the error-logging facility when you are installing or experimenting with new software or hardware. This way the error logging daemon does not use CPU time to log problems you know you are causing.
Error-log cleaning is normally done for you as part of the daily cron command. If it is not done automatically, clean the error log yourself every couple of days after you have examined the contents to make sure there are no significant errors.
You can also clean up specific errors. For example, if you get a new disk and you do not want the old disk's errors in the log, you can clean just the old disk's errors.
Delete all entries in your error log by doing either of the following:
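errclear -d S 0
The errclear command deletes entries from the error log that are older than a specified number of days. The 0 in the previous example indicates that you want to delete entries for all days, and the -d S flag restricts the deletion to software-class entries. To delete entries of every class, omit the -d flag.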
smit errclear
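Selective cleaning, such as removing only the entries for a replaced disk, amounts to adding a filter flag to errclear. The following Python sketch only assembles the argument list and does not run anything. It assumes the -d flag (error class list) and the -N flag (resource name list) of errclear, each taking comma-separated values; the function name errclear_command is hypothetical.

```python
def errclear_command(days, classes=None, resources=None):
    """Build an errclear argument list; the command is not executed here."""
    if days < 0:
        raise ValueError("days must be zero or greater")
    argv = ["errclear"]
    if classes:
        # Limit deletion to the given error classes, for example S.
        argv += ["-d", ",".join(classes)]
    if resources:
        # Limit deletion to the given resource names, for example hdisk1.
        argv += ["-N", ",".join(resources)]
    # Entries older than this many days are deleted; 0 means all days.
    argv.append(str(days))
    return argv


if __name__ == "__main__":
    print(" ".join(errclear_command(0, resources=["hdisk1"])))
```

On a system where the command is available, the list could be passed to subprocess.run. Printing it first is a simple way to confirm what will be deleted.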
Copy an error log by doing one of the following:
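To copy the error log to diskette, type:
ls /var/adm/ras/errlog | backup -ivp
To copy the error log to tape, type:
ls /var/adm/ras/errlog | backup -ivpf /dev/rmt0
To gather system configuration information in a tar file and copy it to diskette, type:
snap -a -o /dev/rfd0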
The snap command in this example uses the -a flag to gather all information about your system configuration. The -o flag copies the compressed tar file to the device you name. Here, /dev/rfd0 names your diskette drive.
To gather all configuration information in a tar file and copy it to tape, type:
snap -a -o /dev/rmt0
The /dev/rmt0 names your tape drive.
The liberrlog services allow you to read entries from an error log, and provide a limited update capability. They are especially useful from an error notification method written in the C programming language, rather than a shell script. Accessing the error log using the liberrlog functions is much more efficient than using the errpt command.
The services are errlog_open, errlog_close, errlog_find_first, errlog_find_next, errlog_find_sequence, errlog_set_direction, and errlog_write.