IBM Books

Administration Guide


Managing the AIX error log facility

For detailed information on the AIX Error Log facility, refer to:

You can perform the following tasks using either SMIT menus or commands:

When you run commands related to AIX error log management, you can run the command on all nodes in the current system partition, on the nodes you name, or on the nodes listed in a file. By default, the command is run on the local node.

From SMIT

To access the AIX Error Log SMIT menu, enter:

smit sperrlog

Generating AIX error log reports

You can generate reports on entries in the AIX error log on a number of nodes. The report can be displayed or written to a file on the local node.

From SMIT

The fastpath invocation for the Generate an Error Report menu is:

smit perrpt

Trimming AIX error logs

You can trim records from error logs on a set of nodes.

From SMIT

The fastpath invocation for the Clean the Error Log menu is:

smit perrclear

Configuring the AIX error log

You can display the configuration parameters of the AIX Error Log on the local node.

From SMIT

The SMIT fastpath invocation for the Show Characteristics of the Error Log menu is:

smit perrdemon_shw

You can alter one or more of the configuration parameters for the AIX Error Log on a set of nodes. Because of the additional entries generated by SP system software, you should set the AIX Error Log file size to be a minimum of 4MB.
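As a rough illustration of the 4MB minimum, the sketch below checks a size in bytes against it. It assumes 1MB means 1024 * 1024 bytes, which this guide does not state. The helper name log_size_ok is hypothetical, and the errdemon invocation in the comment is the usual single-node AIX command rather than something taken from this guide.

```shell
#!/bin/sh
# Suggested minimum error log size: 4MB, taken here as 4 * 1024 * 1024 bytes
# (the byte convention is an assumption of this sketch).
MIN_ERRLOG_BYTES=$((4 * 1024 * 1024))

# log_size_ok BYTES
# Hypothetical helper: succeeds if BYTES meets the suggested minimum.
log_size_ok() {
    [ "$1" -ge "$MIN_ERRLOG_BYTES" ]
}

# On a single AIX node the size would typically be raised with:
#   /usr/lib/errdemon -s 4194304
if log_size_ok 1048576; then
    echo "1048576 bytes: large enough"
else
    echo "1048576 bytes: below the suggested minimum of $MIN_ERRLOG_BYTES"
fi
```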

From SMIT

The fastpath invocation for the Change Characteristics of the Error Log menu is:

smit perrdemon_chg

Managing error notification objects

Error Notification Objects are ODM objects held in the class errnotify. The AIX Error Notification Facility uses them to invoke methods when an error event occurs. Fields in the errnotify class are matched against the fields of an Error Template to select errors. If an error is logged that matches the selection criteria defined in a notification object, the method associated with that object is invoked. For more information on using the AIX Error Notification Facility, refer to IBM General Concepts and Procedures for RS/6000.

You can add, remove, and show notification objects in parallel on the SP system.

From SMIT

From the command line

To add, remove, or show error notification objects in parallel on the SP system, enter:

penotify -f show

For complete details on the command, refer to the PSSP: Command and Technical Reference.

The following describes the actions taken by the notification method EN_pend, located in the /spdata/sys1/err_methods directory. This method can be installed and used to invoke pre- and post-action scripts and to mail a report of the logged error. The script provides a suggested structure for notification methods and can be reused with different pre- or post-action scripts, as described in the following sections.

EN_pend method flow

  1. EN_pend looks under the directory it resides in for a file with the same name and a .envs suffix. If found, it sources the file to pick up environment variables EN_RUNDEFAULT and EN_MAILLOC. An EN_pend.envs script is installed under the same directory.
  2. EN_pend.envs sets the EN_RUNDEFAULT environment variable. It also sets the EN_MAILLOC to root at the control workstation, if possible, or to root at the local node.
  3. EN_pend checks the same directory for a pre-action script with the same name and a .pre suffix, and executes it if found.
  4. If the EN_RUNDEFAULT and EN_MAILLOC variables are set, EN_pend mails an expanded report of the error to the address in EN_MAILLOC, using the error sequence number passed by the notification facility.
  5. EN_pend checks the same directory for a post-action script with the same name and a .post suffix, and executes it if found.
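
The flow above can be sketched in shell as follows. This is not the shipped EN_pend script, only an illustration of the steps. The function names en_report, en_mail, and en_flow are hypothetical, passing the sequence number to the pre- and post-action scripts is an assumption, and the report is produced with the standard AIX errpt command.

```shell
#!/bin/sh
# Sketch of the EN_pend flow described above. This is not the shipped script.

# Hypothetical helpers, kept separate so they can be replaced when testing.
en_report() { errpt -a -l "$1"; }    # expanded AIX report for one sequence number
en_mail()   { mail -s "$1" "$2"; }   # reads the message body from standard input

# en_flow METHOD_PATH SEQUENCE_NUMBER
en_flow() {
    en_dir=$(dirname "$1")
    en_base=$(basename "$1")
    en_seq=$2

    # Steps 1 and 2: source <name>.envs, if present, to pick up
    # EN_RUNDEFAULT and EN_MAILLOC.
    if [ -f "$en_dir/$en_base.envs" ]; then
        . "$en_dir/$en_base.envs"
    fi

    # Step 3: run the pre-action script if one exists.
    # Passing the sequence number along is an assumption of this sketch.
    if [ -x "$en_dir/$en_base.pre" ]; then
        "$en_dir/$en_base.pre" "$en_seq"
    fi

    # Step 4: mail the expanded report only when both variables are set.
    if [ -n "${EN_RUNDEFAULT:-}" ] && [ -n "${EN_MAILLOC:-}" ]; then
        en_report "$en_seq" | en_mail "Error log entry $en_seq" "$EN_MAILLOC"
    fi

    # Step 5: run the post-action script if one exists.
    if [ -x "$en_dir/$en_base.post" ]; then
        "$en_dir/$en_base.post" "$en_seq"
    fi
}

# When installed as a notification method, the script would end with:
#   en_flow "$0" "$1"
```

Because the base name is derived from the path the method is invoked by, a link named EN_hdisk0 would pick up EN_hdisk0.envs, EN_hdisk0.pre, and EN_hdisk0.post instead of the EN_pend files.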

Installing a notification object

To add the EN_pend method to all nodes in the current partition to send a report whenever an error of type PEND (loss of availability of a device is imminent) occurs, enter:

penotify -a -n "PEND_err" -P -t "PEND" -f add \
-m '/spdata/sys1/err_methods/EN_pend $1'
 
penotify -a -n "pend_err" -P -t "pend" -f add \
-m '/spdata/sys1/err_methods/EN_pend $1'
 
penotify -a -n "Pend_err" -P -t "Pend" -f add \
-m '/spdata/sys1/err_methods/EN_pend $1'

The -P flag causes the object to persist after the system is restarted. Three objects are added with variations on PEND because not all AIX LPPs and vendor functions use upper case consistently for the error type. The $1 argument causes the Error Notification Facility to pass the error sequence number to the notify method.
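
The three commands differ only in the spelling of the error type, so they can be generated from a loop. The sketch below prints the commands rather than running them, using only the flags shown above. The function name pend_commands is hypothetical.

```shell
#!/bin/sh
# Print (do not run) one penotify command per spelling of the error type.
# pend_commands is a hypothetical helper name.
pend_commands() {
    for en_type in PEND pend Pend; do
        printf "penotify -a -n \"%s_err\" -P -t \"%s\" -f add -m '%s \$1'\n" \
            "$en_type" "$en_type" /spdata/sys1/err_methods/EN_pend
    done
}

pend_commands
```

Printing first makes it possible to review the commands before piping them to a shell on a system where penotify is installed.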

The EN_pend and EN_pend.envs scripts can be used to invoke different pre- and post-action scripts for different error events by creating links to them. EN_pend looks for .envs, .pre, and .post scripts in the directory it is called from, using the base name it is called by. For example, to use EN_pend to report hdisk0 errors on nodes h0, h1, h2, and h3 and to run pre- and post-action scripts:

  1. Create .pre and .post action scripts on one of the nodes, for example, EN_hdisk0.pre, EN_hdisk0.post under the /spdata/sys1/err_methods directory on node h0.
  2. Copy the pre- and post- scripts to nodes h1, h2 and h3 using pcp:
    pcp -w h1,h2,h3 EN_hdisk0.pre /spdata/sys1/err_methods/
    pcp -w h1,h2,h3 EN_hdisk0.post /spdata/sys1/err_methods/
    
  3. Create links to the EN_pend and EN_pend.envs scripts:
    dsh -w h0,h1,h2,h3 ln -s /spdata/sys1/err_methods/EN_pend \
    /spdata/sys1/err_methods/EN_hdisk0
     
    dsh -w h0,h1,h2,h3 ln -s /spdata/sys1/err_methods/EN_pend.envs \
    /spdata/sys1/err_methods/EN_hdisk0.envs
    
  4. Add the notification object:
    penotify -w h0,h1,h2,h3 -f add -P -n "hdisk0_err" \
    -m '/spdata/sys1/err_methods/EN_hdisk0 $1' -N "hdisk0"
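
Steps 2 through 4 follow the same pattern for any resource name, which the sketch below captures by printing the commands for a given resource and node list. It only prints; nothing is executed. The function name method_setup_commands and its argument layout are hypothetical, and the object name is assumed to follow the "hdisk0_err" pattern used above.

```shell
#!/bin/sh
EN_DIR=/spdata/sys1/err_methods

# method_setup_commands RESOURCE FIRST_NODE OTHER_NODES
# Hypothetical helper. FIRST_NODE is where the .pre and .post scripts were
# created; OTHER_NODES is a comma-separated list of the remaining nodes.
method_setup_commands() {
    res=$1
    all="$2,$3"

    # Step 2: copy the pre- and post-action scripts to the other nodes.
    for suffix in pre post; do
        echo "pcp -w $3 EN_$res.$suffix $EN_DIR/"
    done

    # Step 3: link EN_pend and EN_pend.envs under the new base name.
    echo "dsh -w $all ln -s $EN_DIR/EN_pend $EN_DIR/EN_$res"
    echo "dsh -w $all ln -s $EN_DIR/EN_pend.envs $EN_DIR/EN_$res.envs"

    # Step 4: add the notification object on all nodes.
    echo "penotify -w $all -f add -P -n \"${res}_err\" -m '$EN_DIR/EN_$res \$1' -N \"$res\""
}

method_setup_commands hdisk0 h0 h1,h2,h3
```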
    

To display the notification object created, enter:

penotify -w h0,h1,h2,h3 -f show -n "hdisk0_err"

To remove this notification object, enter:

penotify -w h0,h1,h2,h3 -f remove -n "hdisk0_err"

Note:
Notification methods need to be accessible to each node that the notify object is added to. We suggest that the notification scripts be kept local on each node in case of network failure. File collections should be used to maintain updates to notification methods.

Managing error templates

You can create a new error template in the Error Template repository for logging errors.

From SMIT

The fastpath invocation for the Add an Error Template menu is:

smit padd_et

You can remove a template from the Error Template Repository.

From SMIT

The fastpath invocation for the Remove an Error Template menu is:

smit prem_et

You can display entries from the Error Template Repository on the local host.

From SMIT

The fastpath invocation for the Show an Error Template menu is:

smit pshw_et

