Diagnosis Guide
Whether you monitor system conditions using SP Perspectives or
Problem Management, the sections that follow list the minimum
hardware and software conditions to monitor.
See "Descriptions of each condition" for details on each one.
For each frame, switch or node, monitor the following hardware
conditions:
- For a frame:
  - Power
  - Controller responding
  - Controller ID mismatch (only applies to HACWS)
  - Temperature
  - Node slot failures
- For a switch:
  - Power and power availability
  - Environment light
  - Temperature
- For a node:
  - Power and power availability
  - Environment light
  - Temperature
  - Keymode switch
  - LED/LCD contents
  - LED/LCD flashing
  - Node responding
  - processorsOffline
- For the control workstation:
  - Same as for a node
For each node, monitor the following software conditions:
- On each node:
  - Node can be reached by RSCT
  - /tmp becoming full
  - /var becoming full
  - / becoming full
  - Paging space low
  - Rising mbuf failures
  - Switch I/O errors
  - inetd daemon activity
  - srcmstr daemon activity
  - ftpd daemon activity
  - biod daemon activity (applies only to NFS systems)
  - portmap daemon activity (used by RPC)
  - xntpd daemon activity (NTP time synch)
  - httpd daemon activity (applies only to HTTP servers)
  - hatsd daemon activity (cannot check using Event Management)
  - hadsd daemon activity (cannot check using Event Management)
  - haemd daemon activity (cannot check using Event Management)
  - cdsadv daemon activity (DCE)
  - dced daemon activity (DCE)
- On the control workstation:
  - All conditions to monitor on each node
  - sdrd daemon active
  - kerberos daemon activity
  - secd daemon activity (DCE)
  - cdsd daemon activity (DCE)
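As an illustration of the three file-system conditions in the list (/tmp, /var, and / becoming full), the following is a minimal Python sketch of a standalone check. It is not part of SP Perspectives or Problem Management. The 90% threshold and the names `WATCHED`, `percent_used`, and `filling_up` are assumptions made for this example, and the percentage is computed as used space over total space, which can differ slightly from what `df` reports.

```python
import shutil

# File systems named in the software-conditions list.
WATCHED = ["/tmp", "/var", "/"]
# Illustrative threshold (an assumption, not a documented value).
THRESHOLD_PCT = 90.0


def percent_used(total, used):
    """Return used space as a percentage of total (0.0 if total is 0)."""
    return 100.0 * used / total if total else 0.0


def filling_up(paths=WATCHED, threshold=THRESHOLD_PCT, usage=shutil.disk_usage):
    """Return {path: percent used} for each path at or above the threshold."""
    report = {}
    for path in paths:
        try:
            stats = usage(path)
        except OSError:
            continue  # path missing or unreadable on this host; skip it
        pct = percent_used(stats.total, stats.used)
        if pct >= threshold:
            report[path] = round(pct, 1)
    return report


if __name__ == "__main__":
    for path, pct in filling_up().items():
        print(f"{path} is {pct}% full")
```

The `usage` parameter exists only so the check can be exercised with canned numbers; in normal use the default `shutil.disk_usage` is sufficient.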
The DCE daemons listed (cdsadv, dced, secd, cdsd) are the
minimum set required when using DCE on the SP system. You may
have other DCE daemons running that you wish to monitor. For
more information, consult IBM DCE for AIX,
Version 3.1: Administration Guide - Core Components.
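Because the set of DCE daemons to watch may grow beyond the minimum, a check driven by an editable list can be convenient. The following is a hedged Python sketch that compares a list of expected daemon names against the process table as reported by the POSIX `ps -e -o comm=` command. The names `DCE_NODE_DAEMONS`, `DCE_CWS_DAEMONS`, `running_commands`, and `missing_daemons` are hypothetical, and checking the process table is only one possible approach, not the method this guide prescribes.

```python
import os
import subprocess

# Minimum DCE daemon set named in the text; extend with any others you run.
DCE_NODE_DAEMONS = ["cdsadv", "dced"]
DCE_CWS_DAEMONS = DCE_NODE_DAEMONS + ["secd", "cdsd"]


def running_commands():
    """Return the set of command names in the process table (POSIX ps)."""
    out = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    # Some systems report a full path; keep only the final component.
    return {
        os.path.basename(line.strip())
        for line in out.splitlines()
        if line.strip()
    }


def missing_daemons(expected, running):
    """Return the expected daemons that are absent, in the order given."""
    return [name for name in expected if name not in running]


if __name__ == "__main__":
    absent = missing_daemons(DCE_CWS_DAEMONS, running_commands())
    print("missing:", ", ".join(absent) if absent else "none")
```

On a node, pass `DCE_NODE_DAEMONS` instead of `DCE_CWS_DAEMONS`; the same function works for any other daemon names in the list of software conditions.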