The best way to streamline problem resolution is to prevent problems from occurring. To minimize the frequency and impact of problems, follow the configuration recommendations in IBM RS/6000 SP: Planning, Volume 1, Hardware and Physical Environment and IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment, and use the tools documented in PSSP: Administration Guide. Also follow the recommendations documented for any software you install.
However, problems may still occur. When they do, the best way to resolve them is to detect them as soon as they occur, and to correct or bypass them before they impair other subsystems and cause secondary and tertiary failures. Several methods exist for detecting problems on the SP system.
The SP system can detect problem situations at run time, while the system administrator is actively monitoring system conditions. The SP system also provides asynchronous notification methods for use when the system administrator is not directly monitoring system conditions.
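
The difference between the two detection styles can be illustrated with a small, generic sketch. This is not an SP or PSSP interface: the class `ConditionMonitor`, its methods, the node names, and the threshold value are all hypothetical and exist only for illustration. The `problems` method stands in for an administrator checking conditions directly, and `subscribe` stands in for asynchronous notification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ConditionMonitor:
    """Hypothetical monitor for illustration; not an SP or PSSP API."""

    threshold: float
    readings: Dict[str, float] = field(default_factory=dict)
    subscribers: List[Callable[[str, float], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[str, float], None]) -> None:
        # Asynchronous style: register a callback to be told about problems.
        self.subscribers.append(callback)

    def record(self, node: str, value: float) -> None:
        # Store the latest reading and notify subscribers if it is too high.
        self.readings[node] = value
        if value > self.threshold:
            for callback in self.subscribers:
                callback(node, value)

    def problems(self) -> Dict[str, float]:
        # Runtime style: the administrator asks which readings exceed the limit.
        return {n: v for n, v in self.readings.items() if v > self.threshold}


# Example values below are made up for the sketch.
monitor = ConditionMonitor(threshold=90.0)
alerts = []
monitor.subscribe(lambda node, value: alerts.append((node, value)))

monitor.record("node01", 42.0)
monitor.record("node02", 97.5)

print(monitor.problems())  # direct check by whoever is watching
print(alerts)              # notifications delivered without anyone watching
```

In this sketch, the direct check reports only the conditions present when it is called, while the subscriber receives each problem as it is recorded, which is why the second style suits periods when nobody is watching the system.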