The Reliability, Availability, and Serviceability (RAS) kernel services are used to record the occurrence of hardware or software failures and to capture data about these failures. The recorded information can be examined using the errpt or trcrpt commands.
The panic kernel service is called when a catastrophic failure occurs and the system can no longer operate. The panic service performs a system dump, which captures the data areas that are registered in the Master Dump Table. The kernel and kernel extensions use the dmp_ctl kernel service to add and delete entries in the Master Dump Table and to record dump routine failures.
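The idea of registering data areas for a later dump can be illustrated with a small user-space model. This is only a sketch: it does not call dmp_ctl, and every name in it (sketch_dump_add, sketch_dump_del, sketch_dump_size, sketch_table) is hypothetical and chosen for illustration. The actual Master Dump Table layout and the dmp_ctl arguments are not described here and may differ considerably.

```c
#include <stddef.h>

/* Hypothetical user-space model of a dump table; NOT the AIX kernel API. */
#define SKETCH_MAX_ENTRIES 8

typedef struct {
    const char *name;    /* label for the data area */
    const void *addr;    /* start of the data area to capture */
    size_t      len;     /* size of the data area in bytes */
    int         in_use;  /* nonzero while the slot is registered */
} sketch_dump_entry;

sketch_dump_entry sketch_table[SKETCH_MAX_ENTRIES];

/* Register a data area. Returns 0 on success, -1 on bad input or a full table. */
int sketch_dump_add(const char *name, const void *addr, size_t len)
{
    if (name == NULL || addr == NULL || len == 0)
        return -1;
    for (int i = 0; i < SKETCH_MAX_ENTRIES; i++) {
        if (!sketch_table[i].in_use) {
            sketch_table[i].name = name;
            sketch_table[i].addr = addr;
            sketch_table[i].len = len;
            sketch_table[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Remove a registered data area by address. Returns 0 if found, -1 otherwise. */
int sketch_dump_del(const void *addr)
{
    for (int i = 0; i < SKETCH_MAX_ENTRIES; i++) {
        if (sketch_table[i].in_use && sketch_table[i].addr == addr) {
            sketch_table[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}

/* Total number of bytes a dump would capture from the registered areas. */
size_t sketch_dump_size(void)
{
    size_t total = 0;
    for (int i = 0; i < SKETCH_MAX_ENTRIES; i++) {
        if (sketch_table[i].in_use)
            total += sketch_table[i].len;
    }
    return total;
}
```

In this model, a component would add its state area when it initializes and delete the entry before the memory is freed, so that a dump never follows a stale pointer.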
The errsave and errlast kernel services are called to record an entry in the system error log when a hardware or software failure is detected.
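The following sketch shows the general shape of building an error record and handing it to a logging routine. It is a user-space stand-in, not the kernel interface: the record layout, the field sizes, and the names sketch_err_rec, sketch_errsave, and sketch_report_failure are all assumptions made for illustration, and the in-memory array only imitates the system error log.

```c
#include <string.h>

#define SKETCH_NAMESIZE 16  /* assumed width of the resource-name field */
#define SKETCH_DETAIL   8   /* assumed size of the detail-data area */
#define SKETCH_LOG_MAX  4   /* capacity of the in-memory stand-in log */

/* Hypothetical record: an identifier, the failing resource, and detail bytes. */
struct sketch_err_rec {
    unsigned int  error_id;
    char          resource_name[SKETCH_NAMESIZE];
    unsigned char detail_data[SKETCH_DETAIL];
};

/* In-memory stand-in for the system error log. */
struct sketch_err_rec sketch_log[SKETCH_LOG_MAX];
unsigned int sketch_log_count;

/* Stand-in for the logging call: copies the record into the in-memory log.
   Returns 0 on success, -1 if the record is NULL or the log is full. */
int sketch_errsave(const struct sketch_err_rec *rec)
{
    if (rec == NULL || sketch_log_count >= SKETCH_LOG_MAX)
        return -1;
    sketch_log[sketch_log_count++] = *rec;
    return 0;
}

/* Build a record for a detected failure and log it. Detail data longer than
   SKETCH_DETAIL bytes is truncated. */
int sketch_report_failure(unsigned int error_id, const char *resource,
                          const void *detail, size_t len)
{
    struct sketch_err_rec rec;

    if (resource == NULL)
        return -1;
    memset(&rec, 0, sizeof rec);
    rec.error_id = error_id;
    /* Copy at most SKETCH_NAMESIZE - 1 bytes so the name stays terminated. */
    strncpy(rec.resource_name, resource, SKETCH_NAMESIZE - 1);
    if (len > SKETCH_DETAIL)
        len = SKETCH_DETAIL;
    if (detail != NULL && len > 0)
        memcpy(rec.detail_data, detail, len);
    return sketch_errsave(&rec);
}
```

A device driver following this pattern would call something like sketch_report_failure from the code path that detects the fault, passing the few bytes of state that best identify the failure.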
The trcgenk and trcgenkt kernel services are used along with the trchook subroutine to record selected system events in the event-tracing facility.
The register_HA_handler and unregister_HA_handler kernel services are used to register high availability event handlers for kernel extensions that need to be aware of events such as processor deallocation.
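The register and unregister pattern can be sketched as a small callback registry. This is a user-space model under stated assumptions: it does not use register_HA_handler or its argument structure, and the names sketch_register, sketch_unregister, sketch_notify, and the event code SKETCH_EVENT_CPU_DEALLOC are hypothetical.

```c
#include <stddef.h>

#define SKETCH_MAX_HANDLERS      4
#define SKETCH_EVENT_CPU_DEALLOC 1  /* hypothetical event code */

/* Handler signature assumed for this sketch only. */
typedef int (*sketch_ha_func)(int event, void *priv);

typedef struct {
    sketch_ha_func func;  /* NULL marks a free slot */
    void          *priv;  /* caller data passed back to the handler */
} sketch_handler;

sketch_handler sketch_handlers[SKETCH_MAX_HANDLERS];

/* Register a handler. Returns 0 on success, -1 on bad input or a full table. */
int sketch_register(sketch_ha_func func, void *priv)
{
    if (func == NULL)
        return -1;
    for (int i = 0; i < SKETCH_MAX_HANDLERS; i++) {
        if (sketch_handlers[i].func == NULL) {
            sketch_handlers[i].func = func;
            sketch_handlers[i].priv = priv;
            return 0;
        }
    }
    return -1;
}

/* Cancel a registration. Returns 0 if the handler was found, -1 otherwise. */
int sketch_unregister(sketch_ha_func func)
{
    if (func == NULL)
        return -1;
    for (int i = 0; i < SKETCH_MAX_HANDLERS; i++) {
        if (sketch_handlers[i].func == func) {
            sketch_handlers[i].func = NULL;
            sketch_handlers[i].priv = NULL;
            return 0;
        }
    }
    return -1;
}

/* Deliver an event to every registered handler; returns how many were called. */
int sketch_notify(int event)
{
    int called = 0;
    for (int i = 0; i < SKETCH_MAX_HANDLERS; i++) {
        if (sketch_handlers[i].func != NULL) {
            sketch_handlers[i].func(event, sketch_handlers[i].priv);
            called++;
        }
    }
    return called;
}

/* Example handler: counts the events it sees in the int that priv points to. */
int sketch_count_handler(int event, void *priv)
{
    (void)event;
    *(int *)priv += 1;
    return 0;
}
```

A kernel extension following this pattern would register its handler during configuration and unregister it before unloading, so that no event is delivered to code that is no longer present.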
The RAS kernel services are:
dmp_ctl | Adds and removes entries in the Master Dump Table. |
dmp_prinit | Initializes the remote dump protocol. |
errsave and errlast | Allow the kernel and kernel extensions to write to the error log. |
panic | Crashes the system. |
register_HA_handler | Registers a High Availability Event Handler. |
trcgenk | Records a trace event for a generic trace channel. |
trcgenkt | Records a trace event, including a time stamp, for a generic trace channel. |
unregister_HA_handler | Cancels the registration of a High Availability Event Handler. |