Administration Guide
The Problem Management subsystem (pman) provides an infrastructure
for recognizing and acting on problem events in your SP system. This
infrastructure is built on an Event Management application that gives you
configurable access to Event Management client and resource monitor function
without requiring you to write C programs that use the Event Management
APIs.
For more information, see the book RS/6000 SP High Availability
Infrastructure.
- Note:
- Enabling restricted root access and setting authorization for AIX
remote commands to none might limit Problem Management. See Authorizing event response actions.
To understand the information presented in this chapter, you need to be
familiar with Event Management concepts and terminology. The following
are some important Event Management terms:
- event
- In Event Management, the notification that an expression evaluated to
true. This evaluation occurs each time an instance of a resource
variable is observed.
- expression
- In Event Management, the relational expression between a resource variable
and other elements (such as constants or the previous value of an instance of
the variable) that, when true, generates an event. An example of an
expression is X < 10 where X represents the resource variable
IBM.PSSP.aixos.PagSp.%totalfree (the
percentage of total free paging space). When the expression is true,
that is, when the total free paging space is observed to be less than 10%, the
Event Management subsystem generates an event to notify the appropriate
application.
- instance vector
- An obsolete term replaced with the term resource
identifier.
- predicate
- An obsolete term replaced with the term expression.
- rearm expression
- In Event Management, an expression used to generate an event that
alternates with an original event expression in the following way: the
event expression is used until it is true, then the rearm expression is used
until it is true, then the event expression is used, and so on. The
rearm expression is commonly the inverse of the event expression (for example,
a resource variable is on or off). It can also be used with the event
expression to define an upper and lower boundary for a condition of
interest.
- rearm predicate
- An obsolete term replaced with the term rearm expression.
- resource
- In Event Management, an entity in the system that provides a set of
services. Examples of resources include hardware entities such as
processors, disk drives, memory, and adapters, and software entities such as
database applications, processes, and file systems. Each resource in
the system has one or more attributes that define the state of the
resource.
- resource identifier
- In Event Management, a set of elements, where each element is a name/value
pair of the form name=value, whose values uniquely identify the
copy of the resource (and by extension, the copy of the resource variable) in
the system.
- resource monitor
- A program that supplies information about the resources in the
system. It can be a command, a daemon, or part of an application or
subsystem that manages any type of system resource.
- resource variable
- In Event Management, the representation of an attribute of a
resource. An example of a resource variable is
IBM.AIX.PagSp.%totalfree, which represents the
percentage of total free paging space.
IBM.AIX.PagSp specifies the resource name and
%totalfree specifies the resource attribute.
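The naming conventions in the resource variable and resource identifier entries can be illustrated with a short sketch. This is not part of Problem Management or the Event Management APIs. The function names are hypothetical, as are the NodeNum element name and the use of ";" to separate elements. The definitions above state only that each element has the form name=value and that the last component of a resource variable name is the attribute.

```python
def split_resource_variable(name):
    """Split a resource variable name into (resource name, attribute).

    The attribute is taken to be the text after the last period, as in
    the IBM.AIX.PagSp.%totalfree example.
    """
    resource_name, sep, attribute = name.rpartition(".")
    if not sep or not resource_name or not attribute:
        raise ValueError(f"not a resource variable name: {name!r}")
    return resource_name, attribute


def parse_resource_identifier(text, separator=";"):
    """Parse name=value elements into a dictionary.

    Assumption: elements are separated by ';'. The separator is chosen
    here for illustration only.
    """
    elements = {}
    for element in text.split(separator):
        name, sep, value = element.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"not a name=value element: {element!r}")
        elements[name.strip()] = value.strip()
    return elements


if __name__ == "__main__":
    print(split_resource_variable("IBM.AIX.PagSp.%totalfree"))
    # NodeNum is a hypothetical element name used only for this example.
    print(parse_resource_identifier("NodeNum=5"))
```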
For more information on Event Management services, see the book
RSCT: Event Management Programming Guide and
Reference.
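The way an event expression alternates with a rearm expression is easier to see with a small simulation. The sketch below is not the Event Management implementation. The function name generate_events is hypothetical, and the rearm threshold of 15 is an assumed value for illustration. The event expression X < 10 comes from the expression entry above. Each observation of the variable is tested against whichever expression is currently in use, and a true result generates an event and switches to the other expression.

```python
def generate_events(observations, event_expr, rearm_expr):
    """Return (index, kind, value) for each observation that generates an event.

    The event expression is used until it is true, then the rearm
    expression is used until it is true, and so on.
    """
    active, kind = event_expr, "event"
    events = []
    for index, value in enumerate(observations):
        if active(value):
            events.append((index, kind, value))
            # Switch to the other expression for later observations.
            if kind == "event":
                active, kind = rearm_expr, "rearm"
            else:
                active, kind = event_expr, "event"
    return events


if __name__ == "__main__":
    # Observed percentages of total free paging space (made-up values).
    observed = [25, 12, 9, 8, 11, 16, 14, 7]
    events = generate_events(
        observed,
        event_expr=lambda x: x < 10,   # X < 10, from the text
        rearm_expr=lambda x: x > 15,   # assumed rearm threshold
    )
    for index, kind, value in events:
        print(f"observation {index}: {kind} (X = {value})")
```

With these values, the observation of 8 generates no second event because the rearm expression is in use at that point. Using a rearm threshold above the event threshold defines the upper and lower boundary described in the rearm expression entry.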
The major components of the Problem Management subsystem are:
- A Problem Management daemon (pmand) that provides access to
events generated by Event Management on all nodes. In addition, the
following function is provided to work in conjunction with
pmand:
  - A command, pmandef, that provides for subscribing to Event
  Management events and associating actions for those events
  - A command, pmanquery, that queries the SDR for a description of a
  Problem Management subscription
  - A script, pmandefaults, that causes pmand to register
  for a set of default events
  - A script, notify_event, that mails an event notification when an
  event occurs
  - A script, log_event, that logs a record of an event to a regular
  wraparound file
- A resource monitor daemon, pmanrmd, that provides resource
variables to Event Management. The following function is provided to
work in conjunction with pmanrmd:
  - Sixteen resource variables. The resource variables are named
  IBM.PSSP.pm.User_state1 through
  IBM.PSSP.pm.User_state16.
  - A command, pmanrminput, that provides for associating values with
  the sixteen supplied resource variables.
  - A sample file, pmanrmd.conf, for configuring the
  pmanrmd daemon.
  - A command, pmanrmdloadSDR, for loading the configuration
  information into the System Data Repository.
This chapter provides information on each of the components of Problem
Management.