Administration Guide

Event Management procedures

For the most part, the Event Management subsystem runs without administrator intervention. Occasionally, however, you may need to check the status of the subsystem, or add or change some of the configuration data.

This section contains the procedures for these tasks:

Displaying the status of the Event Manager daemon
Loading non-PSSP configuration data into the SDR
Changing configuration data in the SDR
Activating the configuration data in the SDR
Changing resource variable instance limits

Displaying the status of the Event Manager daemon

You can display the operational status of the Event Manager daemon by issuing the lssrc command.

On the control workstation, enter:

```
lssrc -l -s haem.domain_name
```

where domain_name is the name of the domain of interest.

On a node, enter:

```
lssrc -l -s haem
```

In response, the lssrc command writes the status information to standard output. The information includes:

The information provided by the lssrc -s haem command (short form)
The names of any trace flags that are set
For information on these flags, see the haemtrcon command in PSSP Command and Technical Reference.
The EMCDB version string and an indication as to whether the version string is taken from the SDR or from the peer group state.
The day and time the Event Manager daemon was started.

A report on the daemon's progress through initialization:

Daemon connected to group services: TRUE/FALSE
Daemon has joined peer group:       TRUE/FALSE
Daemon communications enabled :     TRUE/FALSE

The security state of the daemon.
A count of the peer daemons that are currently in the peer group. The count does not include this daemon.
A listing of the peer group state.
The peer group state includes the EMCDB version string and the peer group security state. The peer group security state is the keyword SEC, NOSEC, or NOSECSUPPORT. The keyword might have a suffix, but it can be ignored. If the peer group is established by a pre-PSSP 3.2 version of the EM daemon, no security keyword is present.
The number and type of EM client connections. Note that when a daemon relays a request to another daemon, the sending daemon is treated as a client by the receiving daemon.
A list of each resource monitor that is defined in the EMCDB and the current status of each, as follows.
A resource monitor may have multiple executing instances; the number of the resource monitor instance is shown in the Inst column. The Type column shows the connection type (C=client, S=server, I=internal). The FD column indicates the connection status: if the file descriptor is greater than or equal to 0, a connection is open. If a resource monitor uses a shared memory segment to transfer information to the Event Manager daemon, the SHMID column shows a shared memory ID greater than or equal to 0. The PID column lists the process ID of the resource monitor, which is interpreted as follows:

ID greater than 0
Resource monitor has been successfully started by the Event Manager daemon
ID equal to 0
A resource monitor started by the daemon has terminated (or the resource monitor forked, the parent process exited and the child process is the actual resource monitor)
ID equal to -1
The resource monitor has never been started by the Event Manager daemon
ID equal to -2
The resource monitor is not startable by the Event Manager daemon.

The Locked column indicates whether a resource monitor is locked and gives the current counts of start attempts and successful connections in the form mm/nn, where mm is the count of start attempts and nn is the count of successful connections.
If a resource monitor has more than one instance, information is present in the PID and Locked columns only for instance number 0. However, the count of successful connections is for all instances of the resource monitor.
The highest file descriptor in use.
The peer daemon status.
This lists the status of peer daemons by node number, in node number order. Note that this list only includes peer daemons that have joined the peer group since the local daemon started.
Following the node number are two characters. If both characters are S, the node number is that of the node where this daemon is running. Otherwise, the characters take on the following values.
The first character is I or O where:
- I indicates that the peer on the specified node is a peer group member.
- O indicates that the peer is no longer a peer group member (but was at one time).
The second character is either A or R, where:
- A indicates that this daemon is accepting join requests from the peer on the specified node.
- R means this daemon is rejecting join requests.
A list of internal daemon counters, for use by IBM service personnel.
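
The PID values and the two-character peer daemon status described in the list above can be decoded mechanically when you review lssrc -l output. The following is a minimal sketch in POSIX shell. The function names describe_rm_pid and describe_peer_status are hypothetical helpers, not part of PSSP, and each expects a single field that you have already extracted from the output.

```shell
# Hypothetical helpers (not part of PSSP): decode single fields taken
# from "lssrc -l -s haem" output, following the descriptions above.

# Describe a value from the PID column of the resource monitor list.
describe_rm_pid() {
    case $1 in
        -2) echo "not startable by the Event Manager daemon" ;;
        -1) echo "never started by the Event Manager daemon" ;;
        ''|*[!0-9]*) echo "unrecognized PID value" ;;
        *)  # Digits only: positive means started, zero means terminated.
            if [ "$1" -gt 0 ]; then
                echo "started by the Event Manager daemon"
            else
                echo "terminated, or parent exited after fork"
            fi ;;
    esac
}

# Describe the two status characters that follow a node number.
describe_peer_status() {
    case $1 in
        SS) echo "local daemon" ;;
        IA) echo "peer group member; join requests accepted" ;;
        IR) echo "peer group member; join requests rejected" ;;
        OA) echo "former peer group member; join requests accepted" ;;
        OR) echo "former peer group member; join requests rejected" ;;
        *)  echo "unrecognized status" ;;
    esac
}
```

For example, describe_rm_pid -2 reports that the resource monitor is not startable by the daemon, and describe_peer_status OR reports a former member whose join requests are being rejected.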

Loading non-PSSP configuration data into the SDR

The default configuration data supplied for the PSSP is normally loaded into the SDR and compiled into its binary format automatically by the haemctrl script. However, if resource monitors supplied by other IBM products or by third parties are installed on the SP system, you must load the configuration data supplied with the resource monitors into the SDR and activate it. To do this:

Log in to the control workstation. Use an ID that has root authority.
Create a file in load list format with the data to be loaded, or identify the path name of the file that has been supplied.
If you are creating a new file, use the format specified in the man page for the haemloadlist file.
Set the SP_NAME environment variable to the appropriate system partition name.
Load the data into the SDR using the haemloadcfg command. Enter:
```
haemloadcfg new_loadlist
```
where new_loadlist is the path name of the load list you previously created or identified.
Activate all of the Event Management data in the SDR, including your new data, using the procedure in Activating the configuration data in the SDR.

Changing configuration data in the SDR

With an optional flag, the haemloadcfg command can replace existing objects (identified by their key attributes) in the SDR. If you want to change an object that already exists, do the following:

Log in to the control workstation. Use an ID that has root authority.
Change the object's attribute in the load list file.
For PSSP configuration data, the default configuration data is in the /usr/sbin/rsct/install/config/haemloadlist file. You must copy this file to another file and make the changes in the copy. If you have non-PSSP or third-party resource monitors, the object will be loaded from another load list file.
Do this step for as many objects as you are changing.
Set the SP_NAME environment variable to the appropriate system partition name.
Run the haemloadcfg command, specifying the -r flag and the changed load list file.
In response, the haemloadcfg command replaces each object in the SDR matched by an object in the load list file.
Activate all of the Event Management data in the SDR, including your changed data, using the procedure in Activating the configuration data in the SDR.
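
Because the shipped load list must be changed only in a copy, it can help to make that copy with a small guard. The following is a sketch in POSIX shell, not an IBM-supplied tool. The function name copy_loadlist and the /tmp path in the usage comment are hypothetical; the shipped load list path and the -r flag are the ones named in the steps above.

```shell
# Hypothetical helper: make a working copy of a load list so that the
# shipped file is never edited in place. Refuses to overwrite a file.
copy_loadlist() {
    src=$1    # load list to copy from
    dst=$2    # working copy to create
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "$dst already exists" >&2
        return 1
    fi
    cp "$src" "$dst"
}

# Example use on the control workstation (the /tmp path is a placeholder):
#   copy_loadlist /usr/sbin/rsct/install/config/haemloadlist /tmp/my_loadlist
#   ... edit /tmp/my_loadlist ...
#   haemloadcfg -r /tmp/my_loadlist
```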

Activating the configuration data in the SDR

When you have added or changed the Event Management data in the SDR, you must activate it by recompiling the EMCDB and stopping and restarting the Event Manager daemons. To do this:

Log in to the control workstation. Use an ID that has root authority.
Set the SP_NAME environment variable to the appropriate system partition name.
Recompile the EMCDB. Enter:
```
haemcfg
```
The output is placed in the staging directory in a file whose name indicates the system partition.
Stop the Event Manager daemons in this system partition.
Issue the haemctrl -k command on the control workstation and on each of the nodes in the system partition.
You can use the dsh or Sysctl commands to run the command on multiple nodes from the control workstation. For more information on using these commands, see Using dsh to run parallel management commands.
Verify that all of the Event Manager daemons in this system partition have stopped.
On the control workstation, issue the lssrc -s haem.domain_name command. On the nodes, issue the lssrc -s haem command.
You can use the dsh or Sysctl commands to run the command on multiple nodes from the control workstation. For more information on using these commands, see Using dsh to run parallel management commands.
The status of each daemon should indicate that it is inactive.
Restart the Event Manager daemons in this system partition.
Issue the haemctrl -s command on the control workstation and on each of the nodes in the system partition.
You can use the dsh or Sysctl commands to run the command on multiple nodes from the control workstation. For more information on using these commands, see Using dsh to run parallel management commands.
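
The order of operations in this procedure matters: recompile, stop, verify, then restart. As a reviewable checklist, the sequence can be sketched as a POSIX shell function that only prints the commands and runs none of them. The name plan_em_activate is hypothetical, and passing the domain name as a second argument is an assumption made for illustration; the commands and flags are those named in the steps above.

```shell
# Sketch: print, in order, the commands for activating Event Management
# configuration data. plan_em_activate is a hypothetical helper and
# executes nothing itself.
plan_em_activate() {
    partition=$1    # system partition name, used for SP_NAME
    domain=$2       # domain name used on the control workstation
    if [ -z "$partition" ] || [ -z "$domain" ]; then
        echo "usage: plan_em_activate partition domain" >&2
        return 2
    fi
    echo "export SP_NAME=$partition"   # select the system partition
    echo "haemcfg"                     # recompile the EMCDB
    echo "haemctrl -k"                 # stop: control workstation and each node
    echo "lssrc -s haem.$domain"       # verify on the control workstation
    echo "lssrc -s haem"               # verify on each node
    echo "haemctrl -s"                 # restart: control workstation and each node
}
```

The printed haemctrl and lssrc lines still have to be run on the control workstation and on every node in the system partition, for example through dsh or Sysctl as described in the steps.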

Changing resource variable instance limits

To change the limit on the number of resource variable instances accepted by the Event Management subsystem for any resource variable class, do the following:

Log in to the control workstation. Use an ID that has root authority.
Set the SP_NAME environment variable to the appropriate system partition name.
Find the load list file that contains the resource variable class definition. Resource variable classes shipped for the PSSP are found in /usr/sbin/rsct/install/config/haemloadlist.

Find the class definition in the load list file. For example, the class definition of the Recoverable Virtual Shared Disk subsystem is:

```
EM_Resource_Class
        rcClass="IBM.PSSP.VSD"
        rcResource_monitor="IBM.PSSP.harmld"
        rcObservation_interval="60"
        rcReporting_interval="10"
```

Either modify the definition in the load list file or copy the definition to a new load list file that contains only the modified definition. Copying is required if the load list file is /usr/sbin/rsct/install/config/haemloadlist. Modify the definition by adding the attribute rcInstance_limit set to the desired value. For example, the Recoverable Virtual Shared Disk class definition would be changed to:
```
EM_Resource_Class
        rcClass="IBM.PSSP.VSD"
        rcResource_monitor="IBM.PSSP.harmld"
        rcObservation_interval="60"
        rcReporting_interval="10"
        rcInstance_limit="5000"
```
to limit the number of Recoverable Virtual Shared Disk resource variable instances to 5000.
Execute the haemloadcfg command, specifying the -r flag and the name of the modified load list file.
Now, follow the procedure documented in Activating the configuration data in the SDR.
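
When the modified definition goes into a new load list file of its own, the definition can be generated rather than typed. The following POSIX shell sketch prints a class definition in the layout shown above with rcInstance_limit added. The function name em_class_with_limit and the output path in the usage comment are hypothetical, and the eight-space indentation simply mirrors the example; check the haemloadlist man page for the exact format rules.

```shell
# Hypothetical helper: print an EM_Resource_Class definition that
# includes an rcInstance_limit attribute.
# Arguments: class, resource monitor, observation interval,
#            reporting interval, instance limit.
em_class_with_limit() {
    if [ "$#" -ne 5 ]; then
        echo "usage: em_class_with_limit class monitor obs rep limit" >&2
        return 2
    fi
    printf 'EM_Resource_Class\n'
    printf '        rcClass="%s"\n' "$1"
    printf '        rcResource_monitor="%s"\n' "$2"
    printf '        rcObservation_interval="%s"\n' "$3"
    printf '        rcReporting_interval="%s"\n' "$4"
    printf '        rcInstance_limit="%s"\n' "$5"
}

# Example use (the /tmp path is a placeholder):
#   em_class_with_limit IBM.PSSP.VSD IBM.PSSP.harmld 60 10 5000 > /tmp/vsd_loadlist
#   haemloadcfg -r /tmp/vsd_loadlist
```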
