Purpose
haemctrl - A control script that starts the Event Management subsystem.
Syntax
haemctrl {-a | -s | -k | -d | -c | -u | -t | -o | -r | -h}
Flags
Operands
None.
Description
Event Management is a distributed subsystem of RSCT that provides a set of high availability services for the IBM RS/6000 SP. By matching information about the state of system resources with information about resource conditions that are of interest to client programs, it creates events. Client programs can use events to detect and recover from system problems, thus enhancing the availability of the SP system.
The haemctrl control script controls the operation of the Event Management subsystem. The subsystem is under the control of the System Resource Controller (SRC) and belongs to a subsystem group called haem. Associated with each subsystem is a daemon.
The haemctrl script also controls the operation of the AIX Resource Monitor subsystem. The subsystem is under SRC control and also belongs to the haem subsystem group. Associated with each subsystem is a daemon.
Instances of the Event Management and AIX Resource Monitor subsystems execute on the control workstation and on every node of a system partition. Because Event Management provides its services within the scope of a system partition, these subsystems are said to be system partition-sensitive. This control script operates in a manner similar to the control scripts of other system partition-sensitive subsystems. It can be issued from either the control workstation or any of the system partition's nodes.
From an operational point of view, the Event Management subsystem group is organized as follows:
haem
The haem subsystem is associated with the haemd daemon.
The subsystem name on the nodes is haem. There is one of each subsystem per node and it is associated with the system partition to which the node belongs.
On the control workstation, there are multiple instances of each subsystem, one for each system partition. Accordingly, the subsystem names on the control workstation have the system partition name appended to them. For example, for system partitions named sp_prod and sp_test, the subsystems on the control workstation are named haem.sp_prod and haem.sp_test.
haemaixos
The haemaixos subsystem is associated with the harmad daemon.
The subsystem name on the nodes is haemaixos. There is one of each subsystem per node and it is associated with the system partition to which the node belongs.
On the control workstation, there are multiple instances of each subsystem, one for each system partition. Accordingly, the subsystem names on the control workstation have the system partition name appended to them. For example, for system partitions named sp_prod and sp_test, the subsystems on the control workstation are named haemaixos.sp_prod and haemaixos.sp_test.
The haemd daemon provides the Event Management services. The harmad daemon is the resource monitor for AIX operating system resources.
The haemctrl script is not normally executed from the command line. It is normally called by the syspar_ctrl command during installation of the system, and partitioning or repartitioning of the system.
The haemctrl script provides a variety of controls for operating the Event Management subsystem:
Before performing any of these functions, the script obtains the current system partition name and IP address (using the spget_syspar command) and the node number (using the node_number command). If the node number is zero, the control script is running on the control workstation.
Except for the clean and unconfigure functions, all functions are performed within the scope of the current system partition.
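The naming rule described earlier (a node number of zero means the control workstation, where the system partition name is appended to the subsystem name) can be sketched as a small shell function. This is an illustration only, not the logic of haemctrl itself: subsys_name is a hypothetical helper, and the node number and partition name are passed as arguments rather than obtained from the node_number and spget_syspar commands.

```shell
# Hypothetical helper: build the SRC subsystem name for a given location.
# $1 = base subsystem name (haem or haemaixos)
# $2 = node number (0 means the control workstation)
# $3 = system partition name
subsys_name() {
    if [ "$2" -eq 0 ]; then
        # Control workstation: one instance per partition, name is suffixed.
        printf '%s.%s\n' "$1" "$3"
    else
        # Node: a single instance, plain name.
        printf '%s\n' "$1"
    fi
}
```

For example, subsys_name haem 0 sp_prod yields the control workstation name for partition sp_prod, while any nonzero node number yields the plain name haem.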
Adding the Subsystem
When the -a flag is specified, the control script uses the mkssys command to add the Event Management and AIX Resource Monitor subsystems to the SRC. The control script operates as follows:
The service name that is entered in the /etc/services file is haem.syspar_name.
For more information about configuring Event Management data, see the IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference.
Then it gets the port number for the subsystem from the SP_ports class of the System Data Repository (SDR) and ensures that the port number is set in the /etc/services file. This port number is used for remote connections to Event Management daemons that are running on the control workstation. If there is no port number in the SDR, the script obtains one and sets it in the /etc/services file. The range of valid port numbers is 10000 to 10100, inclusive.
The service name is haemd.
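The port handling described above can be approximated with two small shell functions. This is a sketch under stated assumptions, not the script's own code: port_in_range and service_port are hypothetical names, and the lookup assumes the usual services-file layout of a service name followed by port/protocol.

```shell
# Hypothetical check: is a port inside the range 10000 to 10100, inclusive?
port_in_range() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -ge 10000 ] && [ "$1" -le 10100 ]
}

# Hypothetical lookup: print the port registered for a service name.
# $1 = service name, $2 = path of a services-format file
service_port() {
    awk -v name="$1" '$1 == name { split($2, a, "/"); print a[1]; exit }' "$2"
}
```

A call such as service_port haem.sp_prod /etc/services prints nothing when the service is not registered, which corresponds to the case in which a port number still has to be obtained.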
Starting the Subsystem
When the -s flag is specified, the control script uses the startsrc command to start the Event Management subsystem, haem, and the AIX Resource Monitor subsystem, haemaixos.
Stopping the Subsystem
When the -k flag is specified, the control script uses the stopsrc command to stop the Event Management subsystem, haem, and the AIX Resource Monitor subsystem, haemaixos.
Deleting the Subsystem
When the -d flag is specified, the control script uses the rmssys command to remove the Event Management and AIX Resource Monitor subsystems from the SRC. The control script operates as follows:
Cleaning Up the Subsystems
When the -c flag is specified, the control script stops and removes the Event Management subsystems for all system partitions from the SRC. The control script operates as follows:
Unconfiguring the Subsystems
When the -u flag is specified, the control script performs the function of the -c flag in all system partitions and then removes all port numbers from the SDR allocated by the Event Management subsystems.
Prior to executing the haemctrl command with the -u flag on the control workstation, the haemctrl command with the -c flag must be executed from all of the nodes. If this subsystem is not successfully cleaned from all of the nodes, different port numbers may be used by this subsystem, leading to undefined behavior.
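The required ordering (clean every node first, then unconfigure on the control workstation) can be written down as a dry-run plan. The function below is a hypothetical sketch: unconfigure_plan is not part of PSSP, the node names are whatever the caller supplies, and nothing is executed; it only prints the sequence.

```shell
# Hypothetical planner: print the order of operations for a full unconfigure.
# Arguments are node names; no command is actually run.
unconfigure_plan() {
    for node in "$@"; do
        # Every node must be cleaned before the control workstation step.
        printf 'on %s: haemctrl -c\n' "$node"
    done
    printf 'on control workstation: haemctrl -u\n'
}
```

How each command is delivered to its node is left open here; the point is only that the -u step comes last.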
Turning Tracing On
When the -t flag is specified, the control script turns tracing on for the haemd daemon, using the haemtrcon command. Tracing for the harmad daemon is also enabled, using the traceson command.
Turning Tracing Off
When the -o flag is specified, the control script turns tracing off for the haemd daemon, using the haemtrcoff command. Tracing for the harmad daemon is also disabled, using the tracesoff command.
Refreshing the Subsystem
When the -r flag is specified, the control script refreshes the subsystem using the refresh command. This results in the Event Management subsystem attempting to use the current SP Trusted Services authentication methods. Note that this command only initiates the refresh operation. Use the lssrc -ls haem.syspar_name command on the control workstation or the lssrc -ls haem command on a node to determine the current security state of the Event Management subsystem within the system partition. See "The Event Management subsystem" chapter in the PSSP: Administration Guide for further information.
Logging
While it is running, the Event Management daemon normally provides information about its operation and errors by writing entries to the AIX error log. If it cannot, errors are written to a log file called /var/ha/log/em.default.syspar_name.
Files
Standard Error
This command writes error messages (as necessary) to standard error.
Exit Values
Security
You must have root privilege and write access to the SDR to run this command.
Implementation Specifics
This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).
Prerequisite Information
"The Event Management subsystem" chapter of PSSP: Administration Guide
IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference
AIX Commands Reference
Information about the System Resource Controller (SRC) in AIX General Programming Concepts: Writing and Debugging Programs
Location
/usr/sbin/rsct/bin/haemctrl
Related Information
Commands: haemcfg, haemd, haemloadcfg, haemtrcoff, haemtrcon, lssrc, startsrc, stopsrc, syspar_ctrl
Examples
haemctrl -a
haemctrl -s
haemctrl -k
haemctrl -d
haemctrl -c
haemctrl -u
haemctrl -t
haemctrl -o
lssrc -g haem
lssrc -s haem
To display the status of an individual Event Management subsystem on the control workstation, enter:
lssrc -s haem.syspar_name
where syspar_name is the system partition name.
lssrc -l -s haem
To display detailed status about an individual Event Management subsystem on the control workstation, enter:
lssrc -l -s haem.syspar_name
where syspar_name is the system partition name.
In response, the system returns information that includes the running status of the subsystem, the settings of trace flags, the version number of the Event Management Configuration Database, the time the subsystem was started, the connection status to Group Services and peer Event Management subsystems, and the connection status to Event Management clients, if any.
lssrc -a
Purpose
haemd - The Event Manager daemon, which observes resource variable instances that are updated by Resource Monitors and generates and reports events to client programs.
Syntax
haemd
Flags
No specifiable flags.
Operands
No specifiable operands.
Description
The haemd daemon is the Event Manager daemon. The daemon observes resource variable instances that are updated by Resource Monitors and generates and reports events to client programs.
One instance of the haemd daemon executes on the control workstation for each system partition. An instance of the haemd daemon also executes on every node of a system partition. The haemd daemon is under System Resource Controller (SRC) control.
Because the daemon is under SRC control, it cannot be started directly from the command line. It is normally started by the haemctrl command, which is in turn called by the syspar_ctrl command during installation of the system, and partitioning or repartitioning of the system. If you must start or stop the daemon directly, use the haemctrl command.
When SRC creates the haemd daemon, the actual program started is haemd_SP. The haemd_SP program, after collecting information needed by the daemon, then executes the haemd program. In other words, the haemd_SP program is replaced by the haemd program in the process created by SRC.
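The replacement of one program by another within the same process is the standard exec pattern. The following shell sketch illustrates the pattern only; it makes no claim about how haemd_SP is implemented, and exec_replacement_demo is a hypothetical name. The outer shell prints its process ID and then uses exec to become a second shell, which prints its own process ID.

```shell
# Illustration of process replacement with exec (not haemd_SP's own code).
# Prints two lines; both are process IDs.
exec_replacement_demo() {
    sh -c 'echo $$; exec sh -c '\''echo $$'\'
}
```

Because exec replaces the program image without creating a new process, both lines show the same process ID, which is why the SRC continues to track the daemon under the process it originally created.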
For more information about the Event Manager daemon, see the haemctrl man page.
Implementation Specifics
This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).
Prerequisite Information
"The Event Management subsystem" chapter of PSSP: Administration Guide
IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference
AIX Commands Reference
Information about the System Resource Controller (SRC) in AIX General Programming Concepts: Writing and Debugging Programs
Location
/usr/sbin/rsct/bin/haemd
Related Information
Commands: haemctrl and haemd_SP
Examples
See the haemctrl command.
Purpose
haemd_SP - Start-up program for the Event Manager daemon.
Syntax
haemd_SP [-T group_name] [ -d trace_arg] ... [syspar_IPaddr ]
Flags
Operands
Description
The haemd_SP program is the start-up program for the haemd daemon. When the Event Management subsystem is configured in the System Resource Controller (SRC) by the haemctrl command, haemd_SP is specified as the program to be started. The syspar_IPaddr argument is configured if necessary.
This program can only be invoked by the SRC. To start the Event Management subsystem use the haemctrl command.
The -d flag should only be used under the direction of the IBM Support Center. The possible trace arguments are the same as for the haemtrcon command, except for regs and dinsts. To use this flag the haem subsystem definition (haem.syspar_name on the control workstation) in the SRC must be changed using the chssys command with the -a argument. Then the daemon must be stopped and then restarted.
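As a sketch of the argument handling involved, the helper below builds an argument string with one -d flag per trace argument. It is hypothetical (build_haemd_args is not a PSSP command) and assumes that chssys -a replaces the stored argument string as a whole, so any arguments already present in the subsystem definition are carried over explicitly.

```shell
# Hypothetical helper: prepend one -d flag per trace argument to the
# arguments the subsystem is already defined with.
# $1 = existing argument string (may be empty); remaining args = trace arguments
build_haemd_args() {
    existing=$1
    shift
    out=
    for t in "$@"; do
        out="${out:+$out }-d $t"
    done
    if [ -n "$existing" ]; then
        # Keep the previously configured arguments after the -d flags.
        out="${out:+$out }$existing"
    fi
    printf '%s\n' "$out"
}
```

Under these assumptions, the result could be passed as chssys -s haem -a "$(build_haemd_args "$current_args" init config)", where current_args is a hypothetical variable holding the arguments already in the definition, followed by stopping and restarting the daemon. As stated above, do this only under the direction of the IBM Support Center.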
The -T flag can be used when testing a new resource monitor. Refer to "Coding and testing the Resource Monitor, Alternative Testing Method" in Chapter 1 of the IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference.
Implementation Specifics
This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).
Prerequisite Information
"The Event Management subsystem" chapter of PSSP: Administration Guide
IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference
AIX Commands Reference
Information about the System Resource Controller (SRC) in AIX General Programming Concepts: Writing and Debugging Programs
Location
/usr/sbin/rsct/bin/haemd_SP
Related Information
Commands: haemctrl, haemd, haemtrcon
Examples
See the haemctrl command.
Purpose
haemloadcfg - Loads Event Management configuration data into the System Data Repository (SDR).
Syntax
haemloadcfg [-d] [-r] loadlist_file
Flags
Operands
Description
The haemloadcfg utility command loads Event Management configuration data into the SDR. Note that before you invoke haemloadcfg, you must ensure that the SP_NAME environment variable is set to the appropriate system partition name.
The configuration data is contained in a load list file, whose format is described by the man page for the haemloadlist file. For details on the SDR classes and attributes that you can use to specify Event Management configuration data, see IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference.
To load the default Event Management configuration data for PSSP, specify the load list file as /usr/sbin/rsct/install/config/haemloadlist.
To add Event Management configuration data for other Resource Monitors, create a file in load list format and specify its name on the command.
Without any flags, the haemloadcfg command does not replace existing objects in the SDR. The data in the load list file is matched with the existing objects in the SDR based on key attributes, as follows:
Note that the way in which the haemloadcfg command handles existing SDR objects is different from the way in which the SDRCreateObjects command handles them. The SDRCreateObjects command creates a new object as long as the attributes, taken as a group, are unique.
To change a nonkey attribute of an Event Management object that already exists in the SDR, change the attribute in the load list file. Then run the haemloadcfg command using the -r flag and the name of the load list file. All objects in the SDR are replaced by matching objects in the load list file using the key attributes to match. Any unmatched objects in the load list file are added to the SDR.
To delete Event Management objects from the SDR, create a load list file with the objects to be deleted. Only the key attributes need to be specified. Then run the haemloadcfg command using the -d flag and the name of the load list file. All objects in the SDR that match objects in the load list file are deleted. Objects in the load list file that match nothing in the SDR are not added.
Under any circumstances, duplicate objects in the load list file, based on matches in key attributes, are ignored. However, such duplicate objects are written to standard output.
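The three modes described above (add without replacing, replace with -r, delete with -d) can be summarized in a dry-run wrapper. This is a hypothetical sketch: loadcfg_cmd is not part of PSSP, and it only prints the command line it would use. It also refuses to proceed when SP_NAME is unset, reflecting the requirement stated earlier.

```shell
# Hypothetical wrapper: print (not run) the haemloadcfg command for a mode.
# $1 = add | replace | delete, $2 = load list file
loadcfg_cmd() {
    mode=$1
    file=$2
    if [ -z "${SP_NAME:-}" ]; then
        echo "SP_NAME must be set to the system partition name" >&2
        return 1
    fi
    case "$mode" in
        add)     echo "haemloadcfg $file" ;;     # existing objects are kept
        replace) echo "haemloadcfg -r $file" ;;  # matching objects are replaced
        delete)  echo "haemloadcfg -d $file" ;;  # matching objects are deleted
        *)       echo "unknown mode: $mode" >&2; return 2 ;;
    esac
}
```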
This release of RS/6000 Cluster Technology has changed (from PSSP release 2.4) several names in the SDR Event Management configuration data:
Old Class Name | New Class Name
---|---
EM_Instance_Vector | EM_Resource_ID

Old Attribute Name | New Attribute Name
---|---
ivResource_name | riResource_name
ivElement_name | riElement_name
ivElement_description | riElement_description
rvPredicate | rvExpression
rvIndex_vector | rvIndex_element
If there is configuration data present in the SDR from a prior release, the haemloadcfg command automatically migrates the data from the old names to the new names the first time the command is executed. After successful migration the objects in the EM_Instance_Vector class are deleted.
Note that rvExpression and rvIndex_element are added to the definition of the EM_Resource_Variable class; rvPredicate and rvIndex_vector are still present in this class but are no longer used after migration.
For compatibility the haemloadcfg command accepts load list files using the old class and attribute names.
Files
Standard Error
This command writes error messages (as necessary) to standard error.
Exit Values
Security
You must have root privilege and write access to the SDR to run this command.
You should be running on the control workstation. Before running this command, you must set the SP_NAME environment variable to the appropriate system partition name.
Implementation Specifics
This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).
For a general overview of configuring Event Management, see "The Event Management subsystem" chapter of PSSP: Administration Guide.
For details on the System Data Repository classes and attributes for the Event Management Configuration Database, see IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference.
Location
/usr/sbin/rsct/install/bin/haemloadcfg
Related Information
Commands: haemcfg, SDRCreateObjects, SDRDeleteObjects
Files: haemloadlist
Also, for a description of the SDR classes for Event Management configuration data, see IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference.
Examples
haemloadcfg /usr/sbin/rsct/install/config/haemloadlist
haemloadcfg /usr/local/config/newrmloadlist
If nonkey attributes in this load list file are later changed, update the SDR by entering:
haemloadcfg -r /usr/local/config/newrmloadlist
If this new Resource Monitor is no longer needed, its configuration data is removed from the SDR by entering:
haemloadcfg -d /usr/local/config/newrmloadlist
Purpose
haemqvar - Queries resource variables.
Syntax
Flags
Operands
Description
The haemqvar command queries the Event Management subsystem for information about resource variables. By default, the command writes to standard output the definitions for all resource variables in the current SP domain, that is, the current SP system partition as defined by the SP_NAME environment variable. If SP_NAME is not set, the default system partition is used. The -S flag can be used to specify another SP domain (system partition). To query variables in an HACMP domain, use the -H flag. For an SP domain, the domain flag argument is a system partition name. For an HACMP domain, the domain flag argument is an HACMP cluster name. When the -H flag is specified, the command must be executed on one of the nodes in the HACMP cluster.
The following information is reported for each resource variable definition:
Since the default behavior of this command can produce a large amount of output, standard output should be redirected to a file.
If the -d flag is specified only the resource variable name and a short description are written to standard output, one name and description per line.
If the -c flag is specified the current values of all resource variables instances are written to standard output, one per line. The line of output contains the location of the resource variable instance (node number), the resource variable name, the resource ID of the instance and the resource variable instance value. If the resource variable is a Structured Byte String (SBS) data type, then the value of each SBS field is reported.
The -i flag reports the same information as the -c flag except that the value of the variable instance is the last known value rather than the current value. The -i flag is useful for determining what resource variable instances exist.
For both the -c and the -i flags, if an error is encountered in obtaining information about a resource variable instance, the output line contains an error message, symbolic error codes, the location of where the error originated (if it can be determined), the resource variable name and the resource ID.
To return information about specific resource variables, specify the class, var and rsrcID operands. These operands can be repeated to specify additional resource variables. In addition, the var and rsrcID operands can be wildcarded to match a number of resource variables. Note that null string operands or an asterisk must be quoted in the shells.
If class is not a null string, then all variables in the specified class, as further limited by the var and rsrcID arguments, are targets of the query. If class is a null string, then variables of all classes, as further limited by the var and rsrcID arguments, are targets of the query.
The var argument can be wildcarded in one of two ways:
When the resource variable name is wildcarded in the first manner, then all resource variables, as further limited by the class and rsrcID arguments, are targets of the query. When the resource variable name is wildcarded in the second manner, all resource variables whose high-order (leftmost) components match the var argument, as further limited by the class and rsrcID arguments, are targets of the query.
All resource variable instances (or definitions if neither the -c nor the -i flags are specified) of the variables specified by the class and var arguments that match the rsrcID argument are the targets of the query.
If neither the -c nor the -i flags are specified, the rsrcID argument is a semicolon-separated list of resource ID element names. If either the -c or the -i flags are specified, the rsrcID argument is a semicolon-separated list of name/value pairs. A name/value pair consists of a resource ID element name followed by an equal sign followed by a value of the resource ID element. An element value may consist of a single value, a range of values, a comma-separated list of single values or a comma-separated list of ranges. A range takes the form a-b and is valid only for resource ID elements of type integer (the type information can be obtained from the variable definition). There can be no blanks in the resource ID.
A resource ID element is wildcarded by specifying its value as the asterisk character. Only variables that are defined to contain the elements, and only the elements, specified in the rsrcID argument are targets of the query. If any element of the resource ID consists of the asterisk character, rather than a name/value pair (or just a name if querying for definitions), all variables that are defined to contain at least the remaining specified elements are targets of the query. The entire resource ID is wildcarded if it consists of only the asterisk character; all instances of all resource variables, as further limited by the class and var arguments, are targets of the query.
Note that the rsrcID argument must be quoted in the shells if it contains semicolons or asterisks.
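Assembling a resource ID by hand is error-prone because of the semicolons and the no-blanks rule. The helper below is a minimal sketch, with join_rsrcid as a hypothetical name: it joins its arguments with semicolons and rejects any element that contains a blank or a tab. It does not validate element names or value ranges.

```shell
# Hypothetical helper: join resource ID elements into one rsrcID argument.
# Each argument is an element name, a name=value pair, or a bare '*'.
join_rsrcid() {
    out=
    for elem in "$@"; do
        case "$elem" in
            *[[:space:]]*)
                echo "blanks are not allowed in a resource ID: $elem" >&2
                return 1 ;;
        esac
        out="${out:+$out;}$elem"
    done
    printf '%s\n' "$out"
}
```

The result still has to be quoted when passed on a command line, for example "$(join_rsrcid 'NodeNum=*' VG=rootvg LV=hd4)", so that the shell does not interpret the semicolons or asterisks.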
The class, var and rsrcID operands can be placed in a file, one set of operands per line, instead of being specified as command arguments. Use the -f flag to specify the name of the file to the command. If the -f flag is used, any operands to the command are ignored. Within the file, null strings are specified as two adjacent double quote characters and a completely wildcarded resource ID can either be a single asterisk or a double quoted asterisk ("*"). On each line the arguments must be separated by white space (blanks or tabs).
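A file for the -f flag might look like the one written by the sketch below. The helper name write_operand_file is hypothetical, and the two lines are illustrative: the first names a specific variable (taken from the Examples section) with a completely wildcarded resource ID, and the second uses null strings for class and var with a double-quoted asterisk.

```shell
# Hypothetical helper: write a sample operand file for use with -f.
# $1 = path of the file to create
# Each line holds class, var and rsrcID, separated by white space.
write_operand_file() {
    cat > "$1" <<'EOF'
"" IBM.PSSP.aixos.FS.%totused *
"" "" "*"
EOF
}
```

The quoted here-document delimiter keeps the shell from expanding anything inside the file body, so the double quotes, percent sign and asterisks are written exactly as shown.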
Following are some examples of using wildcards in the rsrcID argument:
NodeNum=5;VG=rootvg;LV=hd4
NodeNum=*;VG=rootvg;LV=hd4
NodeNum=*;VG=*;LV=*
NodeNum=9
NodeNum=*
NodeNum=9;VG=*;*
NodeNum=*;*
For these examples, assume the class and var arguments are null strings. If either the class or var arguments or both are not null strings, targets for the query are restricted accordingly.
In the first three examples, all variables whose resource IDs are defined to contain the elements NodeNum, VG and LV, and only those elements, are matched. In the first example, only one instance is matched. In the second example, one instance from each node is matched. In the third example, all instances of the matching resource variables are matched.
In the fourth example, all variables whose resource IDs are defined to contain only the element NodeNum are matched. The instances matched are associated with node 9. In the fifth example, the same set of variables is matched, but all instances of each variable are matched.
In the sixth example, all variables whose resource IDs are defined to contain elements NodeNum and VG, as well as zero or more additional elements, are matched. The instances matched are associated with node 9. In the last example, all variables whose resource IDs are defined to contain the element NodeNum, as well as zero or more additional elements, are matched. All instances of the variables are matched.
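The rule for deciding which variables are targets, based on the elements their resource IDs are defined to contain, can be modeled in a few lines of shell. This is a simplified sketch of the rule as described here, not haemqvar's own code: rsrcid_matches_def is a hypothetical name, it looks only at element names (values are ignored), and it does not select instances.

```shell
# Hypothetical model of definition matching.
# $1 = rsrcID argument, $2 = space-separated element names a variable defines
# Exit status 0 means the variable is a target of the query.
rsrcid_matches_def() (
    pattern=$1
    defined=$2
    [ "$pattern" = '*' ] && exit 0      # entire resource ID wildcarded
    set -f                               # do not glob-expand '*' while splitting
    open=0
    wanted=0
    for elem in $(printf '%s' "$pattern" | tr ';' ' '); do
        if [ "$elem" = '*' ]; then
            open=1                       # bare '*': additional elements allowed
            continue
        fi
        name=${elem%%=*}                 # drop "=value" when present
        case " $defined " in
            *" $name "*) wanted=$((wanted + 1)) ;;
            *) exit 1 ;;                 # named element not in the definition
        esac
    done
    [ "$open" -eq 1 ] && exit 0
    set -- $defined
    [ "$wanted" -eq "$#" ]               # otherwise the element sets must agree
)
```

With a definition of NodeNum, VG and LV, the rsrcID NodeNum=9 does not match (the definition has extra elements), while NodeNum=9;VG=*;* does, which mirrors the fourth and sixth examples. The function body runs in a subshell so that set -f and the working variables do not leak into the caller.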
Given the flexibility in specifying resource variables for query, it is possible that no resource variable instance or resource variable definition will match. If there is no match, appropriate error information is reported, either in the form described above or as follows. If the specification of the class, var or rsrcID argument is in error, the output line contains an error message, symbolic error codes and the specified class name, resource variable name and resource ID.
Security
You must have Event Manager access to run this command. See PSSP: Administration Guide for more information.
Implementation Specifics
This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).
Location
/usr/sbin/rsct/bin/haemqvar
Related Information
IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference
"The Event Management subsystem" chapter in PSSP: Administration Guide
Examples
haemqvar > vardefs.out
haemqvar -H HAcluster -d "" "" "VG;*"
To obtain resource variables whose resource IDs contain only the elements VG and NodeNum, enter:
haemqvar -H HAcluster -d "" "" "VG;NodeNum"
haemqvar -c "" IBM.PSSP.aixos.FS.%totused "VG=rootvg;LV=hd3;*"
Purpose
haemtrcoff - Turns tracing off for the Event Manager daemon.
Syntax
haemtrcoff -s subsys_name -a trace_list
Flags
Operands
The following trace arguments may be specified:
Description
The haemtrcoff command is used to turn tracing off for specified activities of the Event Manager daemon. Trace output is placed in an Event Management trace log for the system partition.
Use this command only under the direction of the IBM Support Center. It provides information for debugging purposes and may degrade the performance of the Event Management subsystem or anything else that is running in the system partition. Do not use this command during normal operation.
Files
Security
You must have root privilege to run this command.
Implementation Specifics
This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).
Prerequisite Information
"The Event Management subsystem" chapter of PSSP: Administration Guide
Location
/usr/sbin/rsct/bin/haemtrcoff
Related Information
Commands: haemctrl, haemd, haemtrcon
Examples
In the following examples, the SP system has two system partitions named sp_prod and sp_test. The instances of the Event Management subsystem on the control workstation of the SP are named haem.sp_prod and haem.sp_test, respectively. The instance of the Event Management subsystem that runs on any node of either system partition is named haem.
haemtrcoff -s haem.sp_prod -a all
haemtrcoff -s haem -a all
haemtrcoff -s haem.sp_test -a init,config
Purpose
haemtrcon - Turns tracing on for the Event Manager daemon.
Syntax
haemtrcon -s subsys_name -a trace_list
Flags
Operands
The following trace arguments may be specified:
Description
The haemtrcon command is used to turn tracing on for specified activities of the Event Manager daemon. Trace output is placed in an Event Management trace log for the system partition. When used, the regs, dinsts, iolists, and olists arguments perform a one-time trace. The specified information is placed in the trace log, but no further tracing is done.
Use this command only under the direction of the IBM Support Center. It provides information for debugging purposes and may degrade the performance of the Event Management subsystem or anything else that is running in the system partition. Do not use this command to turn tracing on during normal operation.
Files
Security
You must have root privilege to run this command.
Implementation Specifics
This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).
Prerequisite Information
"The Event Management subsystem" chapter of PSSP: Administration Guide
Location
/usr/sbin/rsct/bin/haemtrcon
Related Information
Commands: haemctrl, haemd, haemtrcoff
Examples
In the following examples, the SP system has two system partitions named sp_prod and sp_test. The instances of the Event Management subsystem on the control workstation of the SP are named haem.sp_prod and haem.sp_test, respectively. The instance of the Event Management subsystem that runs on any node of either system partition is named haem.
haemtrcon -s haem.sp_prod -a all
haemtrcon -s haem -a all
haemtrcon -s haem.sp_test -a init,config