
Command and Technical Reference, Volume 1

ha_vsd

Purpose

ha_vsd - Starts and restarts the Recoverable Virtual Shared Disk subsystem. This includes configuring virtual shared disks and hashed shared disks as well as activating the recoverability subsystem.

Syntax

ha_vsd [reset]

Flags

None.

Operands

reset
Stops and restarts the IBM Recoverable Virtual Shared Disk subsystem by stopping the Recoverable Virtual Shared Disk and hc subsystems and then starting them again.

Description

Use this command to start the IBM Recoverable Virtual Shared Disk software after you install it, or, with the reset option, to stop and restart the program.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error occurred.
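
For example, a minimal Korn shell sketch (an illustration only, using the full path shown under Location) that restarts the subsystem and acts on the documented exit values:

#!/bin/ksh
# Restart the Recoverable Virtual Shared Disk subsystem and check the result.
/usr/lpp/csd/bin/ha_vsd reset
if [ $? -eq 0 ]; then
    echo "rvsd subsystem restarted successfully"
else
    echo "ha_vsd reset failed" >&2
    exit 1
fi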

Security

You must have root privilege and write access to the SDR to run this command.

Implementation Specifics

This command is part of the IBM Recoverable Virtual Shared Disk option of PSSP.

Prerequisite Information

See PSSP: Managing Shared Disks.

Location

/usr/lpp/csd/bin/ha_vsd

Related Information

Commands: ha.vsd, hc.vsd

Examples

To stop the Recoverable Virtual Shared Disk subsystem and restart it, enter:

ha_vsd reset

ha.vsd

Purpose

ha.vsd - Queries and controls the activity of the rvsd daemon of the Recoverable Virtual Shared Disk subsystem.

Syntax

ha.vsd
{adapter_recovery [on | off] | debug [off] | mksrc | query | quorum n | qsrc | refresh [noquorum] | reset | reset_quorum | rmsrc | start | stop | trace [off]}

Flags

None.

Operands

adapter_recovery [on | off]
Enables or disables communication adapter recovery. The default is on.

The Recoverable Virtual Shared Disk subsystem must be restarted for this operand to take effect.

debug [off]
Specify debug to redirect the Recoverable Virtual Shared Disk subsystem's standard output and standard error to the console and to prevent the subsystem from being respawned if it exits with an error. (You can use the lscons command to determine the current console.)

The Recoverable Virtual Shared Disk subsystem must be restarted for this operand to take effect.

After debugging has been turned on and the Recoverable Virtual Shared Disk subsystem has been restarted, issue ha.vsd trace to turn on tracing.

Use this operand under the direction of your IBM service representative.

Note: The default when the node is booted is to have standard output and standard error routed to the console. If debugging is turned off, standard output and standard error are routed to /dev/null and all further trace messages are lost. You can determine whether debug has been turned on by issuing ha.vsd qsrc. If debug has been turned on, the return value will be:

action = "2"
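
A minimal sketch of the debug workflow described above, to be used only under the direction of your IBM service representative; the restart steps use the reset operand documented below:

# Turn on debugging, restart the subsystem, then turn on tracing.
ha.vsd debug       # route standard output and standard error to the console
ha.vsd reset       # restart so that the debug setting takes effect
ha.vsd trace       # start tracing (the subsystem must be active)

# When finished, turn tracing and debugging off again.
ha.vsd trace off
ha.vsd debug off
ha.vsd reset       # restart so that the debug off setting takes effect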

mksrc
Uses mkssys to create the Recoverable Virtual Shared Disk subsystem.

query
Displays the current status of the Recoverable Virtual Shared Disk subsystem in detail.

quorum n
Sets the value of the quorum, the number of nodes that must be active to direct recovery. Usually, quorum is defined as a majority of the nodes that are defined as virtual shared disk nodes in a system partition, but this command allows you to override that definition. The Recoverable Virtual Shared Disk subsystem must be in the active state when you issue this command. This is not a persistent change.
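
For example, a minimal sketch that temporarily overrides the quorum and later restores the default (using the reset_quorum operand described below):

ha.vsd quorum 5       # override the quorum while the subsystem is active
ha.vsd query          # confirm the new quorum value
ha.vsd reset_quorum   # restore the default quorum when finished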

qsrc
Displays the System Resource Controller (SRC) configuration of the Recoverable Virtual Shared Disk daemon.

refresh [noquorum]
Uses the refresh command to start, asynchronously, a refresh protocol on all running Recoverable Virtual Shared Disk subsystems. The quorum is reset before the refresh occurs, unless noquorum is specified. Use ha.vsd query to check for completion, as shown in the example after this list. The following items are refreshed in the device driver:

  1. nodes that have been added or deleted
  2. vsds and hsds that have been added or deleted
  3. changed vsd attributes:
    • option (cache | nocache)
    • size_in_MB
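
A minimal sketch of the refresh sequence described above:

ha.vsd refresh            # reset the quorum, then start the refresh protocol asynchronously
ha.vsd query              # check for completion of the refresh
# or, to skip the quorum reset:
# ha.vsd refresh noquorum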

reset
Stops and restarts the Recoverable Virtual Shared Disk subsystem.

reset_quorum
Resets the default quorum.

rmsrc
Uses rmssys to remove the Recoverable Virtual Shared Disk subsystem.

start
Starts the Recoverable Virtual Shared Disk subsystem.

stop
Stops the Recoverable Virtual Shared Disk subsystem.

trace [off]
Requests or stops tracing of the Recoverable Virtual Shared Disk subsystem. The Recoverable Virtual Shared Disk subsystem must be in the active state when this command is issued.

This operand is only meaningful after the debug operand has been used to send standard output and standard error to the console and the Recoverable Virtual Shared Disk subsystem has been restarted.

Description

Use this command to display information about the Recoverable Virtual Shared Disk subsystem, to change the number of nodes needed for quorum, and to change the status of the subsystem.

You can start the Recoverable Virtual Shared Disk subsystem with the IBM Virtual Shared Disk Perspective. Type spvsd and select actions for virtual shared disk nodes.

Exit Values

0
Indicates the successful completion of the command.

nonzero
Indicates that an error occurred.

Security

You must have write access to the SDR to run this command. You must have root privilege to issue the debug, quorum, refresh, reset, start, stop, trace, mksrc, and rmsrc subcommands.

Implementation Specifics

This command is part of the Recoverable Virtual Shared Disk option of PSSP.

Prerequisite Information

See PSSP: Managing Shared Disks.

Location

/usr/lpp/csd/bin/ha.vsd

Related Information

Commands: ha_vsd, hc.vsd

Examples

  1. To stop the Recoverable Virtual Shared Disk subsystem and restart it, enter:
    ha.vsd reset
    

    The system returns the messages:

    Waiting for the rvsd subsystem to exit.
    rvsd subsystem exited successfully.
    Starting rvsd subsystem.
    rvsd subsystem started PID=xxx.
    
  2. To change the quorum to five nodes of a 16-node SP system, enter:
    ha.vsd quorum 5
    

    The system returns the message:

    Quorum has been changed from 8 to 5.
    
  3. To query the rvsd subsystem, enter:
    ha.vsd query
    

    The system displays a message similar to the following:

    Subsystem         Group            PID     Status
     rvsd             rvsd             18320   active
     rvsd(vsd): quorum= 7, active=1, state=idle, isolation=member,
                NoNodes=10, lastProtocol=nodes_failing,
                adapter_recovery=on, adapter_status=up,
                RefreshProtocol has never been issued from this node,
                Running function level 3.1.0.0.
    

    where:

    quorum
    Is the number of nodes that must join the group before it will be activated.

    active
    Indicates the activation status of the group that is being joined:

    0:
    the group is not active (quorum has not been met).

    1:
    the group is active and the shared disks have been activated.

    state
    Indicates the current protocol that is running.

    isolation
    Indicates the group membership status:

    isolated:
    a group "join" has not been proposed.

    proposed:
    a group "join" has been proposed.

    member:
    we are a member (provider) of the group.

    NoNodes
    Indicates the number of nodes that have joined the group.

    lastProtocol
    Indicates the last protocol that was run across the group.

    adapter_recovery
    Indicates communication adapter recovery support:

    on:
    adapter recovery is enabled.

    off:
    adapter recovery is disabled.

    adapter_status
    Indicates communication adapter status:

    up:
    the adapter is up.

    down:
    the adapter is down.

    unknown:
    the adapter status is unknown.

    RefreshProtocol ...
    Indicates whether a refresh protocol has been issued from this node. If so, the date and time of success or error will be displayed.

    Running function level
    Indicates the function level that the subsystem is running, in version, release, modification, fix level format (vrmf). (Coexistence with lower levels of the subsystem may restrict it to running at a reduced function level.)

hacws_verify

Purpose

hacws_verify - Verifies the configuration of both the primary and backup High Availability Control Workstation (HACWS) control workstations.

Syntax

hacws_verify

Flags

None.

Operands

None.

Description

Use this command to verify that the primary and backup control workstations are properly configured to provide HACWS services to the SP system. The hacws_verify command inspects both the primary and backup control workstations.

Both the primary and backup control workstations must be running and capable of executing remote commands via the AIX rsh command.

The system administrator should run the hacws_verify command after HACWS is initially configured. After that, the hacws_verify command can be run at any time.

Exit Values

0
Indicates that no problems were found with the HACWS configuration.

nonzero
Indicates that problems were found with the HACWS configuration.
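
A minimal Korn shell sketch (an illustration only) that runs the verification and acts on the documented exit values:

#!/bin/ksh
# Verify the HACWS configuration of the primary and backup control workstations.
/usr/sbin/hacws/hacws_verify
if [ $? -ne 0 ]; then
    echo "hacws_verify reported problems with the HACWS configuration" >&2
    exit 1
fi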

Security

Restricted Root Access

As of PSSP 3.2, you have the option of running your SP system with an enhanced level of security. With the restricted root access (RRA) option enabled, PSSP does not internally issue rsh and rcp commands as a root user from a node. Also, PSSP does not automatically grant authorization for a root user to issue rsh and rcp commands from a node. If you enable this option, some procedures might not work as documented. For example, to run HACMP, an administrator must grant the authorizations for a root user to issue rsh and rcp commands that PSSP otherwise grants automatically. See the "Planning for security" chapter in IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment for a description of this function and a complete list of limitations.

Prerequisite Information

Refer to PSSP: Administration Guide for additional information on the HACWS option.

Location

/usr/sbin/hacws/hacws_verify

Related Information

PSSP commands: install_hacws, spcw_addevents

AIX commands: rsh

Examples

To verify the HACWS configuration, enter:

/usr/sbin/hacws/hacws_verify
 

haemcfg

Purpose

haemcfg - Compiles the Event Management objects in the System Data Repository (SDR) and places the compiled information into a binary Event Management Configuration Database (EMCDB) file.

Syntax

haemcfg [-c] [-n]

Flags

-c
Indicates that you want to check the data in the System Data Repository (SDR) without building the Event Management Configuration Database (EMCDB).

-n
Indicates that you want to build a test copy of the EMCDB in the current directory.

Operands

None.

Description

The haemcfg utility command builds the Event Management Configuration Database (EMCDB) file for a system partition. If no flags are specified, the haemcfg command compiles the Event Management data in the SDR, places the resulting EMCDB file in the /spdata/sys1/ha/cfg staging directory, and updates the EMCDB version string in the SDR.

To place the new EMCDB into production, you must shut down and restart all of this system partition's Event Manager daemons: the daemon on the control workstation and the daemon on each of the system partition's nodes. When the Event Management daemon restarts, it copies the EMCDB from the staging directory to the production directory. The name of the production EMCDB is /etc/ha/cfg/em.syspar_name.cdb.

If you want to test a new EMCDB, IBM recommends that you create a separate system partition for that purpose.

You must create a distinct EMCDB file for each system partition on the IBM RS/6000 SP. To build an EMCDB file, you must be executing on the control workstation and you must set the SP_NAME environment variable to the appropriate system partition name before you issue the command.

Before you build or replace an EMCDB, it is advisable to issue the haemcfg command with the debugging flags.

The -c flag lets you check the validity of the Event Management data that resides in the SDR. This data was previously loaded through the haemloadcfg command. If any of the data is not valid, the command writes an error message that describes the error.

When the -c flag is processed, the command validates the data in the SDR, but does not create a new EMCDB file and does not update the EMCDB version string in the SDR.

The -n flag lets you build a test EMCDB file in the current directory. If anything goes wrong with the creation of the new file, the command writes an error message that describes the error.

When the -n flag is processed, the command uses the data in the SDR to create a test EMCDB file in the current directory, but it does not update the EMCDB version string in the SDR. If any of the data in the SDR is not valid, the command stops at the first error encountered.

If you specify both flags on the command line, the haemcfg command performs the actions of the -c flag.

After you have checked the data and the build process, issue the haemcfg command without any flags. This builds the new EMCDB file, places it in the /spdata/sys1/ha/cfg directory, and updates the EMCDB version string in the SDR.
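
A sketch of that sequence follows; the partition name shown is only a placeholder, and the commands must be run on the control workstation with root authority:

export SP_NAME=partition_name   # substitute the appropriate system partition name

haemcfg -c    # check the Event Management data in the SDR without building the EMCDB
haemcfg -n    # build a test EMCDB file in the current directory
haemcfg       # build the EMCDB, place it in /spdata/sys1/ha/cfg, and update
              # the EMCDB version string in the SDR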

Files

/spdata/sys1/ha/cfg/em.syspar_name.cdb
Contains the most recently compiled EMCDB file for the system partition specified by syspar_name. This file will be placed into production when all of the Event Management daemons in the system partition are next restarted.

/etc/ha/cfg/em.syspar_name.cdb
Contains the production EMCDB file for the system partition specified by syspar_name. This EMCDB file is currently in use by the Event Management subsystem.
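
For example, a hedged sketch (with syspar_name replaced by your system partition name) that compares the staged EMCDB with the production copy to see whether the most recently compiled database has been placed into production:

cmp /spdata/sys1/ha/cfg/em.syspar_name.cdb /etc/ha/cfg/em.syspar_name.cdb \
    && echo "production EMCDB matches the staged copy" \
    || echo "staged EMCDB has not been placed into production (or could not be compared)"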

Standard Output

When the command executes successfully, it writes the following informational messages:

Reading Event Management data for partition syspar_name
 
CDB=new_EMCDB_file_name Version=EMCDB_version_string

Standard Error

This command writes error messages (as necessary) to standard error.

Errors can result from a number of causes. For a listing of the errors that the haemcfg command can produce, see PSSP: Message Reference.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error occurred. It is accompanied by one or more error messages that indicate the cause of the error.

Security

You must have write access to the SDR to run this command.

Restrictions

To place an EMCDB file for a system partition into the /spdata/sys1/ha/cfg directory, you must be running with an effective user ID of root on the control workstation. Before running this command, you must set the SP_NAME environment variable to the appropriate system partition name.

If you run the haemcfg command without any flags, the command stops at the first error it encounters. With the -c flag on, the command continues, letting you obtain as much debugging information as possible in one pass. To reduce your debugging time, therefore, run the command with the debugging flags first.

Implementation Specifics

This command is part of RS/6000 Cluster Technology (RSCT), which is included with the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).

Prerequisite Information

For a general overview of configuring Event Management, see "The Event Management subsystem" chapter of PSSP: Administration Guide.

For a description of the SDR classes and attributes that are related to the EMCDB, see IBM RS/6000 Cluster Technology: Event Management Programming Guide and Reference.

Location

/usr/sbin/rsct/bin/haemcfg

Related Information

Commands: haemloadcfg

Examples

  1. To validate the Event Management data in the System Data Repository (without creating a new EMCDB file), enter:
    haemcfg -c
    

    If there are any errors in the data, the command writes appropriate error messages.

    To fix the errors, replace the data in the SDR. For more information, see the haemloadcfg command.

  2. To create a test EMCDB file in the current directory, enter:
    haemcfg -n
    

    If there are any problems in creating the file, the command writes appropriate error messages.

  3. To compile a new EMCDB file for a system partition from the Event Management data that resides in the SDR and place it into the staging directory:
    1. Make sure you are executing with root authority on the control workstation.
    2. Make sure that the SP_NAME environment variable is set to the name of the appropriate system partition.
    3. Enter:
      haemcfg
      

      In response, the command creates a new EMCDB file, places it in the staging directory as /spdata/sys1/ha/cfg/em.syspar_name.cdb, where syspar_name is the name of the current system partition, and updates the EMCDB version string in the SDR.

