IBM Books

Command and Technical Reference, Volume 1

Emonitor daemon

Purpose

Emonitor - Monitors nodes listed in the /etc/SP/Emonitor.cfg file in an attempt to maximize their availability on the switch.

Syntax

Emonitor

Flags

None.

Operands

None.

Description

This command is not valid on a system with an SP Switch2 switch.

Note:
In PSSP 3.1 or later releases, the Emonitor subsystem is no longer needed, since the new Switch Administration daemon and automatic unfence options provide the same functions as the Emonitor subsystem. However, if you turn off the Switch Administration daemon functions you may still want to use the Emonitor subsystem. And if you are using a primary node with a code_version of PSSP 2.4 or earlier in a coexistence environment, the new functions are not supported. You may want to use the Emonitor subsystem in such an environment.

Emonitor is a daemon controlled by the System Resource Controller (SRC). It can be used to monitor the switch status of nodes in a system partition. A system-wide configuration file (/etc/SP/Emonitor.cfg) lists all nodes on the system to be monitored. The objective is to bring these nodes back up on the switch network when necessary.

Emonitor is invoked with Estart -m. Once invoked, it is controlled by SRC, so it will restart if it is halted abnormally. If you decide to end monitoring, you must run /usr/lpp/ssp/bin/emonctrl -k to stop the daemon in your system partition.

There is an Emonitor daemon for each system partition. The daemon watches for any node coming up (for example, host_responds goes from 0 to 1). When the daemon detects a node coming up, it reviews the nodes in the configuration file to check whether any node is off the switch network. If any nodes in the specified system partition are off the switch network, it determines a way to bring them back onto the switch (for example, via Eunfence or Estart) and takes the appropriate action. To prevent the Estart command from being run several times (which can occur if multiple nodes are coming up in sequence), Emonitor waits 3 minutes after a node comes up to be sure no other nodes are in the process of coming up. Each time a new node comes up prior to the 3-minute timeout, Emonitor resets the timer, up to a maximum wait of 12 minutes.
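The wait logic described above can be illustrated with a small model. This is only a sketch, not the daemon's actual code: the function name review_time is hypothetical, and it assumes that the 12-minute maximum is measured from the first node-up event of a cycle, which the text does not state explicitly.

```python
QUIET_SECONDS = 3 * 60      # wait after the most recent node-up event
MAX_WAIT_SECONDS = 12 * 60  # assumed upper bound, measured from the first event


def review_time(node_up_times):
    """Return the time (in seconds) at which the node review would run.

    node_up_times: sorted timestamps, in seconds, of node-up events.
    """
    if not node_up_times:
        raise ValueError("at least one node-up event is required")
    first = node_up_times[0]
    deadline = first + QUIET_SECONDS
    cap = first + MAX_WAIT_SECONDS
    for t in node_up_times[1:]:
        if t >= deadline:
            break  # the timer already expired; later events belong to a new cycle
        # A node came up before the timeout: restart the 3-minute wait,
        # but never wait past the 12-minute cap.
        deadline = min(t + QUIET_SECONDS, cap)
    return deadline
```

With a single event at time 0 the review runs at 180 seconds. A steady stream of nodes coming up every 170 seconds keeps pushing the review back until the cap of 720 seconds is reached.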

Emonitor cannot always bring nodes back on the switch. For example, if any of the following occur:

Problems can occur if the node that is brought off the switch is experiencing a recurring error that causes it to come up and then encounter an error repeatedly. The monitor continually attempts to bring this node into the switch network and could jeopardize the stability of the remaining switch network.

Note:
Nodes that will be undergoing hardware or software maintenance should be removed from the Emonitor.cfg file during this maintenance to prevent Emonitor from attempting to bring them onto the switch network.

Files

/etc/SP/Emonitor.cfg
Specifies a list of node numbers, one per line, that the user wants monitored by Emonitor. This list is system-wide.
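As an illustration of the file format, the following sketch parses a list of node numbers, one per line, and produces a copy of the list with selected nodes left out (for example, nodes about to undergo maintenance). The function names are hypothetical, and treating blank lines as ignorable is an assumption; the text above only states that the file holds one node number per line.

```python
def parse_node_list(text):
    """Parse Emonitor.cfg-style content: one node number per line."""
    nodes = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue  # assumption: blank lines are ignored
        if not (entry.isascii() and entry.isdigit()):
            raise ValueError(f"line {line_no}: not a node number: {entry!r}")
        nodes.append(int(entry))
    return nodes


def without_nodes(text, excluded):
    """Return new file content with the excluded node numbers removed."""
    excluded = set(excluded)
    kept = [n for n in parse_node_list(text) if n not in excluded]
    return "".join(f"{n}\n" for n in kept)
```

The sketch works on strings only, so reading and rewriting the real /etc/SP/Emonitor.cfg file is left to the caller.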

Security

You must have root privilege to run this command.

Location

/usr/lpp/ssp/bin/Emonitor

Related Information

Commands: Eannotator, Eclock, Efence, emonctrl, Eprimary, Equiesce, Estart, Etopology, Eunfence, Eunpartition

enadmin

Purpose

enadmin - Changes the desired state of a specified extension node.

Syntax

enadmin [-a {reset | reconfigure}] [-h] node_number

Flags

-a
Specifies the desired state to which the extension node is to be set.

reconfigure
Once the administrative state of the extension node is placed in this mode, the Simple Network Management Protocol (SNMP) agent managing the extension node will periodically send trap messages to the spmgrd daemon running on the control workstation requesting configuration data for the extension node. Once the configuration data is received by the agent, it stops sending these requests and uses the configuration data to reconfigure the extension node.

reset
Once the administrative state of the extension node is placed in this mode, the SNMP agent managing the extension node will set the extension node to an initial state in which it is no longer an active node on the switch network.

-h
Displays usage information.

Operands

node_number
Specifies the node number assigned to the extension node whose state is to be changed.

Description

Use this command to change the administrative state of an extension node. Setting the administrative state of an extension node to reconfigure causes configuration data for the extension node to be resent to the extension node's administrative environment. Setting the administrative state of an extension node to reset places the extension node in an initial state in which it is no longer active on the switch network.

This command is invoked internally when choosing the reconfigure option of the endefadapter and endefnode commands or the reset (-r) option of the enrmnode command.

You can use the System Management Interface Tool (SMIT) to run this command by selecting the Extension Node Management panel. To use SMIT, enter:

smit manage_extnode

Standard Output

All informational messages are written to standard output. These messages identify the extension node being changed and indicate when the specified state change has been accepted for processing by the extension node agent (at which point the command is complete). All error messages are also written to standard output.

Exit Values

0
Indicates the administrative state of the extension node was successfully changed.

1
Indicates that an error occurred while processing the command and the administrative state of the extension node was not changed.

Security

You must have root privilege to run this command or be a member of the AIX system group.

Restrictions

This command can only be issued on the control workstation.

Implementation Specifics

This command is part of the IBM Parallel System Support Programs (PSSP) Licensed Program (LP) ssp.spmgr file set.

The spmgrd SNMP manager daemon on the SP control workstation allows transfer of extension node configuration data from the SP system to an SNMP agent providing administrative support for the extension node. Version 1 of the SNMP protocol is used for communication between the SNMP manager and the SNMP agent. Limited control of an extension node is also possible. An SNMP set-request message containing an object instantiation representing the requested administrative state for the extension node is sent from the SNMP manager to the SNMP agent providing administrative support for the extension node. After the administrative state of an extension node is received by the SNMP agent, the enadmin command is completed. Requests for configuration information and information about the state of an extension node are sent to the SNMP manager asynchronously in SNMP trap messages.

Prerequisite Information

IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment

Location

/usr/lpp/ssp/bin/enadmin

Related Information

Commands: endefadapter, endefnode, enrmadapter, enrmnode, spmgrd

Examples

  1. To request that configuration data for the extension node assigned to node number 9 be sent to its SNMP managing agent, enter:
    enadmin -a reconfigure 9
    
  2. To request that the extension node assigned to node number 9 be placed in an initial state and no longer be active on the switch, enter:
    enadmin -a reset 9
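A script that drives enadmin might build the argument list and interpret the exit value along the following lines. This is a minimal sketch: the helper names are hypothetical, and requiring the node number to be a positive integer is an assumption. Only the flags, the command path, and the exit values 0 and 1 come from this reference.

```python
ENADMIN = "/usr/lpp/ssp/bin/enadmin"
VALID_ACTIONS = ("reset", "reconfigure")


def enadmin_argv(action, node_number):
    """Build the argument list for 'enadmin -a <action> <node_number>'."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}")
    # Assumption: node numbers are positive integers.
    if isinstance(node_number, bool) or not isinstance(node_number, int) or node_number < 1:
        raise ValueError("node_number must be a positive integer")
    return [ENADMIN, "-a", action, str(node_number)]


def describe_exit(code):
    """Map a documented enadmin exit value to a short description."""
    if code == 0:
        return "administrative state changed"
    if code == 1:
        return "error; administrative state not changed"
    return "undocumented exit value"
```

The returned list can be passed to subprocess.run on the control workstation; the sketch itself does not execute anything.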
    

endefadapter

Purpose

endefadapter - Adds new or changes existing configuration data for an extension node adapter in the System Data Repository (SDR) and optionally performs the reconfiguration request.

Syntax

endefadapter [-a address] [-h] [-m netmask] [-r] node_number

Flags

-a address
Specifies the IP network address of the extension node adapter. The IP network address must be able to be resolved by the host command. This flag is required when adding a new extension node adapter.

-h
Displays usage information.

-m netmask
Specifies the netmask for the network on which the extension node adapter resides. This flag is required when adding a new extension node adapter.

-r
Specifies that the extension node adapter will be reconfigured.

Operands

node_number
Specifies the node number for this extension node adapter. This operand is required.

Description

Use this command to define extension node adapter information in the SDR. When adding a new extension node adapter, the -a and -m flags and the node_number operand are required.

You can use the System Management Interface Tool (SMIT) to run this command. To use SMIT, enter:

smit enter_extadapter

Environment Variables

The SP_NAME environment variable is used (if set) to direct this command to a system partition. If the SP_NAME environment variable is not set, the default system partition will be used.

Standard Output

This command writes informational messages to standard output.

Standard Error

This command writes all error messages to standard error.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error occurred and the extension node adapter information was not updated.

Security

You must have root privilege to run this command or be a member of the AIX system group. You must have write access to the SDR to run this command.

Restrictions

This command can only be issued on the control workstation.

Implementation Specifics

This command is part of the IBM Parallel System Support Programs (PSSP) Licensed Program (LP) ssp.basic file set.

Prerequisite Information

IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment

Location

/usr/lpp/ssp/bin/endefadapter

Related Information

Commands: enadmin, endefnode, enrmadapter, enrmnode

Examples

  1. To define an extension node adapter for node number 10 with a network address of 129.40.158.137 and a netmask of 255.255.255.0, enter:
    endefadapter -a 129.40.158.137 -m 255.255.255.0 10
    
  2. The following example shows the same definition, but the extension node adapter will be reconfigured after the SDR is updated:
    endefadapter -a 129.40.158.137 -m 255.255.255.0 -r 10
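Before calling endefadapter from a script, it can help to sanity-check the address and netmask. The sketch below is one way to do that with the Python standard library; the helper names are hypothetical, and it assumes a dotted-decimal IPv4 address and a contiguous netmask, which is stricter than anything this reference states.

```python
import ipaddress

ENDEFADAPTER = "/usr/lpp/ssp/bin/endefadapter"


def is_contiguous_netmask(netmask):
    """True if netmask is dotted-decimal with all 1 bits leading (not 0.0.0.0)."""
    value = int(ipaddress.IPv4Address(netmask))
    inverted = value ^ 0xFFFFFFFF
    # A contiguous mask inverts to 2**k - 1, so inverted & (inverted + 1) == 0.
    return value != 0 and (inverted & (inverted + 1)) == 0


def endefadapter_argv(node_number, address=None, netmask=None, reconfigure=False):
    """Build an endefadapter argument list after basic validation."""
    argv = [ENDEFADAPTER]
    if address is not None:
        ipaddress.IPv4Address(address)  # raises ValueError if malformed
        argv += ["-a", address]
    if netmask is not None:
        if not is_contiguous_netmask(netmask):
            raise ValueError(f"not a contiguous netmask: {netmask}")
        argv += ["-m", netmask]
    if reconfigure:
        argv.append("-r")
    argv.append(str(node_number))
    return argv
```

The check does not confirm that the address can be resolved by the host command; that still has to be verified on the control workstation.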
    

endefnode

Purpose

endefnode - Adds new or changes existing configuration data for an extension node in the System Data Repository (SDR) and optionally performs the reconfiguration request.

Syntax

endefnode [-a hostname] [-c string] [-h] [-i string] [-r] [-s hostname] node_number

Flags

-a hostname
Specifies the administrative host name, which can be resolved to an IP address, associated with the extension node's network interface on the administrative network. This flag is required when adding a new extension node.

-c string
Specifies the Simple Network Management Protocol (SNMP) community name that the SP SNMP manager and the node's SNMP agent will send in the corresponding field of the SNMP messages. This field consists of 1 to 255 ASCII characters. If the -c flag is not specified, the spmgrd daemon will use a default SNMP community name. For more information about the default community name, refer to the related extension node publication in the "Related Information" section that follows.

-h
Displays usage information.

-i string
Specifies the extension node identifier assigned to the node in its system's administrative environment. This is a text string that uniquely identifies the node to its system. This field consists of 1 to 255 ASCII characters. This flag is required when adding a new extension node.

-r
Specifies that the extension node will be reconfigured.

-s hostname
Specifies the host name that can be resolved to an IP address of the extension node's SNMP agent. This flag is required when adding a new extension node.

Operands

node_number
Specifies the node number for this extension node. The node_number specified in this command must be for an unused standard node position that corresponds to the relative node position assigned to the extension node. Otherwise, there would be a conflict in the switch configuration information. This operand is required.

Description

Use this command to define extension node information in the SDR. When adding a new extension node, the -a, -i, and -s flags and the node_number operand are required. When changing an existing extension node definition, only the node number is required along with the flag corresponding to the field being changed.
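The flag rules above can be expressed as a small pre-flight check. This is a sketch only; the function name and the dictionary-based interface are hypothetical, and the check covers just the rules stated in this section (required flags when adding, and the 1 to 255 ASCII character limit for -c and -i).

```python
REQUIRED_FOR_NEW = ("a", "i", "s")


def check_endefnode_flags(flags, adding):
    """Return a list of problems with an endefnode invocation.

    flags: dict mapping a flag letter ('a', 'c', 'i', 's') to its value.
    adding: True when defining a new extension node, False when changing one.
    """
    problems = []
    if adding:
        for letter in REQUIRED_FOR_NEW:
            if letter not in flags:
                problems.append(f"-{letter} is required when adding a new extension node")
    for letter in ("c", "i"):
        value = flags.get(letter)
        if value is not None and not (1 <= len(value) <= 255 and value.isascii()):
            problems.append(f"-{letter} must be 1 to 255 ASCII characters")
    return problems
```

An empty list means the invocation satisfies these rules; it does not mean the command itself will succeed.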

You can use the System Management Interface Tool (SMIT) to run this command. To use SMIT, enter:

smit enter_extnode

Environment Variables

The SP_NAME environment variable is used (if set) to direct this command to a system partition. If the SP_NAME environment variable is not set, the default system partition will be used.

Standard Output

This command writes informational messages to standard output.

Standard Error

This command writes all error messages to standard error.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error occurred and the extension node information was not updated.

Security

To run this command you must have root privilege or be a member of the AIX system group. You must also have SDR write access and hardmon access to run this command.

Restrictions

This command can only be issued on the control workstation.

Implementation Specifics

This command is part of the IBM Parallel System Support Programs (PSSP) Licensed Program (LP) ssp.basic file set.

Prerequisite Information

IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment

Location

/usr/lpp/ssp/bin/endefnode

Related Information

Commands: enadmin, endefadapter, enrmnode, enrmadapter

Refer to the SP Switch Router Adapter Guide for information about attaching an IP router extension node to the SP Switch.

Examples

  1. The following example shows a definition of an extension node with a node number of 2 that references slot number 13 in a router:
    endefnode -i 13 -a router1 -s router1 -c spenmgmt 2
    
  2. The following example shows a definition of an extension node with a node number of 7 that references slot number 02 in a router. This extension node will also be reconfigured after the SDR is updated.
    endefnode -i 02 -a grf.pok.ibm.com -s grf.pok.ibm.com -c spenmgmt -r 7
    

enrmadapter

Purpose

enrmadapter - Removes configuration data for an extension node adapter from the System Data Repository (SDR).

Syntax

enrmadapter [-h] node_number

Flags

-h
Displays usage information.

Operands

node_number
Specifies the node number for this extension node adapter.

Description

Use this command to remove extension node adapter information from the SDR. The node_number operand is required.

You can use the System Management Interface Tool (SMIT) to run this command. To use SMIT, enter:

smit delete_extadapter

Environment Variables

The environment variable SP_NAME is used (if set) to direct this command to a system partition. If the SP_NAME environment variable is not set, the default system partition will be used.

Standard Output

This command writes informational messages to standard output.

Standard Error

This command writes all error messages to standard error.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error occurred and the extension node adapter information was not updated.

Security

You must have root privilege to run this command or be a member of the AIX system group. You must have write access to the SDR to run this command.

Restrictions

This command can only be issued on the control workstation.

Implementation Specifics

This command is part of the IBM Parallel System Support Programs (PSSP) Licensed Program (LP) ssp.basic file set.

Prerequisite Information

IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment

Location

/usr/lpp/ssp/bin/enrmadapter

Related Information

Commands: enadmin, endefadapter, endefnode, enrmnode

Examples

To remove an extension node adapter with a node number of 12 from the SDR, enter:

enrmadapter 12

enrmnode

Purpose

enrmnode - Removes configuration data for an extension node from the System Data Repository (SDR).

Syntax

enrmnode [-h] [-r] node_number

Flags

-h
Displays usage information.

-r
Causes the extension node to be reset.

Operands

node_number
Specifies the node number for this extension node.

Description

Use this command to remove extension node information from the SDR. When removing information, the node_number operand is required.

You can use the System Management Interface Tool (SMIT) to run this command. To use SMIT, enter:

smit delete_extnode

Environment Variables

The environment variable SP_NAME is used (if set) to direct this command to a system partition. If the SP_NAME environment variable is not set, the default system partition will be used.

Standard Output

This command writes informational messages to standard output.

Standard Error

This command writes all error messages to standard error.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error occurred and the extension node information was not updated.

Security

You must have root privilege to run this command or be a member of the AIX system group. You must have write access to the SDR to run this command.

Restrictions

This command can only be issued on the control workstation.

Implementation Specifics

This command is part of the IBM Parallel System Support Programs (PSSP) Licensed Program (LP) ssp.basic file set.

Prerequisite Information

IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment

Location

/usr/lpp/ssp/bin/enrmnode

Related Information

Commands: enadmin, endefadapter, endefnode, enrmadapter

Examples

To remove an extension node with a node number of 2 from the SDR and reset that extension node, enter:

enrmnode -r 2

Eprimary

Purpose

Eprimary - Assigns or queries the switch primary node and switch primary backup node for a system partition.

Syntax

Eprimary [-h] [-p {0|1|all}] [-init] [node_identifier] [-backup bnode_identifier]

Flags

-h
Displays usage information.

-init
Initializes or reinitializes the current system partition object. If -init is specified without a node_identifier or without a bnode_identifier, the respective default is used for the primary and primary backup nodes. The lowest numbered node in the system partition is the default primary node, and the furthest node from the primary is the default primary backup node.

When a new system partition object is created, the SDR autounfence attribute is set to "1" (enabled). This attribute determines whether automatic unfence will be enabled or disabled by the Fault Service daemon during switch initialization. Use the Estart command to change the value of this attribute.

-backup bnode_identifier
Specifies the node designated as the oncoming switch primary backup node. It can be a host name, an IP address, a frame,slot pair, or a node number. If a bnode_identifier is not specified, the oncoming primary backup node is automatically selected.

Notes:

  1. A dependent node cannot be selected as a primary or primary backup node.

  2. A node that does not have a switch adapter cannot be selected as a primary or primary backup node.

-p {0|1|all}
Specifies for which switch plane the operation is to be performed. If not specified, the default is to perform the operation for all valid switch planes. This flag is valid only on systems with SP Switch2 switches.

Operands

node_identifier
Specifies the node designated as the oncoming switch primary node. It can be a host name, an IP address, a frame,slot pair, or a node number. If a node_identifier is not specified, the oncoming primary node is automatically selected.

Notes:

  1. A dependent node cannot be selected as a primary or primary backup node.

  2. A node that does not have a switch adapter cannot be selected as a primary or primary backup node.
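The four accepted forms of a node identifier can be told apart by shape. The following sketch shows one way a wrapper script might classify an identifier before passing it to Eprimary. It is an illustration with a hypothetical function name, not a description of how Eprimary itself parses its operands, and it treats anything that is not a number, a frame,slot pair, or a dotted-decimal address as a host name.

```python
import ipaddress
import re


def classify_node_identifier(identifier):
    """Classify an identifier as one of the four forms Eprimary accepts."""
    if re.fullmatch(r"[0-9]+", identifier):
        return "node number"
    if re.fullmatch(r"[0-9]+,[0-9]+", identifier):
        return "frame,slot pair"
    try:
        ipaddress.IPv4Address(identifier)
        return "IP address"
    except ValueError:
        # Assumption: everything else is treated as a host name.
        return "host name"
```

Applied to the operands used in the examples for this command, "1,6" is classified as a frame,slot pair, "129.33.34.1" as an IP address, and "r11n01" as a host name.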

Note:
If no flags or operands are specified, each of the following is displayed:

Description

Use this command to assign, change, or query the switch primary node or the switch primary backup node. The primary node should not be changed unless the current primary node is becoming unavailable (for example, if the current primary node is to be serviced). The Estart command must be issued before a change of the primary node or the primary backup node (using Eprimary) takes effect. Also, the Estart command must be used if the value of the autounfence attribute needs to be changed.

In an SP Switch network, the primary node takeover facility automatically handles situations (such as a node loss) for each of the primary and primary backup nodes. The primary node replaces a problem primary backup node and the primary backup node automatically takes over for the primary node if the primary node becomes unavailable. Note that the node chosen cannot be a dependent node. The primary backup node should be selected using the following guidelines:

The Eprimary command selects a default oncoming primary or oncoming backup primary node if one is not specified. Users receive a warning in the following situations on the oncoming primary or oncoming backup primary nodes:

Environment Variables

PSSP 3.4 provides the ability to run commands using secure remote command and secure remote copy methods.

To determine whether you are using either AIX rsh or rcp or the secure remote command and copy method, the following environment variables are used. If no environment variables are set, the defaults are /bin/rsh and /bin/rcp.

You must be careful to keep these environment variables consistent. If setting the variables, all three should be set. The DSH_REMOTE_CMD and REMOTE_COPY_CMD executables should be kept consistent with the choice of the remote command method in RCMD_PGM:

For example, if you want to run Eprimary using a secure remote method, enter:

export RCMD_PGM=secrshell
export DSH_REMOTE_CMD=/bin/ssh
export REMOTE_COPY_CMD=/bin/scp
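The rule that either none or all three of these variables should be set can be checked before running a switch command. The sketch below is a hypothetical helper that works on any mapping, such as os.environ; it only checks which variables are set and makes no assumption about their permitted values.

```python
REMOTE_VARS = ("RCMD_PGM", "DSH_REMOTE_CMD", "REMOTE_COPY_CMD")


def missing_remote_vars(env):
    """Return the variables that still need to be set for a consistent setup.

    An empty list means either none of the variables is set (the defaults
    /bin/rsh and /bin/rcp apply) or all three are set.
    """
    set_names = [name for name in REMOTE_VARS if env.get(name)]
    if not set_names or len(set_names) == len(REMOTE_VARS):
        return []
    return [name for name in REMOTE_VARS if name not in set_names]
```

For example, calling missing_remote_vars(os.environ) after exporting only RCMD_PGM would report DSH_REMOTE_CMD and REMOTE_COPY_CMD as missing.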

Security

You must have root privilege to run this command.

When restricted root access (RRA) is enabled, this command can only be run from the control workstation.

Location

/usr/lpp/ssp/bin/Eprimary

Related Information

Commands: Eannotator, Eclock, Efence, Equiesce, Estart, Etopology, Eunfence, Eunpartition

Examples

  1. To query the switch primary and primary backup nodes, enter:
    Eprimary
    
  2. To designate an oncoming switch primary node by IP address and let Eprimary select an oncoming switch primary backup node, enter:
    Eprimary 129.33.34.1
    
  3. To designate an oncoming switch primary node and an oncoming switch primary backup node by IP address, enter:
    Eprimary 129.33.34.1 -backup 129.33.34.56
    
  4. To designate an oncoming switch primary node and an oncoming switch primary backup node by host name, enter:
    Eprimary r11n01 -backup r17n02
    
  5. To create a system partition object and assign a switch primary node and a switch primary backup node by frame,slot pair, enter:
    Eprimary -init 1,2 -backup 1,6
    
  6. On an SP Switch2 system, to designate an oncoming switch primary node and an oncoming switch primary backup node by node number for just the second plane in a system, enter:
    Eprimary 1 -backup 5 -p 1

Equiesce

Purpose

Equiesce - Quiesces the switch by causing the primary and primary backup nodes to shut down switch recovery and primary node takeover.

Syntax

Equiesce [-h] [-p {0|1|all}]

Flags

-h
Displays usage information.

-p {0|1|all}
Specifies for which switch plane the operation is to be performed. If not specified, the default is to perform the operation for all valid switch planes. This flag is valid only on systems with SP Switch2 switches.

Operands

None.

Description

Use this command to disable switch error recovery and primary node takeover. It is used to shut down normal switch error actions when global activities affecting nodes are performed. For example, when all nodes are shut down or rebooted, they are fenced from the switch by the primary node.

If the primary node is not the first node to shut down during a global shutdown or reboot of the entire system, it may fence all the other nodes including the primary backup node. Primary node takeover can also occur if the primary node is shut down and the backup node remains up. Issuing the Equiesce command before the shutdown prevents these situations from occurring.

The Equiesce command causes the primary and primary backup nodes to shut down their recovery actions. Data still flows over the switch, but no problems are serviced and primary node takeover is disabled. Only the Eannotator, Eclock, Eprimary, Estart, and Etopology commands are functional after the Equiesce command is issued.

Estart must be issued when the global activity is complete to reestablish switch recovery and primary node takeover.

Note:
The Switch Administration daemon will issue the Estart command under certain circumstances, thus reestablishing switch recovery and primary node takeover. To see if you are using the Switch Administration daemon, issue the following command:
lssrc -a | grep swtadm

If the response returned shows that the swtadmd or swtadmd2 subsystem is active, you may want to turn off the Switch Administration daemon before issuing the Equiesce command. To turn off the Switch Administration daemon, issue the following command for the SP Switch:

stopsrc -s swtadmd

For the SP Switch2, issue the following command:

stopsrc -s swtadmd2

After issuing the Estart command again, you may want to restart the Switch Administration daemon. To turn on the Switch Administration daemon, issue the following command for the SP Switch:

startsrc -s swtadmd

For the SP Switch2, issue the following command:

startsrc -s swtadmd2
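The order of operations described in this note can be summarized as a short plan. The sketch below is a hypothetical helper that only returns the sequence of commands as strings and does not run anything; the placeholder entry stands for whatever global shutdown or reboot is being performed.

```python
SUBSYSTEMS = {"SP Switch": "swtadmd", "SP Switch2": "swtadmd2"}


def quiesce_plan(switch_type, admin_daemon_active):
    """Return the ordered commands for quiescing the switch around a global activity.

    switch_type: "SP Switch" or "SP Switch2".
    admin_daemon_active: True if 'lssrc -a | grep swtadm' shows the subsystem as active.
    """
    subsystem = SUBSYSTEMS[switch_type]
    steps = []
    if admin_daemon_active:
        # Stop the daemon first so it cannot issue Estart during the activity.
        steps.append(f"stopsrc -s {subsystem}")
    steps.append("Equiesce")
    steps.append("<global activity>")  # placeholder, not a command
    steps.append("Estart")
    if admin_daemon_active:
        steps.append(f"startsrc -s {subsystem}")
    return steps
```

Stopping and restarting the Switch Administration daemon is optional in the text above ("you may want to"), so the sketch includes those steps only when the daemon is reported as active.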

Environment Variables

PSSP 3.4 provides the ability to run commands using secure remote command and secure remote copy methods.

To determine whether you are using either AIX rsh or rcp or the secure remote command and copy method, the following environment variables are used. If no environment variables are set, the defaults are /bin/rsh and /bin/rcp.

You must be careful to keep these environment variables consistent. If setting the variables, all three should be set. The DSH_REMOTE_CMD and REMOTE_COPY_CMD executables should be kept consistent with the choice of the remote command method in RCMD_PGM:

For example, if you want to run Equiesce using a secure remote method, enter:

export RCMD_PGM=secrshell
export DSH_REMOTE_CMD=/bin/ssh
export REMOTE_COPY_CMD=/bin/scp

Security

You must have root privilege to run this command.

When restricted root access (RRA) is enabled, this command can only be run from the control workstation.

Location

/usr/lpp/ssp/bin/Equiesce

Related Information

Commands: Eannotator, Eclock, Efence, Eprimary, Estart, Etopology, Eunfence, Eunpartition

Examples

  1. To quiesce the switch before shutting down the system, enter:
    Equiesce
    
  2. On an SP Switch2 system, to quiesce just the second plane of switches, enter:
    Equiesce -p 1

Estart

Purpose

Estart - Starts the switch.

Syntax

Estart [-h] [-m] [-autounfence 0 | 1] [-p {0|1|all}]

Flags

-h
Displays usage information.

-m
Specifies that the Emonitor daemon should be started. (See /etc/SP/Emonitor.cfg for details.) The -m flag is valid only on systems with an SP Switch.

-autounfence
Specifies whether automatic unfence will be enabled (1) or disabled (0). The specified value updates the SDR Switch_partition autounfence attribute on SP Switch systems and the SDR Switch_plane autounfence attribute on SP Switch2 systems. If this flag is not specified, the current value of the autounfence attribute will be used by the Fault Service daemon on the primary node during switch initialization. Use the Eprimary command to display the SDR autounfence attribute.
Note:
In PSSP 3.1 or later releases, the Emonitor subsystem is no longer needed, since the new Switch Administration daemon and automatic unfence options provide the same functions as the Emonitor subsystem. However, if you turn off the Switch Administration daemon functions you may still want to use the Emonitor subsystem. And if you are using a primary node with a code_version of PSSP 2.4 or earlier in a coexistence environment, the new functions are not supported. You may want to use the Emonitor subsystem in such an environment.

-p {0|1|all}
Specifies for which switch plane the operation is to be performed. If not specified, the default is to perform the operation for all valid switch planes. This flag is valid only on systems with SP Switch2 switches.

Operands

None.

Description

Use this command to start or restart the current system partition based on its switch topology file. (Refer to the Etopology command for topology file details.) If the -m flag is specified, it will also start the Emonitor daemon to monitor nodes on the switch. Refer to the Emonitor daemon for additional information. If the Estart command is issued when the switch is already running, it causes a switch error, and messages in flight are lost. Applications using reliable protocols on the switch, such as TCP/IP and the MPI User Space library, recover from switch errors. Applications using unreliable protocols on the switch do not recover from switch errors. For this reason, IBM suggests that you be aware of what applications or protocols you are running before you issue the Estart command.

If the current primary node is unsuitable for use to start the switch, the current primary backup node is examined. If it is suitable to use as the new current primary node, Eprimary is run to make it the new oncoming primary node before starting the switch.

If automatic unfence is enabled, the autojoin bit for all active nodes will be turned on and the scan will be enabled to check for nodes ready to join the switch. If automatic unfence is disabled, the autojoin bit for all active nodes will be turned off and the Fault Service daemon will not scan for nodes ready to join the switch. For nodes not on the switch during switch initialization, the autojoin bit will remain unchanged.

For a fenced node to automatically join the switch during switch initialization, its autojoin bit must be on. This means that even if automatic unfence is enabled and a node is fenced without autojoin, the node will not come back on the switch without issuing the Eunfence command. (See the Efence command.)
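The effect of the autounfence setting on the autojoin bits can be modeled in a few lines. This is a simplified sketch with hypothetical names, and it assumes that "active nodes" means nodes that are on the switch at initialization; it is not the Fault Service daemon's actual logic.

```python
def initialize_switch(autounfence, nodes):
    """Model autojoin handling during switch initialization.

    autounfence: True if automatic unfence is enabled.
    nodes: dict mapping a node name to {"on_switch": bool, "autojoin": bool}.
    Returns (autojoin_bits, scan_enabled).
    """
    autojoin_bits = {}
    for name, state in nodes.items():
        if state["on_switch"]:
            # Active nodes: the bit follows the autounfence setting.
            autojoin_bits[name] = autounfence
        else:
            # Nodes not on the switch keep their current bit.
            autojoin_bits[name] = state["autojoin"]
    # The scan for nodes ready to join runs only when autounfence is enabled.
    return autojoin_bits, autounfence
```

In this model a node that was fenced without autojoin keeps its bit off even when automatic unfence is enabled, which is why such a node needs an explicit Eunfence.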

SP Switch Notes:

If you have an SP Switch installed on your system, an oncoming primary node as selected via Eprimary is established as primary during Estart. If necessary, the topology file is distributed to partition nodes during Estart. The topology file to be used is distributed to each of the standard nodes in the system partition via the SP Ethernet administrative local area network (LAN) adapter:

Otherwise, the topology file is already resident on the nodes and does not need to be distributed.

SP Switch2 Notes:

On SP Switch2 systems, usually the topology file that will be used is from the SDR. However, in a one plane system, if there is an /etc/SP/expected.top file on the primary node, that topology file will be used for the Estart. In a two plane system, if there is an /etc/SP/expected.top.p0 file on the primary node, that file will be used to Estart plane 0. In addition, if there is an /etc/SP/expected.top.p1 file on the primary node, that file will be used to Estart plane 1.
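The selection rule for SP Switch2 systems can be written as a small function. The sketch below uses a hypothetical name and takes the set of files present on the primary node as an argument, so it does not touch the file system; the file names are the ones given in the note above.

```python
def topology_source(planes, plane, files_on_primary):
    """Return the topology file used to start a plane, or "SDR".

    planes: number of switch planes in the system (1 or 2).
    plane: the plane being started (0, or 0 or 1 in a two plane system).
    files_on_primary: set of file paths that exist on the primary node.
    """
    if planes not in (1, 2) or plane not in range(planes):
        raise ValueError("invalid plane configuration")
    if planes == 1:
        candidate = "/etc/SP/expected.top"
    else:
        candidate = f"/etc/SP/expected.top.p{plane}"
    # An override file on the primary node wins; otherwise the SDR copy is used.
    return candidate if candidate in files_on_primary else "SDR"
```

In a two plane system each plane is decided independently, so one plane can start from an override file while the other uses the topology stored in the SDR.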

Files

/etc/SP/Emonitor.cfg
The list of nodes that the user wants monitored via the Emonitor daemon (not partition sensitive).

/var/adm/SPlogs/css/dist_topology.log
Contains system error messages if any occurred during the distribution of the topology file to the nodes.

Environment Variables

PSSP 3.4 provides the ability to run commands using secure remote command and secure remote copy methods.

To determine whether you are using either AIX rsh or rcp or the secure remote command and copy method, the following environment variables are used. If no environment variables are set, the defaults are /bin/rsh and /bin/rcp.

You must be careful to keep these environment variables consistent. If setting the variables, all three should be set. The DSH_REMOTE_CMD and REMOTE_COPY_CMD executables should be kept consistent with the choice of the remote command method in RCMD_PGM:

For example, if you want to run Estart using a secure remote method, enter:

export RCMD_PGM=secrshell
export DSH_REMOTE_CMD=/bin/ssh
export REMOTE_COPY_CMD=/bin/scp
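
Before running a switch command, the "all three or none" rule can be checked with a short function such as the one below. The function name is hypothetical, and the sketch checks only whether the variables are set. It does not judge whether their values agree, because the valid combinations are not listed here.

```shell
# Sketch only: checks that the three variables are either all set or all unset.
check_remote_env() {
    count=0
    [ -n "${RCMD_PGM:-}" ] && count=$((count + 1))
    [ -n "${DSH_REMOTE_CMD:-}" ] && count=$((count + 1))
    [ -n "${REMOTE_COPY_CMD:-}" ] && count=$((count + 1))
    case $count in
        0) echo "unset: defaults apply"; return 0 ;;
        3) echo "all three set"; return 0 ;;
        *) echo "inconsistent: $count of 3 set" >&2; return 1 ;;
    esac
}
```

Calling check_remote_env after the three export statements above would report that all three variables are set.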

Security

You must have root privilege to run this command.

When restricted root access (RRA) is enabled, this command can only be run from the control workstation.

Location

/usr/lpp/ssp/bin/Estart

Related Information

Commands: Eannotator, Eclock, Efence, Eprimary, Equiesce, Etopology, Eunfence, Eunpartition

Refer to IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment for details about system partition topology files.

Examples

  1. To start the SP Switch, enter:
    Estart
    
  2. To start the SP Switch and turn off automatic unfence, enter:
    Estart -autounfence 0
    
  3. On an SP Switch2 system, to start the second plane of switches, enter:
    Estart -p 1

Etopology

Purpose

Etopology - Stores or reads a switch topology file into or out of the System Data Repository (SDR).

Syntax

Etopology [-h] [-p {0|1|all}] {switch_topology_file | -d | -read output_file}

Flags

-h
Displays usage information.

-read output_file
Retrieves the current switch topology file out of the SDR and stores it in the specified switch_topology_file. If -read is not specified, the specified switch_topology_file will be stored in the SDR. If -p all is specified (or defaulted) in a two plane system, each output topology file name will be shown as:
switch_topology_file.pplane_number

-d
Specifies that the topology filename is to be determined from the contents of the SDR, based on the number and type of switches in the system. On an SP Switch system, this flag is only valid on a system which has not been partitioned.

-p {0|1|all}
Specifies for which switch plane the operation is to be performed. If not specified, the default is to perform the operation for all valid switch planes. This flag is valid only on systems with SP Switch2 switches.

Operands

switch_topology_file
Specifies the full path name of the file into which the current SDR switch topology is to be copied, or the full path name of a switch topology file to store in the SDR. A sequence number is appended to this file name when it is stored in the SDR. This is used to ensure that the appropriate topology file is distributed to the nodes of the system partition.

Description

Use this command to store or retrieve the switch_topology_file into or out of the SDR. The switch topology file is used by switch initialization when starting the switch for the current system partition. It is stored in the SDR and can be overridden by having a switch topology file in the /etc/SP directory named expected.top on the switch primary node.

If you have an SP Switch installed on your system, the current topology file is copied to each node of the subject system partition during an Estart and to each targeted node for an Eunfence.

Files

/etc/SP/expected.top.1nsb_8.0isb.0
The standard topology file for systems with a maximum of eight nodes.

/etc/SP/expected.top.1nsb.0isb.0
The standard topology file for one Node Switch Board system or a maximum of 16 nodes.

/etc/SP/expected.top.2nsb.0isb.0
The standard topology file for two NSB systems or a maximum of 32 nodes.

/etc/SP/expected.top.3nsb.0isb.0
The standard topology file for three NSB systems or a maximum of 48 nodes.

/etc/SP/expected.top.4nsb.0isb.0
The standard topology file for four NSB systems or a maximum of 64 nodes.

/etc/SP/expected.top.5nsb.0isb.0
The standard topology file for five NSB systems or a maximum of 80 nodes.

/etc/SP/expected.top.5nsb.4isb.0
The standard topology file for five NSB and four Intermediate Switch Board (ISB) systems or a maximum of 80 nodes. This is an advantage-type network with a higher bisectional bandwidth.

/etc/SP/expected.top.6nsb.4isb.0
The standard topology file for six NSB and four ISB systems or a maximum of 96 nodes.

/etc/SP/expected.top.7nsb.4isb.0
The standard topology file for seven NSB and four ISB systems or a maximum of 112 nodes.

/etc/SP/expected.top.8nsb.4isb.0
The standard topology file for eight NSB and four ISB systems or a maximum of 128 nodes.

/etc/SP/expected.top.1nsb_8.0isb.1
The standard topology file for systems with an SP Switch-8 and a maximum of eight nodes.
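
The names in the list above follow a regular pattern for the 16-node-per-NSB files. The sketch below builds such a name from board counts and computes the stated node maximum. Both function names are hypothetical, the eight-node variants (1nsb_8) are deliberately not covered, and this is not a description of how the -d flag chooses a file.

```shell
# Sketch only: builds a standard topology file name from board counts,
# following the naming pattern of the files listed above.
#   $1 nsb  number of Node Switch Boards
#   $2 isb  number of Intermediate Switch Boards
std_topology_file() {
    echo "/etc/SP/expected.top.${1}nsb.${2}isb.0"
}

# Each NSB in the list above adds 16 nodes to the stated maximum.
max_nodes() {
    echo $(( $1 * 16 ))
}

std_topology_file 6 4   # /etc/SP/expected.top.6nsb.4isb.0
max_nodes 6             # 96
```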

Security

You must have root privilege to run this command.

Location

/usr/lpp/ssp/bin/Etopology

Related Information

Commands: Eannotator, Eclock, Efence, Eprimary, Equiesce, Estart, Eunfence, Eunpartition

Refer to IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment for information on system partition configurations and topology files.

Examples

  1. To store a topology file for a system with up to 96 nodes in the SDR, enter:
    Etopology /etc/SP/expected.top.6nsb.4isb.0
    
  2. To store a topology file for a system with up to 16 nodes in the SDR, enter:
    Etopology /etc/SP/expected.top.1nsb.0isb.0
    
  3. To retrieve a topology file out of the SDR and store it to a file, enter:
    Etopology -read /tmp/temporary.top
    
  4. On an SP Switch2 system with up to 16 nodes, to store a topology file in the SDR for just the second plane, enter:
    Etopology /etc/SP/expected.top.1nsb.0isb.0 -p 1
  5. To retrieve topology files out of the SDR and store them to files in a two plane system, enter:
    Etopology -p all -read /tmp/temporary.top
    The resulting topology files will be stored in /tmp/temporary.top.p0 and /tmp/temporary.top.p1.
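
A script that post-processes the retrieved files needs to know their names in advance. The following sketch lists the names that the -read flag is described as producing in a two plane system; the function name is hypothetical and nothing is read from the SDR.

```shell
# Sketch only: lists the files that "Etopology -p all -read <file>" is
# described as producing on a two plane system.
read_output_files() {
    base=$1
    for plane in 0 1; do
        echo "$base.p$plane"
    done
}

read_output_files /tmp/temporary.top
```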

Eunfence

Purpose

Eunfence - Adds an SP node that was previously removed from the switch network back to the current active switch network.

Syntax

Eunfence [-h] | [-G] [-p {0|1|all}] node_specifier [node_specifier2] ...

Flags

-h
Displays usage information.

-G
Unfences all valid nodes in the list of nodes regardless of system partition boundaries. If the -G flag is not used, the Eunfence command will only unfence the nodes in the current system partition. All other specified nodes will not be unfenced and a nonzero return code is returned.

-p {0|1|all}
Specifies for which switch plane the operation is to be performed. If not specified, the default is to perform the operation for all valid switch planes. This flag is valid only on systems with SP Switch2 switches.

Operands

node_specifier
Specifies a list of nodes that is to rejoin the current switch network. It can be a list of host names, IP addresses, node numbers, frame,slot pairs, or a node group.
Note:
You cannot fence a node that does not have a switch adapter.

Description

Use this command to allow a node that was previously removed with the Efence command to rejoin the current switch network.

You can also use this command to allow a node to rejoin the switch network if that node was previously removed from the SP Switch network due to a switch or adapter error.

SP Switch Note:

Eunfence first distributes the current topology file to the nodes before they can be unfenced.

Note:
If a host name or IP address is used as the node_specifier for a dependent node, it must be a host name or IP address assigned to the adapter that connects the dependent node to the SP Switch. Neither the administrative host name nor the Simple Network Management Protocol (SNMP) agent's host name for a dependent node is guaranteed to be the same as the host name of its switch network interface.

Eunfence attempts to start the fault-service daemon, if it is not currently running, on a node which is to rejoin the current switch network.

Files

/var/adm/SPlogs/css/dist_topology.log
Contains system error messages if any occurred during the distribution of the topology file to the nodes.

Environment Variables

PSSP 3.4 provides the ability to run commands using secure remote command and secure remote copy methods.

To determine whether you are using either AIX rsh or rcp or the secure remote command and copy method, the following environment variables are used. If no environment variables are set, the defaults are /bin/rsh and /bin/rcp.

You must be careful to keep these environment variables consistent. If setting the variables, all three should be set. The DSH_REMOTE_CMD and REMOTE_COPY_CMD executables should be kept consistent with the choice of the remote command method in RCMD_PGM:

For example, if you want to run Eunfence using a secure remote method, enter:

export RCMD_PGM=secrshell
export DSH_REMOTE_CMD=/bin/ssh
export REMOTE_COPY_CMD=/bin/scp

Security

You must have root privilege to run this command.

When restricted root access (RRA) is enabled, this command can only be run from the control workstation.

Location

/usr/lpp/ssp/bin/Eunfence

Related Information

Commands: Eannotator, Eclock, Efence, Eprimary, Equiesce, Estart, Etopology, Eunpartition

Examples

  1. To unfence a node by IP address, enter:
    Eunfence 129.33.34.1
    
  2. To unfence two nodes by host name, enter:
    Eunfence r11n01 r11n04
    
  3. To unfence several nodes by node number, enter:
    Eunfence 34 43 20 76 40
    
  4. To unfence node 14 of frame 2 by frame,slot pair, enter:
    Eunfence 2,14
    
  5. If the current system partition has nodes with node numbers 1, 2, 5, and 6 and another system partition has nodes with node numbers 3, 4, 7, and 8, issuing the command:
    Eunfence 5 6 7 8
    
    unfences nodes 5 and 6, but not nodes 7 and 8. As a result, the command returns a nonzero return code.
  6. To successfully unfence the nodes in example 5 with the same system partitions, use the -G flag as follows:
    Eunfence -G 5 6 7 8
    
  7. On an SP Switch2 system, to unfence node number 5 on just the second switch plane, enter:
    Eunfence -p 1 5
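
The partition-boundary behavior in examples 5 and 6 can be modeled without a switch. The sketch below only simulates the node selection and return code; it does not call Eunfence. The function name and arguments are hypothetical, and it assumes every requested node is otherwise valid.

```shell
# Sketch only: simulates which nodes Eunfence would act on.
#   $1 global     1 = -G given, 0 = not given
#   $2 partition  space-separated node numbers in the current system partition
#   $3...         requested node numbers
# Prints the nodes that would be unfenced; returns nonzero if any were skipped.
simulate_unfence() {
    global=$1 partition=$2
    shift 2
    unfenced="" skipped=0
    for node in "$@"; do
        case " $partition " in
            *" $node "*) unfenced="$unfenced $node" ;;
            *)
                if [ "$global" -eq 1 ]; then
                    unfenced="$unfenced $node"
                else
                    skipped=1
                fi
                ;;
        esac
    done
    # Unquoted on purpose so the leading blank is dropped.
    echo $unfenced
    return $skipped
}

simulate_unfence 0 "1 2 5 6" 5 6 7 8 || echo "nonzero return code"
simulate_unfence 1 "1 2 5 6" 5 6 7 8
```

The first call corresponds to example 5 and the second to example 6.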

Eunpartition

Purpose

Eunpartition - Prepares a system for partitioning activity. This command should be run before creating new partitions or when merging system partitions.

Syntax

Eunpartition [-h]

Flags

-h
Displays usage information.

If a flag is not specified, Eunpartition examines the SP_NAME shell variable and selects a system partition based on its current setting.

Operands

None.

Description

The Eunpartition command is not valid on a system with an SP Switch2 switch or on a switchless clustered enterprise server system.

Use this command to prepare an SP system for a new system partition definition within an SP cluster.

This command must be executed for each system partition prior to the spapply_config command to redefine system partitions.

If you specify Eunpartition in error, it will quiesce the primary and primary backup nodes. If this occurs, you must use Estart to restart the switch.

If the Switch Administration daemon is running, it should be stopped before running the Eunpartition command. To stop the Switch Administration daemon, issue the following for an SP Switch:

stopsrc -s swtadmd

After partitioning has completed (spapply_config is complete), use the startsrc command to restart the Switch Administration daemon.
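
The order of operations described above can be summarized with a dry-run sketch that only prints the steps. The function name and the partition names are hypothetical, the spapply_config invocation is left as a comment because its arguments are not covered here, and the startsrc line assumes the same subsystem name used with stopsrc.

```shell
# Sketch only: prints the order of operations described above; nothing is executed.
# Partition names passed as arguments are hypothetical.
plan_repartition() {
    echo "stopsrc -s swtadmd"
    for partition in "$@"; do
        # Eunpartition selects the system partition from SP_NAME.
        echo "SP_NAME=$partition Eunpartition"
    done
    echo "# run spapply_config here to apply the new partition definition"
    echo "startsrc -s swtadmd"
}

plan_repartition part1 part2
```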

Environment Variables

PSSP 3.4 provides the ability to run commands using secure remote command and secure remote copy methods.

To determine whether you are using either AIX rsh or rcp or the secure remote command and copy method, the following environment variables are used. If no environment variables are set, the defaults are /bin/rsh and /bin/rcp.

You must be careful to keep these environment variables consistent. If setting the variables, all three should be set. The DSH_REMOTE_CMD and REMOTE_COPY_CMD executables should be kept consistent with the choice of the remote command method in RCMD_PGM:

For example, if you want to run Eunpartition using a secure remote method, enter:

export RCMD_PGM=secrshell
export DSH_REMOTE_CMD=/bin/ssh
export REMOTE_COPY_CMD=/bin/scp

Security

You must have root privilege to run this command.

When restricted root access (RRA) is enabled, this command can only be run from the control workstation.

Location

/usr/lpp/ssp/bin/Eunpartition

Related Information

Commands: Eannotator, Eclock, Efence, Eprimary, Equiesce, Estart, Etopology, Eunfence

Examples

To prepare the current system partition for repartitioning as specified by SP_NAME, enter:

Eunpartition

export_clients

Purpose

export_clients - Creates or updates the Network File System (NFS) export list for a boot/install server.

Syntax

export_clients [-h]

Flags

-h
Displays usage information. If the command is issued with the -h flag, the syntax description is displayed to standard output and no other action is taken.

Operands

None.

Description

Use this command to create or update the NFS export list on a boot/install server node.

Standard Error

This command writes error messages (as necessary) to standard error.

Exit Values

0
Indicates the successful completion of the command.

-1
Indicates that an error occurred.

Security

You must have root privilege to run this command.

Implementation Specifics

This command is part of the IBM Parallel System Support Programs (PSSP) Licensed Program (LP).

Location

/usr/lpp/ssp/bin/export_clients

Related Information

Commands: setup_server

Examples

To create or update the NFS export list on a boot/install server node, enter:

export_clients

ext_srvtab

Purpose

ext_srvtab - Extracts service key files from the Kerberos Version 4 authentication database.

Syntax

ext_srvtab [-n] [-r realm] [instance ...]

Flags

-n
If specified, the master key is obtained from the master key cache file. Otherwise, ext_srvtab prompts the user to enter the master key interactively.

-r
If specified, the realm fields in the extracted file match the given realm rather than the local realm.

Operands

instance
Specifies an instance name. On the SP system, service instances consist of the short form of the network names for the hosts on which the service runs.

Description

The ext_srvtab command extracts service key files from the Kerberos Version 4 authentication database. The master key is used to extract service key values from the database. For each instance specified on the command line, the ext_srvtab command creates a new service key file in the current working directory with a file name of instance-new-srvtab which contains all the entries in the database with an instance field of instance. This new file contains all the keys registered for instances of services defined to run on that host. A user must have read access to the authentication database to execute this command. This command can only be issued on the system on which the authentication database resides.

Files

instance-new-srvtab
Service key file generated for instance.

/var/kerberos/database/principal.pag, /var/kerberos/database/principal.dir
Files containing the authentication database.

/.k
Master key cache file.

Security

You must have root privilege to run this command.

Location

/usr/lpp/ssp/kerberos/etc/ext_srvtab

Related Information

Commands: kadmin, ksrvutil

Refer to the "RS/6000 SP files and other technical information" section of PSSP: Command and Technical Reference for additional Kerberos information.

Examples

If a system has three network interfaces named as follows:

ws3e.abc.org
ws3t.abc.org
ws3f.finet.abc.org

to re-create the server key file on this workstation (that is an SP authentication server), user root could do the following:

# Create a new key file in the /tmp directory for each instance.
cd /tmp
/usr/kerberos/etc/ext_srvtab -n ws3e ws3t ws3f
# Combine the instance files into a single file for the host name.
/bin/cat ws3e-new-srvtab ws3t-new-srvtab ws3f-new-srvtab \
   >/etc/krb-srvtab
# Delete the temporary files and protect the key file.
/bin/rm ws3e-new-srvtab ws3t-new-srvtab ws3f-new-srvtab
/bin/chmod 400 /etc/krb-srvtab
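
When the interface names are held in a list, the instance names and the resulting key file names can be derived rather than typed by hand. The sketch below shows one way to do that. Both function names are hypothetical, and it assumes the short form of a network name is everything before the first dot.

```shell
# Sketch only: derives the file name ext_srvtab is described as creating.
short_instance() {
    # Keep everything before the first dot.
    echo "${1%%.*}"
}

srvtab_file() {
    echo "$(short_instance "$1")-new-srvtab"
}

for host in ws3e.abc.org ws3t.abc.org ws3f.finet.abc.org; do
    srvtab_file "$host"
done
```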

fencevsd

Purpose

fencevsd - Prevents an application running on a node or group of nodes from accessing a virtual shared disk or group of virtual shared disks.

Syntax

fencevsd {-a | -v vsd_name_list} -n node_list

Flags

-a
Specifies all virtual shared disks.

-v vsd_name_list
Specifies one or more virtual shared disk names, separated by commas.

-n node_list
Specifies one or more node numbers, separated by commas.

Operands

None.

Description

Under some circumstances, the system may believe a node has stopped functioning and begin recovery procedures, when the node is actually operational, but cut off from communication with other nodes running the same application. In this case, the problem node must not be allowed to serve requests for the virtual shared disks it normally serves until recovery is complete and the other nodes running the application recognize the problem node as operational. The fencevsd command prevents the problem node from filling requests for its virtual shared disks.

This command can be run from any node where the IBM Recoverable Virtual Shared Disk subsystem is running.

Security

You must be in the AIX bin group and have write access to the SDR to run this command.

Prerequisite Information

PSSP: Managing Shared Disks

Location

/usr/lpp/csd/bin/fencevsd

Related Information

Commands: lsfencevsd, lsvsd, unfencevsd, updatevsdtab, vsdchgserver

Refer to PSSP: Managing Shared Disks for information on how to use this command in writing applications.

Examples

To fence the virtual shared disks vsd1 and vsd2 from node 5, enter:

fencevsd -v vsd1,vsd2 -n 5
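
Because both lists must be comma-separated, a script that holds disk names and node numbers as space-separated words has to join them first. The following sketch prints the resulting command line without running it. The function name is hypothetical, and the word "all" is used here only as an illustrative stand-in for the -a flag.

```shell
# Sketch only: prints a fencevsd command line; it does not run fencevsd.
#   $1  space-separated virtual shared disk names, or "all" for -a
#   $2  space-separated node numbers
build_fencevsd() {
    nodes=$(echo $2 | tr ' ' ',')
    if [ "$1" = "all" ]; then
        echo "fencevsd -a -n $nodes"
    else
        vsds=$(echo $1 | tr ' ' ',')
        echo "fencevsd -v $vsds -n $nodes"
    fi
}

build_fencevsd "vsd1 vsd2" "5"
```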

get_keyfiles

Purpose

get_keyfiles - Initiates transfer of the Kerberos V4 srvtab file from the control workstation to the newly created node for Kerberos V4 authentication.

Syntax

get_keyfiles {keyfile server}

Flags

None.

Operands

keyfile
Specifies the Kerberos V4 srvtab file for the node executing the command.

server
Specifies the control workstation that holds the Kerberos V4 srvtab service keyfiles for all SP nodes.

Description

This program stops all getty processes and removes the "cons" entry from inittab so that it cannot restart until this command is completed. get_keyfiles sends a request to the control workstation, and listens on /dev/tty0 for keyfiles. Keyfiles are sent in a uuencoded format over s1term by the control workstation. This program will uudecode each keyfile to its original format and place it in the /spdata/sys1/k4srvtabs directory for the calling program. Once the keyfile transfer is completed, the "cons" entry is added back to inittab, and inittab is refreshed.

Files

The log file /var/adm/SPlogs/get_keyfiles/get_keyfiles.log is created.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error or errors occurred.

An unsuccessful run of this command, depending on where an error occurred, will result in the keyfile transfer being unsuccessful.

Security

You must have root privilege to run this command.

Restrictions

This program works with the kfserver function on the control workstation. The kfserver function uses the s1term in write mode. Only one s1term in write mode to a node is allowed at any given time.

Location

/usr/lpp/ssp/bin/get_keyfiles

Examples

To get file c58n01-new-srvtab from c58s.ppd.pok.ibm.com (the control workstation), enter:

get_keyfiles c58n01-new-srvtab c58s.ppd.pok.ibm.com

get_vpd

Purpose

get_vpd - Consolidates the Vital Product Data (VPD) files for the nodes and writes the information to a file and optionally to a diskette.

Syntax

get_vpd [-h] [-d] [-m model_number -s serial_number]

Flags

-h
Displays usage information.

-d
Specifies that the Vital Product Data file will be written to a diskette.

-m model_number
Specifies the machine type model number. The value of the model number is "MMx", where MM is the class of the machine:
20
No switch, 2 - 64 nodes
2A
No switch, 2 - 8 nodes, 49 inch height
3A
8-port switch, 2 - 8 nodes, 49 inch height
38
8-port switch, 2 - 8 nodes, 79 inch height
30
Single-staged switching, 2 - 80 nodes
40
Dual-staged switching, 62 - 128 nodes
500
49 inch frame
550
79 inch frame
50H
49 inch legacy frame with conversion for PCI nodes
55H
Legacy frame conversion from 2XX,3XX,4XX to support PCI nodes
555
Multiple NSB frame for the SP Switch2 in SP or clustered enterprise server systems

-s serial_number
Specifies the serial number. The value of the serial_number is "pp00sssss", where:

pp
Is 02 for machines built in the US (Poughkeepsie) and 51 for machines built in EMEA (Montpellier).

00
Is a mandatory value.

sssss
Is the serial number of the machine.
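
A value can be checked against the "pp00sssss" layout before it is passed to the -s flag. The sketch below is one way to do that. The function name is hypothetical, and it assumes that sssss is exactly five decimal digits, which the layout suggests but the text does not state.

```shell
# Sketch only: checks a value against the "pp00sssss" layout described above.
# Assumes sssss is exactly five digits.
valid_serial() {
    case $1 in
        0200[0-9][0-9][0-9][0-9][0-9]|5100[0-9][0-9][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

valid_serial 020077650 && echo "020077650 ok"
valid_serial 030077650 || echo "030077650 rejected"
```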

Description

Use this command to consolidate the Vital Product Data (VPD) for the nodes in the RS/6000 SP into a file and to optionally write the file to diskette. The diskette created by this command is sent to IBM manufacturing when an upgrade to the RS/6000 SP hardware is desired. This diskette is used by manufacturing and marketing to configure an upgrade of the RS/6000 SP.

The get_vpd command is issued by IBM field personnel to capture VPD information after an upgrade of the system. All installation and configuration of the RS/6000 SP must be complete prior to issuing the get_vpd command.

If the -m and -s flags are not specified, the machine type model number and serial number must be available from the SP class in the SDR. To determine if these values are set, run the splstdata -e command. These values will be set when the get_vpd command is run with the -m and -s flags, or they can be set using the spsitenv command.

Files

/var/adm/SPlogs/SPconfig/node_number.umlc
Files used as input to this command.

/var/adm/SPlogs/SPconfig/serial_number.vpd
Output file generated by this command.

This command creates the /var/adm/SPlogs/SPconfig/serial_number.vpd file and optionally writes the file to a diskette.

Standard Error

This command writes all error messages to standard error.

Exit Values

0
Indicates the successful completion of the command.

1
Indicates that an error occurred while processing the VPD information and the command did not complete successfully.

Security

You must have root privilege and hardmon access to run this command.

Restrictions

This command can only be issued on the control workstation.

The command always generates output in English using the default C locale. It ignores the current operating locale.

Implementation Specifics

This command is part of the IBM Parallel System Support Programs (PSSP) Licensed Program (LP) ssp.basic file set.

Prerequisite Information

IBM RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment

Location

/usr/lpp/ssp/install/bin/get_vpd

Examples

  1. This example shows the creation of a file containing all of the node VPD information for a model type of 204 and a serial number of 020077650. The output is written to /var/adm/SPlogs/SPconfig/020077650.vpd.
    get_vpd -m 204 -s 020077650
    
  2. This example shows the creation of a file containing all of the node VPD information for a model type of 306 and a serial number of 510077730. The output is written to /var/adm/SPlogs/SPconfig/510077730.vpd and also to diskette.
    get_vpd -m 306 -s 510077730 -d
    

