
Administration Guide


Automatic node unfence

As of the PSSP 3.1 release, nodes by default rejoin the switch communication fabric without any explicit action by the operator. This differs from the default in earlier releases, which required the operator to run either the Estart or the Eunfence command before nodes could communicate on the switch fabric. This function is built into the fault service daemon and replaces the Emonitor daemon function. Note that the autojoin attribute of the SDR switch_responds class is set whenever a node joins the switch fabric. When the attribute is set, it signals the switch primary node to unfence that node once the node is fully operational.

With automatic unfence, if you want to fence a node off the switch fabric and not have it rejoin, run the Efence command, which turns off the autojoin attribute in the switch_responds class. If a node's autojoin attribute is not set, the fault service daemon will not unfence the node, either during Estart or automatically. The node remains fenced until it is unfenced with the Eunfence command or until the autojoin attribute is set in the SDR.

The following example shows all the states that a node switch_responds object can be in and how the primary treats each one. An X means that the value of the attribute does not matter.

node_number  switch_responds autojoin     isolated     Description
     1            1            X            X          up on switch
     2            0            0            1          Fenced isolated
     3            0            1            1          Fenced with autojoin
 

In an SP Switch2 configuration, the pertinent attributes in a node switch_responds object for one plane are switch_responds0, autojoin0, and isolated0, and for a second plane are switch_responds1, autojoin1, and isolated1. Also, you can fence a node off one switch plane or both switch planes by using the -p flag of the Efence command.

When a node is up on the switch fabric, it does not matter how the isolated or autojoin attributes are set. The node remains on the switch until it is fenced, rebooted, or shut down. The opposite is true of a node that is fenced "isolated": it remains off the switch fabric until it is unfenced or its autojoin attribute is set. Nodes that are fenced with their autojoin attribute set are unfenced automatically by the switch primary.
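
The three states in the table can be sketched as a small shell function. This is only an illustration of the rules described above, not part of PSSP: the function name classify_switch_state and the sample rows are hypothetical, and the input is assumed to arrive as three 0/1 values in the table's column order.

```shell
#!/bin/sh
# Illustrative only: classify one node's switch_responds object using the
# rules from the table. Arguments: switch_responds autojoin isolated (0 or 1).
classify_switch_state() {
    case "$1$2$3" in
        1??) echo "up on switch" ;;           # autojoin and isolated do not matter
        001) echo "fenced isolated" ;;        # stays fenced until unfenced or autojoin is set
        011) echo "fenced with autojoin" ;;   # the switch primary unfences it automatically
        *)   echo "state not covered by the table" ;;
    esac
}

# Sample rows (node_number switch_responds autojoin isolated); made-up values.
while read -r node responds autojoin isolated; do
    printf 'node %s: %s\n' "$node" \
        "$(classify_switch_state "$responds" "$autojoin" "$isolated")"
done <<'EOF'
1 1 0 0
2 0 0 1
3 0 1 1
EOF
```

Combinations that the table does not list are reported as such rather than guessed at.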

If you do not want nodes to join the switch automatically, you can turn off automatic unfence. The behavior is then the same as the default of PSSP 2.4 and earlier releases: the autojoin attribute is turned off whenever a node joins the switch, and a fenced node remains fenced until it is unfenced using the Eunfence command. You can still fence individual nodes with the autojoin option, which allows the node to be unfenced automatically when the node is rebooted or when the fault service daemon is restarted.

The default is to have automatic unfence enabled. To turn automatic unfence off (or to enable it again after having turned it off), use the Estart command. In a two-plane SP Switch2 configuration, you can unfence a node on one switch plane by using the -p flag or on both by letting it default. To turn automatic unfence off for both switch planes, run the command:

     Estart -autounfence 0

To turn automatic unfence on, for both switch planes in a two-plane SP Switch2 configuration, run the command:

     Estart -autounfence 1

See the Efence, Eprimary, and Estart commands in the book PSSP: Command and Technical Reference for more information about using automatic unfence.

Fence the node from the switch

To keep a node that is about to undergo maintenance or service off the switch fabric, run the Efence command on the control workstation. In a two-plane SP Switch2 configuration, you can fence a node on one switch plane by using the -p flag or on both by letting it default. For example, to fence node1 and node2 off the switch, for both switch planes in a two-plane SP Switch2 configuration, run the command:

Efence node1 node2

Once the command completes, powering the node up, powering it down, or rebooting it will not affect the switch.

Return the node to the switch

To bring a fenced node back into the switch fabric, use the Eunfence command on the control workstation. In a two-plane SP Switch2 configuration, you can unfence a node on one switch plane by using the -p flag or on both by letting it default. For example, to have node1 operating with the switch again, for both planes in a two-plane SP Switch2 configuration, run the command:

Eunfence node1

Alternatively, you can use the SP Hardware perspective to unfence the node if it is already powered on. If the node is powered off, you can have it brought back on the switch automatically when you power it on: display the Power On or Cluster Power On dialog for the specific node, select a power on option, and select the check box of the "Enable Autojoin - automatically bring the node back in the switch after it is powered up" option.

The rc.switch /etc/inittab entry

The rc.switch entry in the /etc/inittab file can be changed from the default of once to wait. This change, in combination with the autojoin function, stops the execution of commands in the subsequent entries of the /etc/inittab file until the css0 IP interface to the switch comes up. In a two-plane switch configuration, execution is stopped until the css1 IP interface also comes up. This allows entries that require the switch to be up to be placed after the rc.switch entry in the file.

The once parameter in the following example indicates that execution of the subsequent entries in the /etc/inittab file continues without waiting for the switch's css0 IP interface (and css1 in a two-plane configuration) to come up:

fsd:2:once:/usr/lpp/ssp/css/rc.switch

The wait parameter in the following example indicates that execution of the subsequent entries in the /etc/inittab file waits for the switch IP interface to come up before continuing:

fsd:2:wait:/usr/lpp/ssp/css/rc.switch
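
To check which of the two behaviors an existing entry gives, the action field (the third colon-separated field) can be inspected. The following is a minimal sketch, assuming the entry text is passed in as a string; the function names inittab_action and rc_switch_blocks_boot are hypothetical. On AIX the entry text would typically come from a command such as lsitab fsd.

```shell
#!/bin/sh
# Illustrative only: print the action field (third colon-separated field)
# of an inittab entry given as a single string.
inittab_action() {
    printf '%s\n' "$1" | awk -F: '{ print $3 }'
}

# Report whether later inittab entries are held back until rc.switch returns.
rc_switch_blocks_boot() {
    case "$(inittab_action "$1")" in
        wait) echo "yes" ;;      # init waits for rc.switch to finish
        once) echo "no" ;;       # init starts rc.switch and moves on
        *)    echo "unknown" ;;  # some other action; not covered here
    esac
}

rc_switch_blocks_boot "fsd:2:wait:/usr/lpp/ssp/css/rc.switch"
rc_switch_blocks_boot "fsd:2:once:/usr/lpp/ssp/css/rc.switch"
```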

After starting the switch fault service daemon, rc.switch waits for the unfence function to bring up the switch IP interface. It waits for only three consecutive switch scan periods, or approximately 6 minutes. After that time, rc.switch returns, allowing the rest of /etc/inittab to be executed.
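
The bounded wait described above can be sketched as a polling loop. This is not the actual rc.switch logic, only an illustration of waiting for a fixed number of periods and then giving up. The function name wait_for_interface is hypothetical, the caller supplies the check command, and the 120-second period is an assumption derived from "three scan periods, or approximately 6 minutes".

```shell
#!/bin/sh
# Illustrative only: wait_for_interface CHECK_CMD [PERIODS] [PERIOD_SECONDS]
# Runs CHECK_CMD once per period. Returns 0 as soon as CHECK_CMD succeeds,
# or non-zero if it still fails after PERIODS periods have elapsed.
wait_for_interface() {
    check_cmd=$1
    periods=${2:-3}            # three scan periods, as in the text
    period_seconds=${3:-120}   # assumed: 3 x 120 s is about 6 minutes
    i=0
    while [ "$i" -lt "$periods" ]; do
        if $check_cmd; then
            return 0           # interface is up; stop waiting
        fi
        sleep "$period_seconds"
        i=$((i + 1))
    done
    $check_cmd                 # final check after the last period
}

# Demonstration with a check that always succeeds, so it returns at once.
wait_for_interface true && echo "interface up"
```

Passing the check as a parameter keeps the sketch independent of how interface state is actually determined on a given system.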

