Administration Guide

Updating the host_responds class (the hrd daemon)

The host_responds class in the SDR is updated automatically by the hrd daemon. The hrd daemon is started by the hr script, which runs under System Resource Controller (SRC) control. If either hr or the hrd daemon is killed, hr respawns. Both hr and hrd run on the control workstation, and both are in the /usr/lpp/ssp/bin directory. (The hrctrl script provides the same function as the hr script, but it follows the syntax of the syspar_ctrl command.)

The hrd daemon monitors the SP nodes using the Event Management subsystem. The hrd daemon acts as an Event Management client on the control workstation, using the EMAPI to subscribe to events pertaining to the IBM.PSSP.Membership.LANAdapter.state resource variables for the SP Ethernet adapter on each node. Event Management notifies the hrd daemon when a node's adapter comes up or goes down, and hrd updates the node's host_responds variable based on this value. For example, a node's host_responds is set to 1 when its SP Ethernet adapter is reported as up and is set to 0 when its SP Ethernet adapter is reported as down.

A second method by which hrd can get node status is to use fping. fping is a program that pings many nodes asynchronously. This second method also issues snmpinfo calls to each node, to detect a situation where ping succeeds but the node is not responding to user requests.
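The two-stage idea behind the fping method (a node counts as responding only if it answers a ping and also answers a higher-level query) can be sketched as follows. This is a minimal illustration, not the hrd implementation: node_responds, PING_PROBE, SERVICE_PROBE, ping_probe, and service_probe are hypothetical names, and service_probe is only a placeholder for a higher-level query such as the snmpinfo call.

```shell
# Sketch only: hypothetical names, not part of PSSP.
# PING_PROBE and SERVICE_PROBE each name a command that takes a host
# name and exits 0 on success.
: "${PING_PROBE:=ping_probe}"
: "${SERVICE_PROBE:=service_probe}"

# Stage 1: one ICMP echo request to the host.
ping_probe() { ping -c 1 "$1" >/dev/null 2>&1; }

# Stage 2: placeholder for a higher-level query.
# It always succeeds here; replace it with a real check.
service_probe() { return 0; }

# Print 1 if the host passes both stages, otherwise print 0.
node_responds() {
    host=$1
    if ! "$PING_PROBE" "$host"; then
        echo 0          # no answer to ping
        return
    fi
    if ! "$SERVICE_PROBE" "$host"; then
        echo 0          # ping succeeded, but the node is not serving requests
        return
    fi
    echo 1
}
```

The second stage is what catches a node whose network stack still answers ping while the node itself is not doing useful work.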

The hr script determines which method is used. It sets an environment variable named HR_FPING to 0 for the heartbeat method (Event Management monitoring) or to 1 for fping. The default is 0.

The line that chooses between the methods looks like this:

typeset -x HR_FPING=0

After changing this line in the hr script, reset the hrd daemon by issuing hr reset. The daemon respawns and the change takes effect. Other options for configuring hrd are documented in comments in the hr script (in /usr/lpp/ssp/bin).
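The edit described above can be sketched as a small shell function. It assumes that the selection line appears in the file in the form shown (typeset -x HR_FPING=0 or typeset -x HR_FPING=1). The function name set_hr_fping is hypothetical and is not part of PSSP. Working on a copy of the script first is a suggested precaution, not a documented requirement.

```shell
# Sketch only: set_hr_fping is a hypothetical helper, not part of PSSP.
# Usage: set_hr_fping FILE VALUE   (VALUE is 0 for heartbeat, 1 for fping)
set_hr_fping() {
    file=$1
    value=$2
    case "$value" in
        0|1) ;;
        *) echo "HR_FPING must be 0 (heartbeat) or 1 (fping)" >&2; return 1 ;;
    esac
    tmp="$file.tmp.$$"
    # Change only the line that sets HR_FPING; leave every other line alone.
    # Writing back with cat keeps the file's existing permissions.
    sed "/^[[:space:]]*typeset -x HR_FPING=[01]/s/HR_FPING=[01]/HR_FPING=$value/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, after copying hr to a scratch file, set_hr_fping ./hr.copy 1 selects fping in the copy. Once the edited script is in place, issuing hr reset restarts hrd with the new setting.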

The second method (fping) is provided because its characteristics differ from those of the system heartbeat. It does not require a daemon on each node, so it may slightly reduce CPU usage on the nodes.
