IBM Books

Managing Shared Disks


Making your application recoverable

You can code your program to use the hc daemon of IBM Recoverable Virtual Shared Disk to aid in application recovery. A data management subsystem is the type of program that might use hc's services.

Your program should include the following hc.h header file:

#include <hc.h>

Your program connects to a socket where hc is listening. If a node fails or the client application on a node fails, hc sends a reconfiguration message according to the protocol defined in the hc.h header file. Your program should check for messages from that socket and react accordingly.

The hc subsystem can support multiple applications by running multiple instances of hc. Each hc is invoked by a different hc.vsd script.

To create a new instance of the hc subsystem, make a copy of the hc.vsd script (from /usr/lpp/csd/bin) and rename it; for example, hc.myappname. Then edit the newly created hc.myappname and change the INSTANCE variable to myappname. You should also export the path to the well-known socket for communication with your application by inserting the following line into the script:

export CLIENT_PATH=myappath/

where myappath is an absolute path name. The hc subsystem then creates the socket myappath/myappname to communicate with your application.
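The edits described above might look like the following sketch. The directory /var/myapp is a hypothetical stand-in for myappath, and SOCKET_PATH is an illustrative variable only (hc itself is not known to use it); it simply shows how the socket name is formed from the two values.

```shell
# Copy step described in the text (shown as a comment, not executed here):
#   cp /usr/lpp/csd/bin/hc.vsd /usr/lpp/csd/bin/hc.myappname

# Lines changed or added in the copied script hc.myappname.
# /var/myapp is a hypothetical absolute directory standing in for myappath.
INSTANCE=myappname
export CLIENT_PATH=/var/myapp/

# Illustration only: the socket name is CLIENT_PATH followed by INSTANCE.
SOCKET_PATH="${CLIENT_PATH}${INSTANCE}"
echo "$SOCKET_PATH"   # prints /var/myapp/myappname
```

The trailing slash on CLIENT_PATH matters in this sketch, because the instance name is appended directly to it.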

As a convenience, the hc program runs an optional script, /usr/lpp/csd/bin/hc.activate, once when it initializes (note that /usr/lpp/csd/bin/hc.deactivate automatically runs first). The hc.activate script is passed two parameters: the local node number and the path name of the UNIX domain stream socket that hc serves. Your recovery program is started according to the value you gave INSTANCE. If INSTANCE=myappname, for example, hc calls myappname.activate.

If the script doesn't complete, hc won't communicate with the application. An application that checks connectivity through hc running on the other clients can't perform that check until the hc.activate script completes. If you need hc to continue processing, fork a background process from the hc.activate script. Then exit the hc.activate script with an exit code of 1, which indicates that communication with the application does not have to be complete for hc to continue processing. hc.activate is executed only once.
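The background-and-exit pattern described above can be sketched as follows. This is a minimal illustration, not the shipped script: the function name hc_activate, the RECOVERY_CMD variable, and the program name my_recovery_program are all hypothetical. The body is written as a function so that it can be tried in isolation; in a real activate script the final statement would be exit 1 rather than return 1.

```shell
# Sketch of an activate script body.
# $1: local node number
# $2: path name of the UNIX domain stream socket served by hc
hc_activate() {
    node_number=$1
    socket_path=$2

    # Start the long-running recovery work in the background so that
    # the script itself can finish promptly.
    # RECOVERY_CMD and my_recovery_program are hypothetical names.
    "${RECOVERY_CMD:-my_recovery_program}" "$node_number" "$socket_path" \
        >/dev/null 2>&1 &

    # A status of 1 indicates that communication with the application
    # does not have to be complete for hc to continue processing.
    return 1
}
```

Redirecting the background job's output keeps it from holding the script's standard output open, which is one common reason a caller appears to hang on a script that has already finished.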

The hc program also runs the /usr/lpp/csd/bin/hc.deactivate script when it detects that its client has failed. The script receives the same parameters as /usr/lpp/csd/bin/hc.activate, which lets you restart the client if you want to. The client may fail either by exiting or by failing to respond to a ping message within 10 minutes.

Note:
Ten minutes is the default for the PING_DELAY. That is, if a client application fails to respond to a ping sent from the hc program within 10 minutes, hc will consider the application to have failed and will invoke hc.deactivate.

After hc.deactivate runs, hc sends a node-down message to the other nodes.

Some application programs may need a longer PING_DELAY under some system loads. To change the PING_DELAY, edit the hc.vsd script. For example, to increase the PING_DELAY to 15 minutes (900 seconds), change the line in the hc.vsd script that begins export SCRIPT_PATH to:

export SCRIPT_PATH=...PING_DELAY=900
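Because the example sets 900 for a 15-minute delay, the value appears to be expressed in seconds. Under that assumption, a small helper (ping_delay_seconds is a hypothetical name, not part of the product) converts minutes into the value to place in the script:

```shell
# Convert a delay in whole minutes to seconds.
# Assumes PING_DELAY is given in seconds, as the 15-minute example implies.
ping_delay_seconds() {
    echo $(( $1 * 60 ))
}

ping_delay_seconds 15   # prints 900
ping_delay_seconds 10   # prints 600, the 10-minute default
```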

If hc fails, the application will receive a zero-length message over the socket and should shut itself down.

Applications are responsible for cleaning up system resources when they complete or fail; hc.deactivate should be used for this purpose.
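A cleanup-oriented deactivate script might be sketched as shown below. Everything application-specific here is an assumption made for illustration: the function name hc_deactivate, the LOCK_FILE and RESTART_CMD variables, and the lock file path. Only the two parameters come from the description of hc.activate and hc.deactivate.

```shell
# Sketch of a deactivate script body.
# $1: local node number
# $2: path name of the UNIX domain stream socket served by hc
hc_deactivate() {
    node_number=$1
    socket_path=$2

    # Release a resource the failed client may have left behind.
    # LOCK_FILE and its default path are hypothetical.
    rm -f "${LOCK_FILE:-/tmp/myappname.lock}"

    # Optionally restart the client in the background.
    # RESTART_CMD is a hypothetical variable naming the client program.
    if [ -n "${RESTART_CMD:-}" ]; then
        "$RESTART_CMD" "$node_number" "$socket_path" >/dev/null 2>&1 &
    fi
}
```

Keeping the cleanup steps safe to repeat (rm -f does not fail if the file is already gone) fits the fact that hc.deactivate also runs automatically before hc.activate at initialization, when there may be nothing to clean up.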

