Disk adapter failure handling by HACMP/6000
ITEM: RTA000095929
Could you please advise me on the capability of HACMP/6000 to handle a
disk adapter failure?
In the following sample configuration, please explain how to make
HACMP detect a failure of disk adapter (a) or its cable and have system B
take over for system A (please assume this configuration is mode 1).
Or, if this is not possible with HACMP, why not?
-------O--O--------------------------------O--O------------- En
       |  |                                |  |
  ------------       heartbeat       ------------
  | System A |o---------------------o| System B |
  |          |         9333          |          |
  |          |     -------------     |          |
  |     -----|     |     |     |     |-----|    |
  ------------     -------------     ------------
ANSWER
HACMP/6000 is not designed to handle disk or disk adapter failures
directly. These failures should be covered by careful hardware planning
and setup, and by using the LVM mirroring facilities of AIX. Disks (and
disk adapters) are often viewed as more likely to fail than CPUs, so an
HACMP cluster should always be designed with the shared volume group(s)
mirrored. This mirroring is most effective when it is done across
separate disk strings (so that the failure of a disk adapter is handled
automatically), as shown in the modified diagram below:
-------O--O--------------------------------O--O------------- En
       |  |                                |  |
  ------------       heartbeat       ------------
  | System A |o---------------------o| System B |
  |          |         9333          |          |
  |          |     -------------     |          |
  |     -----|     |     |     |     |-----     |
  |          |     -------------     |          |
  |          |         |   |         |          |
  |          |        MIRROR         |          |
  |          |         |   |         |          |
  |          |     -------------     |          |
  |     -----|     |     |     |     |-----     |
  ------------     -------------     ------------
                       9333
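
The mirroring rule above can be expressed as a simple check: after any single adapter fails, at least one mirror copy must still be reachable. The following is only an illustrative sketch, not an HACMP or LVM interface. The function name, the disk names (hdisk1 and so on) and the adapter names (adapter0, adapter1) are all hypothetical, and the model considers the adapters of one node only.

```python
from typing import Dict, List


def survives_adapter_failure(copies: List[List[str]],
                             disk_to_adapter: Dict[str, str]) -> bool:
    """Return True if at least one mirror copy stays reachable
    after any single adapter fails (hypothetical planning check)."""
    if not copies:
        return False
    adapters = {disk_to_adapter[d] for copy in copies for d in copy}
    for failed in adapters:
        # A copy survives only if none of its disks sit behind the failed adapter.
        if not any(all(disk_to_adapter[d] != failed for d in copy)
                   for copy in copies):
            return False
    return True


# Hypothetical layout: two disks on one string, a third on a separate string.
layout = {"hdisk1": "adapter0", "hdisk2": "adapter0", "hdisk3": "adapter1"}
same_string = [["hdisk1"], ["hdisk2"]]       # both copies behind adapter0
separate_strings = [["hdisk1"], ["hdisk3"]]  # copies behind different adapters

print(survives_adapter_failure(same_string, layout))       # False
print(survives_adapter_failure(separate_strings, layout))  # True
```

Mirroring within a single string fails the check because both copies depend on the same adapter; mirroring across separate strings passes it.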
If you kept your original configuration, you could mirror within the
single disk string, but you would remain exposed to a power supply
failure on the 9333, which would cause the cluster to fail. Since each
9333 adapter has 4 connectors, adding a second 9333 tower connected to
the single 9333 adapter in each machine, with disks mirrored across the
strings, would protect you from the failure of either 9333 tower. With
the Error Notification facility of HACMP 1.2 and later, you could have
the system respond to an error log record of a 9333 adapter failure by
forcing a node failure, which moves access to the disks over to the other
system with the good disk adapter.
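
As a rough illustration of the Error Notification approach, the sketch below renders a stanza for the AIX errnotify ODM object class, which is the kind of definition that error notification relies on. It is a sketch under stated assumptions, not a tested configuration: the object name, the resource name serdasda0 and the notify method path /usr/local/cluster/adapter_fail.sh are hypothetical and must be replaced with values from your own system. What the notify method does to bring the node down is site-specific and is not shown.

```python
def errnotify_stanza(name: str, resource: str, method: str,
                     err_class: str = "H", err_type: str = "PERM") -> str:
    """Render an errnotify stanza as text (all values here are assumptions)."""
    fields = [
        ("en_name", name),            # hypothetical object name
        ("en_persistenceflg", 1),     # keep the object across reboots
        ("en_class", err_class),      # "H" = hardware error class
        ("en_type", err_type),        # "PERM" = permanent error
        ("en_resource", resource),    # assumed adapter resource name
        ("en_method", method),        # hypothetical notify method
    ]
    lines = ["errnotify:"]
    for key, value in fields:
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        lines.append(f"    {key} = {rendered}")
    return "\n".join(lines) + "\n"


stanza = errnotify_stanza(
    name="adapter_fail_9333",
    resource="serdasda0",
    method="/usr/local/cluster/adapter_fail.sh $1",  # $1 = error log sequence number
)
print(stanza)
```

The resulting text could be saved to a file and loaded with odmadd, or the equivalent definition could be entered through the HACMP Error Notification SMIT panels.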
Search keywords:
HACMP DISK ADAPTER FAILURE MIRROR MIRRORING
Dated: 11/1996 Category: ITSAIHA6000