RE-SYNCHRONIZING STALE COPIES AFTER A DISK FAILURE.
ITEM: RTA000022527
QUESTION:
I have a model 550 with 3 9334's attached to my system. Each
of the disks is attached to a separate SCSI controller. On each disk
is a copy of a very large database - in essence, triple mirroring.
I have researched how the disks are re-synched after the failure of
a physical disk. I need to know how the sync process works - in
other words, if the system is asked to re-sync the new disk once it is
brought back online - and a large database is involved - how much system
overhead can I expect this process to take? Will the process take all
system resources to re-sync the disk or is there a way to control the
process in such a way so as not to impact the user's operations? Is
there such a control in the new version?
---------- ---------- ---------- --------- ---------- ----------
A: If a disk fails and copies exist on that disk, then the data is
written to the remaining available copies, and the volume group
descriptor area (VGDA) on the available disks is updated to indicate
that the physical partition on the failed disk is "stale". As
writes are attempted to other physical partitions on the failed
disk, each of those partitions is marked stale in turn.
When the volume group is re-varied on, or if the "syncvg" command is
run, the VGDA is always checked to see if stale partitions exist. If
so, the data is read from the most recently written copy and then
written over the stale partition on the recovered disk. In this
manner, the system needs to resync ONLY the partitions on the
recovered disk that differ from the most recent copy.
Each physical partition on the disk is 4 MB by default. This means
that to resync a single partition, you must read 4 contiguous MB from
one disk and write that same data to the stale PP on the other disk.
The total resync time is roughly the time needed to copy one partition
multiplied by the number of stale physical partitions. I can't
tell you the exact time it takes to read and write 4 MB of data
between disks, because this depends on the type of drive, the
SCSI connection, other CPU or Micro Channel activity, etc., but this
should give you some idea.
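
As a rough illustration of that arithmetic, here is a minimal shell
sketch. The 4 MB partition size is the default mentioned above; the
function name, the count of 500 stale partitions, and the figure of 2
seconds per partition are hypothetical values chosen only for the
example, so substitute numbers measured on your own hardware.

```shell
#!/bin/sh
# Rough resync estimate: stale partitions x time to copy one partition.
# PP_SIZE_MB=4 is the default partition size; the per-partition copy
# time is an assumed figure that you would measure yourself.
PP_SIZE_MB=4

estimate_resync() {
    stale_pps=$1       # number of stale physical partitions
    secs_per_pp=$2     # assumed seconds to read and write one partition
    total_mb=$((stale_pps * PP_SIZE_MB))
    total_secs=$((stale_pps * secs_per_pp))
    echo "${total_mb} MB to copy, about ${total_secs} seconds"
}

# Hypothetical example: 500 stale partitions at 2 seconds each.
estimate_resync 500 2
```

The result is only a lower bound on a busy system, since competing
I/O will stretch the per-partition copy time.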
The resynchronization process will not halt other system activity,
but will cause a great deal of I/O activity, possibly affecting the
performance of other processes. If you wish to postpone the sync
process, you can vary on the volume group after returning the failed
drive using "varyonvg -n VGname". The "-n" flag tells varyonvg not
to run the sync process. You may then run the "syncvg -v VGname"
command to resynchronize stale partitions at your convenience,
possibly under a "nice" value so that it has less impact on other
system activity.
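
The deferred resync described above might be scripted along these
lines. This is a sketch, not a tested procedure: "datavg" is a
hypothetical volume group name, the nice increment of 10 is an
arbitrary choice, and the "run" helper is invented here so that the
script only prints the commands unless RUN=1 is set. Note that nice
lowers CPU scheduling priority only, so how much it eases the disk
I/O load is uncertain.

```shell
#!/bin/sh
# Deferred resync sketch. Prints the commands by default; set RUN=1
# in the environment to actually execute them.
# "datavg" is a hypothetical volume group name.
VG=${VG:-datavg}

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

deferred_resync() {
    # Vary on the volume group without starting the sync process.
    run varyonvg -n "$VG"
    # Later, resync stale partitions at a lower scheduling priority.
    run nice -n 10 syncvg -v "$VG"
}

deferred_resync
```

In practice the two steps would usually be run at different times:
the varyonvg when the drive is returned, and the syncvg during a
quiet period.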
---------- ---------- ---------- --------- ---------- ----------
This item was created from library item Q574480 BCXLP
Additional search words:
BCXLP COPICS DASD DISC DISKETTE DRUM FAILURE HARDWARE IX LVM MAR92
RE RESYNC RISCSYSTEM RISCSYSU STALE SYNC SYNCHRONI SYNCHRONIZIN
SYNCVG SYS SYSTEM UNIT VARYONVG
WWQA: ITEM: RTA000022527 ITEM: RTA000022527
Dated: 07/1996 Category: RISCMGMT
This HTML file was generated 99/06/24~12:43:08