Date: February 4, 2005
This tip offers several AIX techniques to minimize the downtime required to move large data sets between storage devices. It originated from a customer situation where an extended outage (over 24 hours) was needed to move a terabyte of data from one storage device to another using a single "cp -R" command.
Here are a few of my suggestions to improve their copy time in the future. In order of preference:
SAN/Application Tools: My first choice for copying bulk data is to use SAN or application specific tools when available. In general, these tools do a better job of preserving data integrity, have better performance and can reduce the load on the server. However, this is an AIX tip, so here are some AIX specific techniques.
Multiple Copies in Parallel: Instead of one large copy, split the work into several copies that run concurrently. Running two copies in parallel might cut the copy time in half; running three might reduce it to one third, and so on. For example, instead of running a single copy
# cp -pR /source/* /destination
run multiple copies concurrently (in separate sessions, or in the background):
# cp -pR /source/dir_a/* /destination/dir_a
# cp -pR /source/dir_b/* /destination/dir_b
There are a few limitations with this technique. First, it does not work well if the data is unevenly distributed through the file system. Second, it requires the application to be stopped (or run in "read-only" mode) for the duration of the copy. Finally, multiple parallel copies will degrade performance if the data resides on the same physical disk. A small script to fan out the copies is sketched below.
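As a rough sketch (assuming the top-level directories under /source split the data reasonably evenly), a small ksh loop can start one background copy per directory and wait for all of them to finish:

# for dir in /source/*
> do
>     cp -pR "$dir" /destination &    # one background copy per directory
> done
# wait                                # block until every copy has finished

The "&" puts each copy in the background, and "wait" holds the shell until the last one completes.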
Migrate: The AIX "migratepv" or "replacepv" commands can move physical partitions from one device to another in the background while production is running. This has the additional benefit of leaving the logical location of the data unchanged, unlike the copy approach, which moves the data to a new destination path. For example, the following command moves the contents of hdisk1 to hdisk5 and hdisk6.
# migratepv hdisk1 hdisk5 hdisk6
Alternatively, “migratepv” can move individual logical volumes. The following command moves “datalv” to hdisk4.
# migratepv -l datalv hdisk4
This technique requires the source and destination hdisks to reside in the same volume group.
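Since the target disks must already belong to the volume group, a typical end-to-end sequence might look like this (the volume group name "datavg" is illustrative):

# extendvg datavg hdisk5 hdisk6     # add the new disks to the volume group
# migratepv hdisk1 hdisk5 hdisk6    # move all partitions off hdisk1
# reducevg datavg hdisk1            # remove the emptied disk from the volume group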
Mirror: Another online option is to mirror the logical volume onto the new disk and then remove the original copy. The following example creates a mirror copy of "datalv" from hdisk1 onto hdisk2, synchronizes the mirror, then removes the original copy.
# mklvcopy datalv 2 hdisk2
# syncvg -P 6 -l datalv
# lslv datalv    # verify the sync is complete: no "stale" partitions
...although not required, I recommend stopping the application at this point, to be safe...
# rmlvcopy datalv 1 hdisk1
As with "migratepv", this technique requires the source and destination hdisks to reside in the same volume group. It will also take longer to synchronize than a straight copy, since all PPs are copied (including empty PPs). A scripted way to wait for the synchronization to finish is sketched below.
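If you would rather not poll "lslv" by hand, a loop like this one (a sketch; the field parsing assumes the standard "lslv" output layout) waits until no stale partitions remain:

# until [ "$(lslv datalv | awk '/STALE PPs/ {print $3}')" = "0" ]
> do
>     sleep 60    # check once a minute until the mirror is fully synchronized
> done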
Point in time copy: This technique copies the bulk of the files in the background during production. Later, the application is shut down and only the files that changed need to be copied. For example:
# touch timestamp    # create a file to use as a "timestamp"
# cd /source
# find . -print | cpio -pld /destination
...shut down the application, then copy only the changed files...
# find . -newer timestamp -print | cpio -pld /destination
A better tool is the open source "rsync" command, which can be found on the AIX Toolbox for Linux Applications CD. The syntax is:
# rsync -av /source/foo /destination
...shut down the application, then copy only the changed files...
# rsync -av /source/foo /destination
(The “rsync” command has the additional benefit of being able to copy between servers across a network, using a secured connection, and with compression to reduce traffic.)
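For example, a remote copy over ssh with compression might look like this (the host name and account are illustrative):

# rsync -avz -e ssh /source/foo user@remotehost:/destination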
If you're worried about copying "open" files, you can stop the application, take a snapshot of the JFS2 file system using the "backsnap" command, resume the application, and copy the snapshot.
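As a rough sketch, the related "snapshot" command can create the point-in-time image by itself, which you then mount read-only and copy (the snapshot device /dev/fslv00 and the size are illustrative; use the device name that "snapshot" reports):

# snapshot -o snapfrom=/source -o size=2G            # create a JFS2 snapshot of /source
# mkdir /snapmnt
# mount -v jfs2 -o snapshot /dev/fslv00 /snapmnt     # mount the snapshot read-only
# rsync -av /snapmnt/ /destination                   # copy the frozen image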
This technique doesn't work well if the changed files are large, simply because the final pass still has to copy those large files while the application is down.
Summary: Your choice of technique depends on your situation. In real life, you will likely use some combination of the above techniques, depending on the configuration and the objective of the copy. A couple of final considerations:
As always, no "TIP'ing" (Testing In Production): test thoroughly before implementing in production.
To improve copy performance, be sure to tune the read-ahead tunables of the "ioo" command. There are two: "maxpgahead" for JFS and "j2_maxPageReadAhead" for JFS2. For example:
# ioo -o maxpgahead=32
# ioo -o j2_maxPageReadAhead=128
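Before changing anything, you can list the current values as a sanity check:

# ioo -a | grep -i ahead    # show the current read-ahead settings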