Disk I/O pacing


rondebbs (MIS)
Dec 28, 2005
We have an online system running a Progress database with several hundred users on our p670 running AIX 5.2. We are migrating large file systems from an EMC CX700 array to an EMC DMX1000 array. Whenever we do these copies using cp -p -R, our online system nearly stops: all users call saying they are down or hung. I am looking at using the I/O pacing option under "smit chgsys". On that screen there is a high water mark and a low water mark to set so that the copies will not saturate the system.

Is anyone familiar with these parameters? I was told to set the high water mark to 513 and the low water mark to 256. This is a crucial production system, so I need to be careful.
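
For reference, the high/low water marks on the smit chgsys screen correspond to the maxpout/minpout attributes on sys0, so (assuming standard AIX 5.2 tunables) something like this should display and set them:

lsattr -El sys0 -a maxpout -a minpout        (show the current water marks)
chdev -l sys0 -a maxpout=513 -a minpout=256  (takes effect immediately)

Note that maxpout is expected to be one more than a multiple of 4 (513 = 4*128 + 1).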
 
I have seen a similar system hang caused by doing copies, but was never able to conclusively determine how to fix it. BUT...

If you are just migrating from one SAN system to another, why do any of it offline? Everything can be done online without the users ever knowing. If you have two (or more) paths to one SAN, drop one path and move it to the other SAN system. Pick up the disks from the new SAN system and 'mirrorvg -S' to "migrate" the data. After all LVs are synced, unmirrorvg from the old disks, drop the "old" path, and swing that path to the new SAN as the redundant path. Using this high-level procedure, we've never taken outages (in AIX at least) to swap backend storage subsystems.
 
Interesting suggestion relating to mirrorvg. I am new to UNIX and have not used this command yet.

I have 4 HBA cards in the host - 2 HBA cards going to each array. I'm using EMC PowerPath and have 4 paths to the CX700 and 4 paths to the DMX. The CX700 is connected to the host with 2 Brocade switches. The DMX is connected using 2 Cisco MDS switches.

When you say "drop one path and move it to the other SAN system" - I'm not sure how to do that. Can you give more info? This sounds like it could be a better process than using cp to move all the data.

Thanks - Brad
 
You might try altering the ioo/vmtune setting sync_release_ilock. The default is 0, but you can set it to 1 so that the inode lock is not held on the files while they are being copied.
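
If you want to try that, something along these lines should show and change it (the change is dynamic; add -p on ioo to persist across reboots):

ioo -a | grep sync_release_ilock   (show the current value)
ioo -o sync_release_ilock=1        (flush files on sync without holding the inode lock)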

But look into the online migration (it's all done with mirrors!) too.


HTH,

p5wizard
 
If you have 2 paths to each SAN, then you're in much better shape than doing the migration with just 2 paths total. You will need an equivalent number of disks (assuming a constant disk size, say 18 GB) down the new SAN paths that you have down the old.

Old Path: hdiskpower1 ... 3
New Path: hdiskpower4 ... 6

extendvg myvg hdiskpower4 hdiskpower5 hdiskpower6
mirrorvg -S myvg hdiskpower4 hdiskpower5 hdiskpower6
[wait until mirrors sync]
unmirrorvg myvg hdiskpower1 hdiskpower2 hdiskpower3
reducevg myvg hdiskpower1 hdiskpower2 hdiskpower3
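
While you wait at the [wait until mirrors sync] step, something like this should show the progress (assuming standard LVM commands; the stale PP count drops to 0 when the mirrors are in sync):

lsvg myvg | grep -i stale   (STALE PPs falls to 0 as the sync completes)
lsvg -l myvg                (LVs show .../syncd instead of .../stale when done)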

After all the data has been migrated, drop the old paths. Since you're using EMC, you'll need to look at 'powermt display paths' to determine which cards are the old and new paths.

So, let's assume fscsi0 [dev=0] and fscsi1 [1] are the old paths and fscsi2 [2] and fscsi3 [3] are the new paths:

powermt remove hba=0
powermt remove hba=1
rmdev -Rdl fcs0
rmdev -Rdl fcs1

Drop the fibre then remove the cards.
 
Excellent. I have tested out the extendvg, mirrorvg, unmirrorvg, reducevg process and everything works well in my dev environment. My LUNs on the CX are 11 GB, but they are only 5.5 GB on the DMX (the DMX bin file was created by EMC). This does not seem to be a problem; I just need to add twice as many hdiskpower devices when I do the extendvg.

This process is great because I never need to unmount any file systems. I'm a little concerned about resources when I do this on my crucial production system. Hopefully extendvg, mirrorvg, etc. won't kill my production application. Any thoughts on that?
 
Nothing to worry about when doing extendvg/mirrorvg/unmirrorvg/reducevg on production data, but you might want to initiate the mirrorvg at off-peak hours, because the synchronize process uses resources otherwise available for normal disk I/O.

Also, you might run into the maximum number of physical volumes allowed in a volume group (it depends on what type of VG you have). If so, you may need to look into the LUN size again - see if you or EMC can define some volumes the same size as on your old box. And you may still need to go step by step (extendvg some PVs, mirror the LVs onto these PVs, unmirror these LVs, reducevg the emptied PVs, extendvg some more PVs, mirror, unmirror, reducevg, ...), as sketched below.
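
In command form, one pass of that step-by-step loop might look like this (the VG, LV, and hdiskpower names are hypothetical; check the VG's PV limit first):

lsvg myvg | grep "MAX PVs"                 (maximum PVs this VG type allows)
extendvg myvg hdiskpower4 hdiskpower5      (add a batch of new PVs)
mklvcopy lv01 2 hdiskpower4 hdiskpower5    (mirror one LV onto the new PVs)
syncvg -l lv01                             (synchronize the new copy)
rmlvcopy lv01 1 hdiskpower1                (drop the copy on the old PV)
reducevg myvg hdiskpower1                  (remove the emptied old PV)

Then repeat with the next LVs and the next batch of PVs until the whole VG has moved.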


HTH,

p5wizard
 
This mirrorvg process has really worked out well. It seems to take some resources, but not too many: the users are able to continue working without noticing any slowdown. I do see some additional I/O and a little drag on my CPUs.

Until now I have done my easiest volume groups. They are easy because in many cases there are only one or two logical volumes in the VG, and there is very little activity on these VGs. Most of these smaller VGs can be mirrored in about 10 minutes.

I am now getting ready to do my most crucial VGs. Each VG will have multiple LVs and multiple file systems, and each file system will have somewhat heavy activity (24/7) when I mirrorvg the VGs. My largest VG is 100 GB and has about 12 LVs and file systems on it; it will take over an hour to migrate. If users start complaining about response time I will likely need to kill the mirrorvg process right in the middle. As you suggested, I will stay away from our busiest times.

I'm not thrilled with the idea of killing the mirrorvg process in the middle, but I will have no choice if the system bogs down too much. Do you see any problem with this? I'm guessing the original LVs on my old array will be intact after the kill, but any mirroring to the new array will obviously not be complete. I can live with this as long as the original VG, LVs, etc. do not get corrupted.

Thanks - Brad
 
Why not use migratepv? I used this recently on a prod system with over 1,000 active users and could migrate 100 GB LUNs in 30-40 minutes.
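
For reference, migratepv moves the contents of one PV onto another within the same VG, either everything at once or one LV at a time (hypothetical disk and LV names; the target must already be in the VG):

extendvg myvg hdiskpower4                  (target PV must be in the same VG)
migratepv hdiskpower1 hdiskpower4          (move every LV off the old PV)
migratepv -l lv01 hdiskpower1 hdiskpower4  (or move a single LV)
reducevg myvg hdiskpower1                  (remove the emptied old PV)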
 
Well, mirrorvg/unmirrorvg and migratepv are essentially the same operation, except that migratepv mirrors/unmirrors one physical partition at a time, whereas with mirrorvg/unmirrorvg you first mirror the whole VG and then unmirror it. As far as I/O and CPU load go, it doesn't make a difference.

Take a look at the man page: you can instruct mirrorvg not to automatically synchronize the new mirror, and afterwards you can syncvg one LV at a time. If users start complaining, you can stop once the LV you are currently syncing is done, then continue some time later. Also, with syncvg you can choose how many PPs to synchronize in parallel, so start with 8 (the default used by mirrorvg) and, if users complain, try the next LV with 4, 2, ...
It may take longer this way, but your users stay happier.

mirrorvg -s <vg_name> <hdiskNN ...>   (-s: create the mirror but don't synchronize yet)
lsvg -l <vg_name>                     (all LVs will show open/stale or closed/stale)
syncvg -P8 -l <lv_name1>              (sync 8 PPs in parallel)
syncvg -P4 -l <lv_name2>
syncvg -P2 -l <lv_name3>

As stated before - read the man pages first.


HTH,

p5wizard
 