MPIO and DS6800/ESS 1

Status
Not open for further replies.
Jul 28, 2004
726
BE
Hi all,


We've got the following problem:

We have two DS6800s attached through SAN switches and Fibre Channel, and on AIX we use MPIO to connect to the disks. We've set up mirroring within AIX (2 copies of every LV, each residing on a different DS). We've come across the following phenomenon:

When one of the mirrors is offline, the machine should normally carry on working on the other copy. Instead, all operations on the disks just hang or time out, lsvg reports that the VG is locked, and some VGs even get corrupted. Has anyone experienced similar behaviour with a DS6800 or ESS? We have a feeling it could be MPIO, but for now we haven't got a clue how to solve it, and the machine was planned to go into production this weekend...

thx in advance

regards

RMGBelgium
 
Hello,

You must install the SDDPCM fileset, which replaces the SDD fileset used on AIX 5.1.
This fileset can be downloaded from the IBM site (it is not included on the AIX 5.3 CD-ROM media).

SDDPCM provides the failover and load balancing that SDD provided through vpaths (hdisk devices replace vpath devices for access to the ESS).

At our site, we tested replacing a Fibre Channel card ( fscxx ) online
without stopping the applications
(no varyoffvg and no unmounting of filesystems).

On AIX 5.1 with SDD 5.1 loaded in an ESS environment, the same replacement forced us to halt the applications.



Documentation

Latest Multipath Subsystem Device Driver User's Guide
sc30-496-00 documentation


 
Try setting the fscsi<X> attribute "fc_err_recov" to "fast_fail" (needs a reboot of AIX) and see if it makes a difference.

IMHO, the default 'delayed_fail' produces an extremely long time to time out on a scsi command, because the driver then assumes the SAN and ESS/DS never goes down.
It is not so much an MPIO issue, but a FSCSI device driver issue.

As for SDDPCM, this is not strictly necessary, because AIX has a native PathControlModule. However the SDDPCM for ESS/DS may be more suited for ESS/DS LUNs. The AIX PCM will probably show the LUNs as "MPIO other FC"
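
Before changing anything, it may help to see what fc_err_recov is currently set to. A minimal sketch, assuming that "lsattr -E -l fscsiX" prints one attribute per line with the attribute name in the first column and its value in the second. The helper name fc_err_recov_value and the sample text are made up for illustration and are not captured from a real system.

```shell
# Print the fc_err_recov value found in "lsattr -E -l fscsiX" output read on stdin.
fc_err_recov_value() {
    awk '$1 == "fc_err_recov" { print $2 }'
}

# Illustrative sample only (assumed layout: attribute, value, description, user-settable).
sample='dyntrk       no           Dynamic Tracking of FC Devices        True
fc_err_recov delayed_fail FC Fabric Event Error RECOVERY Policy True'

printf '%s\n' "$sample" | fc_err_recov_value
```

On a live AIX box the same function would be fed from the real command, e.g. lsattr -E -l fscsi0 | fc_err_recov_value, once per fscsi device.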



HTH,

p5wizard
 
Also, setting the "dyntrk" (dynamic tracking) attribute to "yes" may help in recovering from major SAN changes (I guess you removed the LUN assignments on one DS to simulate a DS-down disaster).

chdev -l fscsiX -a dyntrk=yes -P

again: reboot necessary.

- or -

First unconfigure all FC disks (Available -> Defined), then change the FC SCSI protocol devices, and then configure the SAN disks again (make them Available again):

rmdev -l hdiskX (for every "Available" SAN disk)
chdev -l fscsiX -a dyntrk=yes
chdev -l fscsiX -a fc_err_recov=fast_fail
mkdev -l hdiskX (for every "Defined" SAN disk)

fc_err_recov: see prev. post - only if you have multiple paths per SAN disk.
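
The no-reboot sequence above can be sketched as a dry run that only prints the commands in order, so the list can be reviewed before anything is executed. The function name plan_fscsi_change and the device names fscsi0, fscsi1, hdisk2 and hdisk3 are hypothetical. As far as I know, rmdev will refuse to unconfigure a disk that is still in use by a varied-on volume group, so treat this as a planning aid and not a tested procedure.

```shell
# Print (do not run) the unconfigure / change / reconfigure sequence.
# $1: space-separated fscsi devices, $2: space-separated SAN hdisks.
plan_fscsi_change() {
    adapters="$1"
    disks="$2"
    # Step 1: move every SAN disk from Available to Defined.
    for d in $disks; do
        echo "rmdev -l $d"
    done
    # Step 2: change both attributes on each FC SCSI protocol device.
    for a in $adapters; do
        echo "chdev -l $a -a dyntrk=yes"
        echo "chdev -l $a -a fc_err_recov=fast_fail"
    done
    # Step 3: make the SAN disks Available again.
    for d in $disks; do
        echo "mkdev -l $d"
    done
}

plan_fscsi_change "fscsi0 fscsi1" "hdisk2 hdisk3"
```

Printing instead of executing keeps the ordering visible: all disks go to Defined before any fscsi attribute is touched, and they only come back afterwards.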


HTH,

p5wizard
 
Hello p5wizard,

Thx a lot for the post. I've changed the settings for the fscsi devices and all seems in order now! Thx a lot for your help!

regards,

RMGBelgium
 
You might also want to check that the policy for your MPIO disks is set to round-robin. I don't use MPIO (yet) only SDD, but see if
lsattr -E -l hdiskX
gives you any idea.

I've read that the default policy for AIXPCM is "fail_over", meaning that all I/O for a LUN goes through one FC adapter until the active path fails and then AIXPCM will select a different path.
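
To pull the path policy out of the lsattr listing, something like the sketch below may do. It assumes the MPIO policy is exposed as the hdisk attribute "algorithm" and the reservation setting as "reserve_policy"; the helper name mpio_policy and the sample text are illustrative, not output from a real system.

```shell
# Print algorithm and reserve_policy from "lsattr -E -l hdiskX" output read on stdin.
mpio_policy() {
    awk '$1 == "algorithm" || $1 == "reserve_policy" { print $1 "=" $2 }'
}

# Illustrative sample only (assumed layout: attribute, value, description, user-settable).
sample='algorithm      fail_over   Algorithm      True
queue_depth    8           Queue DEPTH    True
reserve_policy single_path Reserve Policy True'

printf '%s\n' "$sample" | mpio_policy
```

On a real system this would be run as lsattr -E -l hdiskX | mpio_policy. Both values are shown together because, as far as I understand it, a round-robin algorithm generally also needs a non-reserving reserve_policy, so check the documentation for your PCM before changing either.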

As stated earlier, the SDDPCM may be more suitable because it is specifically written for ESS/DS type LUNs.

AIXPCM is there so that MPIO will work with "any" FC or SCSI storage box which is connected thru 2 or more paths.


HTH,

p5wizard
 
Really not my area but it sounds a bit like a quorum problem.
 