I have a P570 with 2 FC adapters, each connected to its own McData 4500 switch, which is in turn connected to a SAN controller.
HBA0 --> switch 1, port 0 --> controller A
HBA1 --> switch 2, port 0 --> controller B
I have a couple of LUNs, hdisk3 and hdisk4, both about 10 GB.
When I assign the logical volumes (ownership/preferred path) to controller A, my system performance tanks. If you run 'iostat 2', you will consistently see %tm_act at or near 100% with no disk activity. When I assign the LVs to controller B, there are no disk issues.
I looked at the switch error logs and there is nothing about errors on any of the ports, nor are there any errors on the SAN or in AIX. With that said, I have to feel that something is definitely wrong with the path to controller A.
Here's my list of physical things that can be wrong:
1. Bad HBA
2. Bad Cable between server and switch
3. Bad SFP for server in switch
4. Bad SFP for SAN in switch
5. Bad cable between switch and SAN
6. Bad SFP in SAN
7. Bad controller in SAN
8. Bad switch
So far, I have replaced 2, 3 and 5 with no change: things are bad on the controller A path and good on the controller B path.
This morning I moved the SAN controller A connection to a different port in the switch. I re-ran 'cfgmgr -v' and waited (a very, very long time). The LCD screen on the P570 displayed '0538' and cfgmgr was stuck at '/usr/lib/methods/fdarcfgrule'.
After 10 minutes, it finished. I checked the error logs and found:
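To keep track of what is still on the table, here is a small Python sketch of the elimination so far. The dictionary just mirrors my numbered list above, and the names `SUSPECTS`, `REPLACED` and `remaining` are made up for illustration.

```python
# Numbered suspects, copied from the list above.
SUSPECTS = {
    1: "Bad HBA",
    2: "Bad cable between server and switch",
    3: "Bad SFP for server in switch",
    4: "Bad SFP for SAN in switch",
    5: "Bad cable between switch and SAN",
    6: "Bad SFP in SAN",
    7: "Bad controller in SAN",
    8: "Bad switch",
}

# Items already replaced with no change in behaviour.
REPLACED = {2, 3, 5}


def remaining(suspects, replaced):
    """Return the suspects that have not been swapped out yet."""
    return {n: desc for n, desc in suspects.items() if n not in replaced}


for number, description in remaining(SUSPECTS, REPLACED).items():
    print(number, description)
```

That leaves 1, 4, 6, 7 and 8 as untested.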
B8FBD189 0228065708 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065708 T H fcs0 ADAPTER ERROR
D9770360 0228065708 P H dac0 ARRAY OPERATION ERROR
B8FBD189 0228065708 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065708 T H fcs0 ADAPTER ERROR
B8FBD189 0228065708 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065708 T H fcs0 ADAPTER ERROR
B8FBD189 0228065608 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065608 T H fcs0 ADAPTER ERROR
B8FBD189 0228065608 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065608 T H fcs0 ADAPTER ERROR
B8FBD189 0228065508 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065508 T H fcs0 ADAPTER ERROR
825849BF 0228065508 T H fcs0 ADAPTER ERROR
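To make the pattern in that log easier to see, here is a minimal Python sketch that tallies the entries per resource and per minute. It assumes the usual errpt summary columns (identifier, timestamp as MMDDhhmmYY, type, class, resource name, description). The function and variable names are my own, not part of any AIX tool.

```python
from collections import Counter
from datetime import datetime

# errpt summary lines copied from the post above.
ERRPT_LINES = """\
B8FBD189 0228065708 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065708 T H fcs0 ADAPTER ERROR
D9770360 0228065708 P H dac0 ARRAY OPERATION ERROR
B8FBD189 0228065708 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065708 T H fcs0 ADAPTER ERROR
B8FBD189 0228065708 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065708 T H fcs0 ADAPTER ERROR
B8FBD189 0228065608 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065608 T H fcs0 ADAPTER ERROR
B8FBD189 0228065608 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065608 T H fcs0 ADAPTER ERROR
B8FBD189 0228065508 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0228065508 T H fcs0 ADAPTER ERROR
825849BF 0228065508 T H fcs0 ADAPTER ERROR
""".splitlines()


def parse_errpt_line(line):
    """Split one errpt summary line into its six columns."""
    ident, stamp, etype, eclass, resource, description = line.split(None, 5)
    return {
        "id": ident,
        # Assumed timestamp layout: month, day, hour, minute, two-digit year.
        "when": datetime.strptime(stamp, "%m%d%H%M%y"),
        "type": etype,
        "class": eclass,
        "resource": resource,
        "description": description,
    }


entries = [parse_errpt_line(line) for line in ERRPT_LINES if line.strip()]
by_resource = Counter(e["resource"] for e in entries)
by_minute = Counter(e["when"].strftime("%H:%M") for e in entries)

for resource, count in by_resource.most_common():
    print(resource, count)
```

For the log above this gives 7 entries for fcs0, 6 for fscsi0 and 1 for dac0, all logged between 06:55 and 06:57.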
So, thinking I had a bad SFP, I swapped it with another one. Same result. I moved the SAN controller connection back to its original port and re-ran 'cfgmgr -v'; it finished in 3 seconds with no errors in AIX.
Can you not swap SFPs like that? I did it for the one that the P570 attaches to and things worked fine, but when I swap either the SFP in the SAN controller or the SFP in the switch, I get errors and AIX (or any other host that uses controller A) cannot see its disks.
Thanks for any help!