All,
I've been seeing some funky results when using a 3rd party SAN with a group of 6H1s and 6M1s. When importing a VG we'll occasionally see import times of 2 to 4 minutes. While this is going on I see port login messages being sent to the console of the Brocade switch the host is talking to. I will also see the other LUNs in this configuration being polled sequentially and not in parallel (importvg will check all other known drives for consistency). When the import command responds quickly, the other LUNs are hit in parallel.
The other issue we have is loss of connectivity to imported drives once we try to do some I/O with them. errpt will show some entries about failed drives. While this is going on, we'll see 'port turned off' and 'loop down' messages logged on the RAID controller.
The fun part is that we have not been able to isolate where the problem lies. We've tried various tests like connecting a host directly to the RAID controller, and monkeying around with LUN IDs, settings for the HBA, etc., with no luck. The problems don't occur consistently enough to point to any single area as the culprit.
Our basic config is:
6 AIX hosts
2 Brocade 3800s
2 CMD RAID controller pairs with two physical fibre ports per pair
1 disk chassis, 3 RAID groups per chassis
62 non-masked LUNs, i.e. all LUNs visible to all hosts
If anyone has seen similar behaviour and/or has suggestions it would be greatly appreciated!