Sun's new 3310 hardware RAID array can be configured with dual RAID controllers and redundant SCSI channels between the server and the 3310. When combined with Veritas VxVM/DMP host software, it provides automatic (transparent) LUN failover.
The 3310 Install Guide does not provide adequate information on how to configure the 3310, SCSI cabling and host software (VxVM) to achieve redundant path load-balancing and fail-over. Here is how I did it.
The config:
V480 Server
X6758A USCSI-160 HBA (2)
Solaris 8 2/02
VxVM 3.5
SE3310 Dual Controller Array (XTA3310R01A2R436)
Ken Wachtler
Midwave Corp
Chanhassen MN
~~~~~~~~~~~~~~~~~~~~~~~~~~
I. MULTI-PATH SETUP
-------------------
1. Cabled up for 2 SCSI channels between a single 3310-2Raid and a single host, as per Installation Guide pg 5-9. Note that the Install Guide does NOT show a single host with 2 SCSI channels as we have.
2. Assign both controllers to both host SCSI channels. This is an undocumented step (and unknown to Sun Tech Support). There is a hint provided in the Install Guide, page 6-10 "Creating Additional Host ID's".
Before:
Channel 1, ID 0 (primary controller)
Channel 3, ID 1 (secondary controller)
After:
Channel 1, ID 0 & 1 (primary & secondary controller)
Channel 3, ID 1 & 0 (secondary & primary controller)
3. Create a Logical Disk, as per Install Guide.
4. Assign Logical disk to Primary or Secondary controller, as per Install Guide.
5. Map the Logical disk to host LUNs, one for each host channel, as per Install Guide pg 6-27. The Install Guide notes "redundant path environments":
Install Guide Note:
"The same partition might be mapped to multiple LUNs on multiple
host channels. This feature is necessary for clustered environments
and redundant path environments."
6. Make VxVM aware of the 3310 Array multipathing characteristics, as per Installation Guide pg 6-32.
vxddladm addjbod vid=SUN pid="StorEdge 3310"
7. Run the standard Solaris LUN discovery steps, then create and mount a filesystem on c2t0d0.
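The host-side part of steps 6 and 7 can be sketched as below. This is a minimal sketch, not the exact sequence I ran: the function name and the RUN dry-run switch are made up for illustration, and the choice of devfsadm and vxdctl as the "standard discovery steps" is an assumption. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run by default: with RUN unset, each command is printed via echo.
# On the real host, set RUN to the empty string (RUN= ) to execute them.
RUN=${RUN-echo}

# Hypothetical helper: rescan for the new LUN and show what VxVM/DMP sees.
rescan_and_show_paths() {
    $RUN devfsadm -c disk        # rebuild the /dev/dsk and /dev/rdsk links
    $RUN vxdctl enable           # make VxVM rescan devices and rebuild DMP
    $RUN vxddladm listjbod       # confirm the SUN / StorEdge 3310 JBOD entry
    $RUN vxdisk list             # expect one disk entry for the LUN, not one per path
    $RUN vxdmpadm listctlr all   # expect both HBA controllers to be listed
}
```

Calling rescan_and_show_paths with RUN unset lists the five commands, so the sequence can be reviewed before anything touches the host.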
II. MULTIPATH TESTING
--------------------
1. My script tars up /usr/share/man, then writes the archive to a filesystem on the 3310 LUN, repeating indefinitely with a unique filename each time.
2. "iostat -xcn" showed two channels actively load balancing:
c2t0d0 20mb/s
c4t0d0 20mb/s
3. "Cable pull" test caused all I/O to pause, then after 1 minute it resumed, but only in short bursts, with 1 minute between bursts. Finally the Vx disk groups went offline, and a reboot was needed to clear it. My conclusion is that a SCSI cable pull is NOT tolerable during high I/O.
4. Restarted tests, then performed a "manual" controller disable (through 3310 firmware).
5. "iostat -xcn" shows that I/O resumes after 1 minute, using the surviving path:
c2t0d0 0mb/s
c4t0d0 40mb/s
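For anyone who wants to repeat the test, the load script from step 1 was along these lines. This is a reconstruction, not the original script: the function name, the filename pattern and the optional pass count are all illustrative.

```shell
#!/bin/sh
# Hypothetical reconstruction of the write-load script.
#   $1 = directory to archive (I used /usr/share/man)
#   $2 = directory on the filesystem that lives on the 3310 LUN
#   $3 = number of passes; 0 or omitted means repeat until interrupted
write_load() {
    src=$1
    dest=$2
    count=${3:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        i=`expr "$i" + 1`
        # shell PID plus pass number gives a unique filename on each pass
        tar cf "$dest/man.$$.$i.tar" "$src" || return 1
    done
}

# Example (the mount point is an assumption):
#   write_load /usr/share/man /mnt/3310
```

The function returns non-zero as soon as a tar write fails, which makes it easy to see the moment the filesystem stops accepting I/O.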
III. MULTIPATH TEST INTERPRETATION
----------------------------------
Initially I thought that both 3310 controllers were writing to the same Logical disk at the same time (#2), and that when a controller was disabled, the surviving controller took all the I/O (#5).
After further thought, I believe that c2t0d0 and c4t0d0 represent two front-end ports of the SAME controller, allowing DMP to load balance active-active between ports of the same controller (#2). When the active controller for the Logical disk is disabled, the surviving controller assumes the ID (the t#) of the failed controller, in addition to its own t#. I/O automatically resumes using a single port on the surviving controller (#5). Unfortunately, the 3310 multipath load balancing and fail-over are not documented anywhere.
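The per-path figures in (#2) and (#5) can be pulled out of the iostat output with a small filter like the one below. It is a sketch that assumes the usual Solaris "iostat -xn" column layout (kr/s and kw/s in fields 3 and 4, device name in field 11); the function name is made up.

```shell
#!/bin/sh
# Hypothetical filter: read "iostat -xn" style output on stdin and print
# "device MB/s" for each cXtYdZ line, adding read and write throughput.
# Header and CPU lines are skipped because they do not match the pattern.
path_mb() {
    awk 'NF == 11 && $11 ~ /^c[0-9]+t[0-9]+d[0-9]+$/ {
        printf "%s %.1f\n", $11, ($3 + $4) / 1024
    }'
}

# Example:
#   iostat -xcn 5 | path_mb
```

Watching this during a controller disable shows one path dropping to zero while the other picks up the load.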
END