
Clariion disk rebuilds then fails; hot spare shows faulted

Status
Not open for further replies.

mjb23

Vendor
Jan 15, 2004
23
0
0
US
Hello,
I have replaced the same disk in a Clariion twice. Each time I replace the disk it rebuilds and then faults. I have gone through two disks and I will put in a third soon. The hot spare is also showing a fault but does not have the amber LED lit on the enclosure or disk.
My questions are:

Is there something else that needs to be done after replacing a disk? I never had to do anything else before.

Why would the hot spare show a fault in the Storage tab of Navisphere but not on the enclosure or disk LED on the box?

Shouldn't the disk rebuild from the parity group if the HS has a fault?

Any ideas on what this could be other than multiple hardware failures?

Any suggestions, other than calling EMC? That will not happen.
 
You don't say what kind of Clariion, but here are some thoughts. As odd as it sounds, a power supply can cause a disk to indicate a fault. I forget which is which, but one supply feeds the odd drives and the other the even. This is on the older fiber boxes, not the CX series.
It is also possible that an upstream disk is causing the problem. I had a drive 3 fault keep coming back until I replaced drive 5. Path flow is odd drives first, then even drives: 9-7-5-3-1, next chassis, all the way around the loop, and then 8-6 ... etc. As for the hot spare, I know of some code issues: if you're on a low FLARE rev, some disks have difficulty with the hot spare. I'm assuming that the hot spare is of adequate size. You are correct that the Clariion should rebuild from the RAID 5 stripe. You can pull the hot spare and make that happen. Good luck.
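
To make the "upstream disk" idea concrete, here is a rough Python sketch of the path order described above, which lists the drives that sit ahead of a suspect drive on the loop. It is only an illustration: the function names are made up, the 10 slots per enclosure (0-9) is an assumption, and so is the idea that the even pass visits the enclosures in the same order as the odd pass and runs 8-6-4-2-0. Check the real loop layout for your hardware before acting on it.

```python
def loop_order(enclosures, slots_per_enclosure=10):
    """Return (enclosure, slot) pairs in the assumed back-end path order.

    Assumption: odd slots are visited first, highest to lowest, enclosure
    by enclosure; then even slots, highest to lowest, in the same
    enclosure order. Only the odd pass (9-7-5-3-1) is stated in the post.
    """
    descending = range(slots_per_enclosure - 1, -1, -1)
    odd = [s for s in descending if s % 2 == 1]
    even = [s for s in descending if s % 2 == 0]
    order = [(enc, slot) for enc in enclosures for slot in odd]
    order += [(enc, slot) for enc in enclosures for slot in even]
    return order


def upstream_of(disk, enclosures, slots_per_enclosure=10):
    """List every drive that comes before `disk` in the assumed path order."""
    order = loop_order(enclosures, slots_per_enclosure)
    return order[:order.index(disk)]


if __name__ == "__main__":
    # Single enclosure: drives ahead of slot 3 are 9, 7 and 5,
    # which matches the "drive 3 fault cleared by replacing drive 5" case.
    for enc, slot in upstream_of((0, 3), enclosures=[0]):
        print(f"enclosure {enc} slot {slot}")
```

Any drive in that list is a candidate to swap or reseat if the same slot keeps faulting after a replacement.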
 
More info:
Model - FC4700
Running Navisphere 6.5
Failed disk - Bus 0 Enc 1 Disk 2
Hot spare location - Bus 0 Enc 3 Disk 9
LUN 7 has faulted on SPB and is part of RAID Group 5.
Previous problems with rebuilding from hot spares on this system have occurred.

New info:
Bus 0 Enc 1 Disk 2 shows a current state of Enabled but appears to have no activity.
Now waiting for the Transition state to clear from the hot spare.
 
So, it appears that disk 0,1,2 is part of LUN 7, if I'm interpreting things correctly.
Since you are not seeing any activity on the new drive, it is probably in a hung state, stuck in transition.
If the hot spare is still acting for the drive in slot 2, you should see activity on it that coincides with the rest of the stripe. You can verify that the hot spare is engaged by doing a disk summary: right click on the top of the Navi tree in the Storage tab and select Disk Summary. If it were my box, I'd yank the hot spare and see if the drive in slot 2 starts rebuilding. There's a lot of real estate between the failed drive and the hot spare: LCCs and other disks that could be marginal. EMC has a program that can check the back-end bus for those hidden errors. You may even have some obvious clues to the problem in the SP error log, such as a lot of fiber up/down or node fault entries. I'd suggest looking at that: right click on the SP that owns the stripe and look at the event log.
 