Raid 5 Failure

Status
Not open for further replies.

dbomrrsm

Programmer
Feb 20, 2004
1,709
GB
This may be the wrong forum, and apologies if it is.

I have a RAID 5 set up and one of the disks has failed. Does anyone know of the best software to use to:

1 Find out which of the three disks has failed

2 Repair the problem when the damaged disk is replaced

Any help is appreciated

DBomrrsm
 
1. Why use software? Just look at the lights on the hard drives. Are they green or amber? If one is amber, then that's the bad one. I check the lights on every hard drive I have on a daily basis, just so I can 'catch' bad hard drives.

2. Repair what problem? You replace the hard drive and the RAID rebuilds itself. That's the purpose/function of RAID 5.
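To illustrate why the rebuild can happen on its own, here is a minimal sketch of the XOR parity idea behind RAID 5. It assumes a three-disk array and a single four-byte stripe; the helper name `xor_blocks` and the sample data are made up for illustration, and a real controller works on much larger blocks and rotates parity across the disks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical three-disk array: two data blocks plus parity.
data_a = b"RAID"
data_b = b"five"
parity = xor_blocks([data_a, data_b])

# Suppose the disk holding data_b fails and is replaced with a blank one.
# The lost block can be recomputed from the two surviving blocks.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b)  # b'five'
```

The same XOR works whichever single disk is lost, which is why a replacement drive can be filled in from the survivors.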

And finally, yes this is an inappropriate forum for your question. I don't know if there's a forum for PC hardware issues, but you could use the forum search to find any.

-SQLBill
 

I agree 110% with SQLBill.

The question comes down to this: how do you know the drive failed without looking at the lights? Via Windows Disk Manager?
 
OK, thanks - sorry, I'm new to this problem.

In Windows Disk Manager I can see the three "disks" that make up the RAID 5.

One is showing as online - the other two are shown as online but with errors - they have a small yellow rectangle with a black exclamation mark in it.

Any advice ?

TIA

DBomrrsm
 
Is your system still functional?

Because in a RAID 5 configuration, if 2 out of the 3 disks are dead, the system should be dead as well. RAID 5 is structured so that if 1 disk fails you should be able to just swap it out, but if 2 disks fail at the same time, I thought you were dead in the water.
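As a rough sketch of that rule, the function below classifies an array by the number of failed member disks. The function name `raid5_state` and the labels "healthy", "degraded" and "failed" are illustrative assumptions, not the wording any particular controller or Disk Manager uses.

```python
def raid5_state(total_disks, failed_disks):
    """Classify a RAID 5 array by how many member disks have failed."""
    if total_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    if not 0 <= failed_disks <= total_disks:
        raise ValueError("failed_disks must be between 0 and total_disks")
    if failed_disks == 0:
        return "healthy"
    if failed_disks == 1:
        return "degraded"  # still readable; rebuild onto a replacement disk
    return "failed"        # two or more lost: parity alone cannot recover the data

for failed in range(3):
    print(failed, raid5_state(3, failed))
```

On a three-disk set this gives "healthy" for no failures, "degraded" for one, and "failed" for two, which matches the "dead in the water" case described above.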

Can you clarify?

Thanks


DBAWinnipeg
 
Haven't got a clue. The three drives appear as one volume in Windows Explorer, and that volume is showing as not there.

In Disk Manager the three separate volumes can be seen, and only one now has a yellow triangle with an exclamation mark in it, although it is still showing as online in the left pane of the manager but failed in the right pane.

The other two are showing as online with no yellow triangle on the left but failed on the right !!!!

I am lost !

DBomrrsm
 
OK, if I'm understanding you correctly, it sounds like 2 of your disks are still OK and 1 is dead.

Do you have a spare disk?

If your RAID 5 is set up properly, you should be able to pull out the bad disk and just swap in a new disk, as long as the new disk is the exact same configuration as the disk you are pulling out.

Hope this helps

DBAWinnipeg
 
Some advice above suggests that if a disk is gone then its light shows as amber?

I have just restarted the server, and as the boot checks all the disks individually, all the green lights for each disk come on and it gets through the boot successfully?

Thanks for your comments, DBAWinnipeg.

DBomrrsm
 
As a precautionary measure I would suggest doing some diagnostics on the disk. 99.9% of the time disks don't show as failed unless there is a real problem.

Hope everything checks out ok for you


DBAWinnipeg
 
Any quick suggestions as to the best diagnostics to do?

Thank god I live in England - it's home time soon - I need to go - one hell of a day !!!

Thanks DBAWinnipeg.

DBomrrsm
 
Your hardware vendor typically has a disk management tool that you can use to better troubleshoot the problem. Dell has the Dell FAST utility or Open Array Manager from Veritas, depending on the storage type and controllers used. HP and Compaq have the Smart Array Manager. I'm not sure about the HP and Compaq tool, but I do know that the Dell FAST utility can sometimes fix the problem, although you would still want to replace the drive. Dell's FAST utility will let you clean orphaned partitions and rebuild the drive. This removes the data from the bad sectors on the disk, which should last until you get a new drive installed. I wouldn't allow any longer than 24 hours before a new drive was in place. Given my choice, I always order servers with gold (4-hour) support, which has paid for itself a number of times.

"Shoot Me! Shoot Me NOW!!!"
- Daffy Duck
 
Lastly, the problem might not be the hard drives. With RAID you have a RAID controller. If that is bad, it will look like one or more drives are bad. I learned the hard way: if you have multiple hard drives go 'bad' at once (this is very rare), first check the RAID controller.

Your hardware vendor can help you with this.

-SQLBill
 
Also, on multi-channel cards, make sure that you are looking at the drives on the proper channel.

For example, a 4-channel PERC (Dell) controller card has 4 external nodes to plug into, but it also has 2 internal nodes. If you have channel 0 assigned to internal drives and someone plugs into the external channel 0 port, then you could be in trouble. The array will probably work until a drive fails, then it will report a failed drive from the wrong array.

I swear I didn't configure the server like this; I inherited it, and it cost me the loss of a 250 GB data warehouse and 7 days of downtime.

"Shoot Me! Shoot Me NOW!!!"
- Daffy Duck
 
Thanks everyone for your input.

It turned out to be a bad disk, and when it was replaced the RAID array rebuilt itself and I am back up and running.

Thanks again.

DBomrrsm
 