Can you help me with a Proliant 8000 and hard disk problems?

Status
Not open for further replies.

RoundEye

IS-IT--Management
Feb 23, 2002
3
US
For the life of me I can't figure out what's wrong with the damn thing!

Proliant 8000

Some background: the server has two 700MHz Xeons, two gigs of ram and twenty-one 18gig SCSI hard drives. It has dual power supplies, sits on a huge ass UPS, and lives in a climate controlled room (not only is there central air, there is a big window A/C too).

At random, a hard drive or two will drop out of the raid; it doesn't matter which bay the drive is in. Another tech and I have been on the phone with Compaq for over a week, and they are stumped too. At first they thought it was the raid card, so we bought a new one, since the warranty expired this past March.

No such luck, same problem. So Friday I got on the phone with Compaq and talked to them about all the problems this server has had over its life span. The hard drives have been failing randomly for over two years now. Now it's becoming a real problem because the server is set up with two raid5's. If two drives drop out of the same array, it can't rebuild the raid properly.
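For anyone wondering why two dropped drives are fatal when one is not, here is a minimal sketch of RAID 5 style XOR parity. The four-drive stripe and the byte values are made up for illustration and have nothing to do with how the controller actually lays out data.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One stripe on a hypothetical four-drive set: three data blocks plus parity.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
parity = xor_blocks(data)

# One drive lost (data[1]): XOR of the survivors and the parity restores it.
rebuilt = xor_blocks([data[0], data[2], parity])

# Two drives lost (data[0] and data[1]): all that can be recovered is the
# XOR of the two missing blocks, not either block on its own.
lost_pair_xor = xor_blocks([data[2], parity])

print("single loss rebuilt correctly:", rebuilt == data[1])
print("double loss only yields d0 ^ d1:", lost_pair_xor == xor_blocks(data[:2]))
```

With one block missing there is one unknown and one parity equation; with two missing there are two unknowns and still only one equation, so the array cannot rebuild.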

After I'd talked to Compaq for a few hours, they decided to send me a new mobo under warranty. I thought that was decent of them, but it didn't fix the problem today.

So far we have tried swapping drive bays around (each one has its own backplane), different cables, a new raid card and a new mobo, and twelve of the drives are brand new.

Each time I use Compaq's Smart Start, I delete the raids, recreate them and start to install the OS through the Smart Start, and then at some point a hard drive or two will fail, randomly. It could be a drive that was good and is now tagged as bad, or it could be a drive that was tagged as bad before I deleted and recreated the raid and is now good.

It doesn't matter which drive, what bay it's in or what cable it's on, they just go Red (an indicator on the drive which means it's defective). One time it could be bay 3, drive 4; next time it could be bay 1, drive 3 and bay 2, drive 8. Totally random.

I've tried setting up a raid in each bay with seven new drives, and each time a drive or two will fail in that bay (I've had as many as five out of seven drives fail in one bay). Never the same drive either (except for two drives which I know are bad, but I don't use those while troubleshooting).

The firmware has been updated on the server, raid card and drives. Could it be possible that all three backplanes on the drive bays have failed?

Any ideas, wild guesses, magic spells or voodoo potions are welcomed at this point. I feel like there's a curse on me. [curse]

Man do I ever hate it when an inanimate object, without a brain, beats me!

Thanks a lot,
Tobey

 
Is it always the same HDDs that fail?
If yes, then you could be looking at a Drive caddy problem.
If no, then it could be a drive cage backplane fault.

I would say that the M/B would not be at fault at all, neither would the raid controller as the other drives come up OK.

You need to isolate a bit more to reduce the options for failure. This will help identify the area of fault.
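One way to do that isolation is to write down every failure and tally it by bay, by slot position and by physical drive. The sketch below is only an illustration: the log entries and serial numbers are hypothetical, so substitute whatever you actually record each time a drive goes red.

```python
from collections import Counter

# Hypothetical failure log: (bay, slot, drive serial), one entry per red light.
failures = [
    (3, 4, "SN-A"),
    (1, 3, "SN-B"),
    (2, 8, "SN-C"),
    (1, 5, "SN-A"),
]

def tally(events):
    """Count failures per bay, per (bay, slot) position and per physical drive."""
    by_bay = Counter(bay for bay, _, _ in events)
    by_position = Counter((bay, slot) for bay, slot, _ in events)
    by_drive = Counter(serial for _, _, serial in events)
    return by_bay, by_position, by_drive

by_bay, by_position, by_drive = tally(failures)
print("per bay:     ", by_bay.most_common())
print("per position:", by_position.most_common())
print("per drive:   ", by_drive.most_common())
```

If the same serials keep turning up wherever you put them, suspect the drive or its caddy. If one bay or slot dominates, suspect that backplane. If the counts stay evenly spread, look at whatever all the bays share.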

Hope this helps
 
Are you using the 4250ES controller?

I had a problem where the drives would start to report bad blocks, eventually locking up the server. It was found that I had two problems: not enough power to the system, and the SA5300 controller.

Once I used the 4250ES my problems went away.
 