
Excessive Hard Drive Failures

Status
Not open for further replies.

fitzdr

Technical User
Aug 2, 2002
39
US
I have over 120 servers, nearly all of which use internal hot-plug storage.

The problem is hard drive failures. I have had 12 hard drive failures on various servers over the last 5 weeks, and 20+ failures since the beginning of the year.

These have been primarily 18 GB Ultra2 and Ultra3 drives. One was a 36 GB Ultra3. The servers and drives vary in age from 6 months to 3 years.

Until about 9 months ago I might have had 1-3 drive failures per year, but this year looks set to break the 30 mark. Power fluctuations are not the problem: we have extensive power conditioners and UPSs, none of which have reported any problems.
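
To put those numbers side by side, here is a rough sketch of the arithmetic in Python. The helper name `annualized` and the choice of a 52-week year are assumptions made for illustration. The total drive count is not known, so this compares raw failure counts rather than a true per-drive failure rate.

```python
WEEKS_PER_YEAR = 52  # assumption: a plain 52-week year


def annualized(failures: int, weeks: float) -> float:
    """Scale a failure count observed over `weeks` to a full year."""
    if weeks <= 0:
        raise ValueError("weeks must be positive")
    return failures * WEEKS_PER_YEAR / weeks


# 12 failures over the last 5 weeks, as reported above.
recent_rate = annualized(12, 5)

# Upper end of the earlier "1-3 failures per year" baseline.
baseline_max = 3

ratio = recent_rate / baseline_max
print(f"{recent_rate:.1f} failures/year at the recent pace, "
      f"{ratio:.1f}x the old worst case")
```

If the last 5 weeks were typical, that pace works out to roughly 125 failures a year, which is why this looks like more than ordinary wear.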

Is anyone else experiencing this problem?
 
You'll probably find that the hard disks have not actually failed. Instead, Insight Manager reports various SCSI errors (such as I/O errors or R/W errors), which cause the server to mark the drive as failed. In some instances, pulling the disk and pushing it back in will reset it (however, I do not usually advise this).

The best thing to do is to ensure you are up to date with firmware on the disks, the SCSI card and the server. If you have a particular server that seems to be doing it more than others, run Compaq Diagnostics on it (found in the F10 partition) to see if any SCSI errors are detected.
-----------------------------------------------------
"It's true, its damn true!"
-----------------------------------------------------
 
Firmware versions are up to date on the drives.

This has happened on 18 different servers ranging in age from 6 months to 3 years.

If I replace the drive, the error counters reset in CIM7 and everything looks great again. However, three of the drives that Compaq/HP has sent over the last 4 weeks have failed within 72 hours of installation. Replacing with a new drive usually does the trick.

(Why is Compaq/HP sending out faulty drives? One drive we received already had over 9000 hours.)
 
To answer your question about the faulty drives, Compaq/HP do not usually send out new drives - they recondition returned drives and recycle them that way. They should reset the error counters and check for errors before they send them out again.

You need to ensure that all firmware is up to date. We had the same scenario, where a failed disk was replaced several times and the replacement disks kept coming up as failed. The fix was to update all firmware on the server, and this resolved the issue.
-----------------------------------------------------
"It's true, its damn true!"
-----------------------------------------------------
 
I've found with some of the older drives that, even if the firmware is up to date, there are problems with 'mixed' drives. That is, drives from different manufacturers within the same array, e.g. WD and Seagate.

It hasn't caused massive problems, but it is an issue. The problem is that all Compaq will supply is a drive with the correct spares number; they won't give you a spare part from a particular manufacturer.
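
If you keep a drive inventory, a check for mixed arrays could look roughly like the sketch below (Python). The server names, array labels and row layout are entirely hypothetical, and the script reads nothing from Insight Manager or the controllers. It only shows the grouping idea.

```python
from collections import defaultdict

# Hypothetical inventory rows: (server, array, bay, manufacturer).
# In practice these would come from your own asset records.
inventory = [
    ("srv01", "array-A", 0, "Seagate"),
    ("srv01", "array-A", 1, "Seagate"),
    ("srv02", "array-A", 0, "WD"),
    ("srv02", "array-A", 1, "Seagate"),
]


def mixed_arrays(rows):
    """Return arrays that contain drives from more than one manufacturer."""
    makers = defaultdict(set)
    for server, array, _bay, maker in rows:
        makers[(server, array)].add(maker)
    return {key: sorted(found) for key, found in makers.items() if len(found) > 1}


for (server, array), found in mixed_arrays(inventory).items():
    print(f"{server} {array}: mixed manufacturers {', '.join(found)}")
```

Comparing that list against the servers with failed drives would show whether the mixed arrays are over-represented.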

Neill
 
On the bottom of the disks, there should be a Compaq spares part number - usually in the format of 123456-001 (for example).

As long as you quote that number, there should not be a mixup.
-----------------------------------------------------
"It's true, its damn true!"
-----------------------------------------------------
 
Had you read NTINLIN's message more carefully, you would have seen that the problem isn't the spare part number. It is that a replacement drive sent against that spare part number could come from any one of several different manufacturers.

If you had read my posting you would have noticed that I stated all firmware was up to date!
 
No need for attitude, fitzdr - we're all friends here!
-----------------------------------------------------
"It's true, its damn true!"
-----------------------------------------------------
 