HP Raid Five Disc Crash and Meaning of Disc Lights Flashing

Status
Not open for further replies.

SpanishWaiter01

Technical User
May 31, 2008
14
GB
Hi Group Members;
Can you help and explain?

I am an HP ProLiant DL380 G3 user, and I have five discs installed.

There are three discs set up in RAID 5 mode and two discs in RAID 0 mode.

The server is using the standard 5i embedded disc array controller.

Two of the three discs in the RAID 5 assembly were showing in the HP System Management Homepage software as being in "predictive failure" and therefore needed replacement. All discs use SMART technology and are hot-swappable.

The System Management Homepage software is HTML/web based but does not show the current status, colour and flashing patterns of the disc lights. All it displays is one of the following statuses for a disc: OK, degraded, failed, or percentage rebuilt.

One disc was removed and replaced by another. In error, a second disc was then swapped while the automatic disc recovery process had not yet finished.

This resulted in a system crash.

Now, turning to my concern: I believe the System Management Homepage should be more descriptive and state disc and array controller activity status more clearly. You cannot see both at the same time.

Secondly, I believe it is a design fault/defect that you are able to remove discs from a server while the RAID assembly is reforming (expanding or rebuilding).

Electrically controlled locking screws should be fitted to the disc housings to prevent any hard disc from being removed from a server when it is not safe to do so.

If you do "need" to remove a disc, surely you would bring the server down gracefully first?

I cannot see why you would "want" to remove a disc when it's not safe to do so, so why does HP let this happen?

To me, it's like a sea-going car ferry setting sail with its bow doors open!

Any constructive comments are welcomed.

Cheers,
Spanish Waiter


 
Once a disk has failed, the RAID 5 array has lost its parity protection, that is, it has lost redundancy. If a further disk is removed while there is no redundancy, you are removing actual data.

Only once the rebuild has completed can you remove another disk from the RAID 5 array.
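
To make the parity point concrete, here is a minimal Python sketch of a single stripe on a three-disk RAID 5 set. It is a simplification: real controllers rotate parity across the disks and work on much larger blocks, and the block contents and the `xor_blocks` helper are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical three-disk RAID 5 set:
# two data blocks plus one parity block.
disk_a = b"ACCOUNTS"
disk_b = b"PAYROLL1"
parity = xor_blocks([disk_a, disk_b])

# Disk A is pulled: disk B and the parity block are enough to rebuild it.
rebuilt_a = xor_blocks([disk_b, parity])
print(rebuilt_a == disk_a)  # True

# If disk B is pulled as well before the rebuild finishes, only the parity
# block is left. With nothing to XOR it against, neither data block can be
# recovered.
```

With one block missing, the other two always give it back. With two missing, the one that is left is not enough, which is why pulling a second disk mid-rebuild takes the array down.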

Automated locking systems are fine, but if one failed and locked in all of your disks you would be equally unhappy. I think you have to correctly identify the disk before removing it; you can't really blame HP for that.

I don't really understand how a disk was removed before the rebuild had finished. Also, RAID 0 is not resilient. It's just a stripe: pull one of those disks and it will just fall over. If you pulled two of your three RAID 5 disks, that array would also fail.

So basically, if you pull one disk out of all the disks you mentioned, your system is now non-resilient; pulling any more disks in your setup before the rebuild has finished will result in a loss of data.

To avoid this in the future, pull only one disk at a time from a resilient array (RAID 5, not RAID 0) and make sure the rebuild has finished before you pull another.
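
The rule above can be written down as a small check. This is only a sketch: `FAULT_TOLERANCE` and `safe_to_pull` are hypothetical names, not part of any HP tool, and the table covers just the two RAID levels in this thread.

```python
# Hypothetical helper: how many member disks each RAID level can lose
# before data is lost.
FAULT_TOLERANCE = {"RAID0": 0, "RAID5": 1}

def safe_to_pull(level, missing_or_rebuilding):
    """Return True if one more disk can be removed without losing data.

    missing_or_rebuilding counts disks that are failed, removed, or
    still being rebuilt, since none of those hold a full copy yet.
    """
    return missing_or_rebuilding < FAULT_TOLERANCE[level]

print(safe_to_pull("RAID5", 0))  # True: healthy array, one disk may be swapped
print(safe_to_pull("RAID5", 1))  # False: rebuild in progress, leave it alone
print(safe_to_pull("RAID0", 0))  # False: a stripe never survives losing a disk
```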

Hope this helps

Cheers
 
Sorry to hear that you've lost data, Spanish Waiter. I know how that can suck. However, I am in agreement with Hondy here.

Not only that, but if a second drive fails during the rebuild process in RAID 5, the data is also toast. You had two drives warning of failure. This is why it is advisable (IMHO) not to use RAID 5 unless you have your data backed up regularly.

RAID is just one layer of security. DR should include several layers, just as parachutists carry a spare chute just in case.

 