16 Drive RAID5 array failing drives monthly...


Aknip (MIS), Apr 12, 2005
I have a server running Windows 2000 Server SP4 with an Intel S7501HG0 motherboard, two 2.8GHz Xeons, 2GB RAM, and two Adaptec 2200S RAID controllers (64MB cache each), set up as follows:

Controller 1: two external enclosures, each with 8 Seagate 68GB U320 drives and each on its own channel. These 16 drives make up one RAID5 array, configured as a single logical drive that holds only the data files for a SQL database. There are hundreds of thousands of files, most under 1MB, on this array.

Controller 2: six 34GB Seagate U320 drives set up in a RAID10, plus one 34GB Seagate U320 standalone drive. The RAID10 is broken into 5 logical drives which hold the OS and the SQL database and logs.

All of the hard drives have the same version of firmware.

The problem I'm having is only with the RAID5 array. Every month or two it marks a hard drive bad. We replace the drive with a new one, it rebuilds, and we're fine again. Are these drives really bad, or is the I/O load so high with that many drives, and the cache so small, that the drives can't keep up and the controller marks one as failed? We have been over this with Adaptec and get conflicting answers. Some people say the number of drives in a RAID5 should not exceed 8; others have said that since we are using 2 channels for these 16 drives (8 per channel) we should be OK. I just wanted to get some other opinions.

Thanks for any input!
 
With RAID5 there are two main failure modes, bad drives and bad sectors; either one can cause the controller to fail a drive. Then with "replacement drives", which are refurbs or recerts, you have a big chance of receiving a drive which failed in a previous RAID, was tested, and then sent to you. RAID adapters are strange creatures; they will fail a drive which will pass multiple SCSI tests.

First I would make sure you have a stable RAID BIOS; search Google for reports on the particular RAID card/BIOS firmware. The newest BIOS is not always the most stable.

Are these drives cooled properly? RAID arrays do not like high temperatures or temperature swings. At clients with non-air-conditioned rooms, even with enough airflow to keep the drives only a few degrees above room temperature, I see a significant increase in "offlining" and drive failures in the summer months when the rooms reach 90 degrees. Heat is the biggest threat to arrays. If the drives run more than a few degrees above room temperature, I would get different fans that produce higher airflow. All my arrays stay about 3 to 5 degrees above room temperature. Also make sure there is no dust buildup.

Are you sure the power supply feeding the drives is providing ample power, and is your battery backup unit functioning correctly?

........................................
Chernobyl disaster..a must see pictorial
 
So do you think our drives are actually failing? At one point we were not able to get a new drive right away, so we put the supposedly failed drive in another system and zapped it (a hidden Adaptec utility that removes any RAID metadata; the closest thing to a low-level format). We then put the drive back into the system, the array rebuilt, and the drive has been running fine ever since.

There is a firmware update for our card that was released in December. It’s supposed to “improve build times and should show some performance increases as well.” I think I’ll apply this update this weekend.

The server is in a controlled environment, 66 degrees, 55% humidity. The inside of the external enclosures is 68 degrees.

Each enclosure holds 8 drives and has two 460-watt power supplies.

All of the drives are on the same firmware level.

What are your thoughts on RAID50? If we purchase another external enclosure with 8 drives, we could set up three 8-drive RAID5 arrays and then stripe them. We would gain about 410GB, we would still have a single logical drive, we would have no more than 8 drives in any one RAID5 array, and we could technically lose up to 3 drives at once (one per RAID5 set). I've never set up a RAID50; is it possible to do this across 3 controllers, or would we have to find a card with 3 external channels?
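For what it's worth, a quick sanity check on that 410GB figure (a rough sketch, assuming 68GB usable per drive and one drive's worth of parity per RAID5 set):

    # Rough capacity check: 16-drive RAID5 today vs. the proposed 3 x 8-drive RAID50.
    drive_gb = 68                                 # assumed usable capacity per drive

    current_raid5   = (16 - 1) * drive_gb         # one parity drive's worth lost: 1020GB
    proposed_raid50 = 3 * (8 - 1) * drive_gb      # one parity per 8-drive leg: 1428GB

    print(current_raid5, proposed_raid50, proposed_raid50 - current_raid5)
    # 1020 1428 408  -> roughly the 410GB gain mentioned above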
 
Pardon the rambling...

Nice temperature for a raid environment.

It could be the firmware on the adapters; I would get the December release. It sounds like the RAID adapters are not really failing the drives but "offlining" some of them due to soft errors or some other issue. Truly failed drives cannot be resurrected.

I see you have Seagate drives. There is an issue with U320 drive firmware revisions below 005 or 006; I just went to Seagate.com and found the white paper on the issue (search "firmware problem"), but it was only in German. I had this on one array, but I also had problems updating the drives to the newest firmware: I lost an array using Seagate's Enterprise utility to update the firmware.

I use LSI Logic adapters and have not used Adaptec controllers in years, but I have not heard of any adapter that will support arrays spanning three RAID controllers.

I see you already use RAID10, and you have the resources. I would strictly use RAID10, with a hot spare, or a global hot spare if you maintain multiple arrays on one controller.

RAID10 is far superior to RAID5, especially in write throughput. As far as safety goes, it is also superior to RAID5: it can sustain multiple drive losses, up to one per mirrored pair, where RAID5 can only lose one. Another area of safety: a RAID5, once degraded, is far more susceptible to losing another drive to bad blocks, a risk that increases with larger drives and with more drives in the array. RAID50 is better than RAID5 as far as both safety and throughput.
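To put rough numbers on the safety comparison, here is a simplified sketch (2-way mirrors for RAID10 and 8-drive RAID5 legs for RAID50 are assumptions; "worst" is the failure count the array is guaranteed to survive, "best" is what it can survive if the failures land in different mirrors or legs):

    # Simplified drive-loss tolerance model for the levels discussed above.
    def tolerance(level, drives, leg_size=8):
        if level == "raid5":
            return 1, 1                   # exactly one failure, period
        if level == "raid10":             # striped 2-way mirrors
            return 1, drives // 2         # a second loss in the same mirror kills it; best case one per mirror
        if level == "raid50":             # RAID5 legs, striped together
            return 1, drives // leg_size  # two losses in one leg kill it; best case one per leg

    for level, drives in (("raid5", 16), ("raid10", 6), ("raid50", 24)):
        print(level, drives, tolerance(level, drives))
    # raid5 16 (1, 1)
    # raid10 6 (1, 3)
    # raid50 24 (1, 3)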


As far as adapters go, I would recommend a 4-channel RAID card, as more than 5 or 6 U320 drives on a channel saturate the SCSI bus (no further throughput increase beyond that point).
Your board appears to be PCI-X at 133MHz, while your RAID card is a 64-bit card on a 66MHz PCI bus, a mismatch which will not give the highest throughput.
Another advantage of one 4-channel adapter versus two adapters is IRQ usage. Two adapters require two IRQs from the CPU, while a single 4-channel card requires one. Every IRQ request slows RAID performance somewhat.
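A quick back-of-the-envelope on that bus mismatch (theoretical peak figures only; real-world throughput is lower after protocol overhead):

    # 64-bit bus: 8 bytes per transfer cycle.
    bus_bytes = 8

    pci_66_mb_s   = bus_bytes * 66    # ~528 MB/s, what a 64-bit/66MHz card can move
    pcix_133_mb_s = bus_bytes * 133   # ~1064 MB/s, what a PCI-X 133 slot offers
    two_u320_mb_s = 2 * 320           # 640 MB/s from two saturated U320 channels

    print(pci_66_mb_s, pcix_133_mb_s, two_u320_mb_s)
    # 528 1064 640 -> two busy U320 channels can outrun a 66MHz/64-bit card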


A 4-channel LSI adapter

Benchmarks of RAID adapters, in Dutch, but the graphs are in English. I am not Dutch, but trust me, it is worth a look.


460 watt power supplies are more than enough.

While you're at it, update the motherboard BIOS, disable any unnecessary devices on the motherboard, and update your network card drivers. Look into disabling "SMB signing", if security permits, as it can cause slow network performance and delays writing to disk.

........................................
Chernobyl disaster..a must see pictorial
 
Well, I updated the firmware on the adapters and installed the latest version of the Windows device driver, which was just released this month. Hopefully this helps.

How important is the amount of cache? These cards only have 64MB each, which doesn't seem like enough. Our Dell PowerEdge 2800s each have 256MB of cache on their RAID controllers.

The drives are all on firmware level 0007. Seagate’s Enterprise utility sounds like some good software if you want to start over!

RAID10 is awesome. I'm using it for the OS and the SQL database and log. The SQL database is 40GB plus the 1GB log. This used to be on RAID5; once I moved it to the RAID10 we saw a huge performance increase. The current RAID5 array is 1TB. I would love for it to be RAID10, but we would have to purchase twice as many drives. This is why I suggested the RAID50: we would still need to purchase 8 more drives, but we would be gaining 410GB. If we can add this extra space the server should be able to last at least 2 or 3 years. After that, I think we'll have to move to a SAN.

That LSI card looks good, but wouldn’t I need a card with 3 external channels if I add another enclosure? Or could I put 8 drives on one channel and 16 on the other?

 
Addition...I see that card does have 4 external channels. Would you recommend the Battery - 256MB? Would this make the total cache 384MB?
 
The battery is considered absolutely necessary, though I have run my office RAIDs without one for years, through power losses and improper shutdowns, without data loss. All my clients have batteries on their RAIDs. Where the battery comes in is a power loss during an in-progress write, which is where it can save you from corruption.

With a 4-channel adapter I would use the maximum amount of RAM the adapter can handle. General rule from experience: 64MB is too little, and going above 128MB does not produce a dramatic throughput increase, though in your case, with the size of the array, it might have a decent effect. Make sure you get exactly the recommended memory, with no deviations from the manufacturer's recommendation, otherwise big problems can result. On most adapters I have worked with there was a considerable effect from increasing to at least 128MB. If you're talking about the LSI U320-4X, 512MB is the max; for the price of RAM I would go with the max in your case.

As far as Seagate's software...
In the past I have updated firmware on about 100 drives and never lost data; this situation was the first. What pissed me off was that the first drive was a hot spare, removed from the RAID setup as a hot spare and then taken offline, so you would figure nothing could go wrong. As soon as I put it back after the firmware flash, it blew the array... go figure. I definitely recommend it for use on RAID5 arrays, if you want rebuild practice. As a note, as a result of that array loss I did a straight, no-sleep 36-hour day, got 8 hours of sleep, then did a 32-hour stint to get the server back, due to multiple SQL instances. It was loads of fun; if there had been a highway nearby, I would have walked in front of a semi toward the end of the 32-hour stint.

The problem with loading more than 5 or 6 U320 drives per channel is that you lose throughput, because the SCSI channel cannot accept any more data from the drives once it hits 320MB/sec (minus channel overhead). Then you have arbitration overhead, with each drive demanding its slice of time on the shared bus. Something to consider for your next RAID server.
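As a rough illustration (the per-drive sustained rate below is an assumed ballpark for U320 drives of that generation, not a measured figure):

    # How many drives it takes to fill a 320 MB/s channel at an assumed
    # ~60 MB/s sustained per drive.
    per_drive = 60           # MB/s, assumed
    channel   = 320          # MB/s, U320 ceiling before overhead

    for drives in (4, 5, 6, 8):
        offered = drives * per_drive
        print(drives, offered, "saturated" if offered >= channel else "headroom")
    # 4 240 headroom
    # 5 300 headroom
    # 6 360 saturated
    # 8 480 saturated -> past 5 or 6 drives the channel, not the drives, is the limit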

Increasing RAID cache RAM amount:
But look at the Adaptec with a RAM increase: useless, strange.

Then look at the LSI Logic U320-2x. In my own testing I see less of an increase in throughput for a general-use server, but it depends on how the RAID parameters are set up, whether the same data is accessed repeatedly, and how the arrays are tested:
512MB cache

256MB cache

128MB cache


RAID50 speed increase, looks like a decent alternative:

6-drive RAID50

6-drive RAID5

All in all, LSI Logic makes a damn fast RAID.

Just checked out a site yesterday; the Dell 2800 is a top performer. A newly inherited client has one with the Perc4E DI (the LSI Logic equivalent of the U320-2E), a bloody fast RAID5 with only 3 drives: HD Tach benchmark, 124MB/sec average.

........................................
Chernobyl disaster..a must see pictorial
 
The majority of the files on this array are around 600KB. The stripe size for the RAID5 is 64K. So am I right in thinking that writing one 600KB file would only use about 9.3 drives, leaving the other 6.7 drives with a blank 64K block? So not only are there too many drives in this array (high I/O), but there is also a lot of wasted space and fragmentation. I finally got the software vendor to give me some hardware specs; they recommend a 16K block size on new servers.
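For reference, the arithmetic behind that estimate (counting 64K stripe units, a quick sketch):

    import math

    file_kb, stripe_kb = 600, 64

    units_exact   = file_kb / stripe_kb                    # 9.375 -> the "9.3 drives" figure above
    units_touched = math.ceil(units_exact)                 # 10 stripe units actually written
    leftover_kb   = units_touched * stripe_kb - file_kb    # 40KB left in the last unit

    print(units_exact, units_touched, leftover_kb)
    # 9.375 10 40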
 
Sorry for the late answer

Not so...

RAID blocks are not like disk cluster allocation. There is no wasted space on a RAID. That is not to say OS file systems do not waste space, depending on the disk cluster size.

Blocks are filled with data across all drives, evenly. The data is not written into a block on one disk with the remainder then filling the block on the next disk, and so on; if that were the case, there would be no performance increase as the number of disks in an array grows. Where stripe size comes into play is performance, as the RAID adapter reads and writes. The adapter must locate data: if the stripe is small, there is overhead because the data sits in multiple blocks and the drive heads need to move more often, but the data can be found within a smaller stripe block faster. With larger stripe block sizes, a group of bytes belonging to a file is more likely to fall within a single block, but it is also more likely that data unrelated to the file being sought is in that same block, creating overhead.

Even if you know the average file size and adjust the block size accordingly, there are many variables affecting performance: write-through vs. write-back, read-ahead vs. no read-ahead, cache size, the RAID manufacturer's algorithms, etc. Generally, small block sizes suit dedicated database servers (including Exchange), since the data size is small, and larger block sizes suit general-use servers.
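To illustrate the "no wasted space" point, here is a toy striping model (it ignores RAID5 parity rotation; the 15 data units per stripe row for a 16-drive set is an assumption for the sketch):

    # Map a logical KB offset to (drive, stripe row, offset within the 64K unit).
    STRIPE_KB   = 64
    DATA_DRIVES = 15          # 16-drive RAID5: one unit per row holds parity

    def locate(offset_kb):
        unit = offset_kb // STRIPE_KB
        return unit % DATA_DRIVES, unit // DATA_DRIVES, offset_kb % STRIPE_KB

    # A 600KB file starting at offset 0 ends partway into a unit; whatever is
    # written next simply continues from that point, so nothing is left blank
    # at the RAID level (the file system's own clusters are a separate matter).
    print(locate(0))     # (0, 0, 0)
    print(locate(599))   # (9, 0, 23)  last KB of the file
    print(locate(600))   # (9, 0, 24)  the very next KB of the next write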

........................................
Chernobyl disaster..a must see pictorial
 
Correct me if I'm wrong, but I always thought stripe size and block size were the same thing. I just looked in the SMOR of an Adaptec 2100S, and under an array it shows Block Size: 512 bytes and Stripe Size: 128 Kbytes.

Any ideas?
 
Block and stripe size are the same. Of course, IBM and Microsoft change accepted terminology all the time.



I believe they are referring to a queue where the data is held until a write or read (cached); 512k comes up in this link.

........................................
Chernobyl disaster..a must see pictorial
 