I work with MrStress, and although one would expect simply replacing a single failed disk in a RAID5 set would work - it didn't.
This happened on two different occasions; granted, on one of them a second disk failed during the rebuild (D'oh). But even with only one dead disk, after replacing it, calls started coming into the HelpDesk reporting corrupted files. The entire RAID5 set had been corrupted. Our SOP now is to down the server to replace a "hot swappable" drive when it fails.
I wish we could convince the powers that be to let us revert to UNIX. At least that way you get good, qualified tech support who understand what "production server" means, and on-site engineers instead of what we affectionately refer to as "parts monkeys", who come in to replace a single part, leaving you to go through the entire rigmarole again for the next parts swap.
As for the (very) high incidence of HDD failures: these are temperature- and power-controlled data centres, with full alarming on abnormal conditions and logging of power issues, and we've had no indication of anything that could be causing these drives to die. The only curious thing is that all the failing disks are IBMs, and we're getting the replacements as everything but IBM!