RAID 1 crash on SBS2003...

Status
Not open for further replies.

wahnula

Technical User
Jun 26, 2005
4,158
US
Hello,

I posted this w/ no replies on HDD forum, it's getting late and I would love to hear from someone.

Server is SBS2003SP2 w/ SATA RAID 1 for OS & Apps, SATA RAID 5 for data, MB is Asus K8N-DL, 4GB RAM, all drives are SATA Raptors except for (1) IDE drive for backups.

I came in today to a locked-up server, the first time this has happened since I built it in 2005. Upon hard reset I discovered my RAID 1 array was 'degraded'. I let it continue past that point and received a MACHINE_CHECK_EXCEPTION, STOP: 0x0000009C, with the usual four sets of ridiculously long numbers. I checked my log (yes, I write down every single thing that involves the server) and found the exact same set of numbers from when I had problems installing the IDE drive a year ago. That drive eventually installed OK, so I reset the server and tried to boot again.

Again I saw the "degraded array" but I was hesitant to rebuild within BIOS until I got a good boot...which I did, and the server is working normally now, but with no redundancy for the OS.

When I look in Disk Management I see two separate 34GB drives, one as C: and one as F:. Both are Healthy, one is Active, one is Boot. Here is my plan for this evening:

1. Format drive F: in Disk Management
2. Reboot and rebuild array in RAID 1 BIOS

I have 2 spare Raptors so I could replace the drives, but I will have limited time and I think this was just a glitch, not a bad drive. It's the nVidia RAID controller with no Windows utility. The crash occurred about 1 hour after last night's SBSbackup onto the IDE drive, which is Disk0. I am eager to hear if anyone else has any thoughts/suggestions before I execute my plan.

Tony

 
Do a full ASR backup before you do anything else.

I hope you find this post helpful.

Regards,

Mark

Check out my scripting solutions at
Work SMARTER not HARDER. The Spider's Parlor's Admin Script Pack is a collection of Administrative scripts designed to make IT Administration easier! Save time, get more work done, get the Admin Script Pack.
 
Thanks Mark. I will start that now.

Tony
 
When I went to run ASR, I received the error:

"The files for the recovery diskette could not be created. The operation was aborted."

 
Hopefully you have a good backup. Do a regular backup of the System State info and anything else you can, then attempt to rebuild the mirror. As soon as you have things running again work on getting an ASR.

I hope you find this post helpful.

Regards,

Mark

 
Mark,

I just got out of a late meeting, it's 8:30 pm here. I am running a data array & Exchange data backup now, had to change the drive assignment on my third-party app. I'll let SBSBackup run tonight and try to rebuild tomorrow morning. All the data & Exchange store is on a separate array, I have an ASR from last Friday, as well as SBS backups from 6/12 & 6/13. I run a separate backup routine on the data array and yet another on the Exchange store contents, so I should be OK. I have a multi-pronged backup plan.

Worst case, I reinstall the OS on the two new drives, recover from the last good SBS backup, and then, if SBSBackup lets me down, update the data array from the separate app. It should be OK. That's why we run all these backups, right?

I really appreciate your support. It reduces the stress to have someone else on board.

Tony
 
Wow, if you have an ASR from Friday you should be good. Do you do a lot of major patching or severe changes to AD that you have one so recent?

Typically I would just advise customers to update their ASR after Service Pack installs or after making big changes to AD. Regular backups are typically sufficient in between. In your case this is super handy, so your diligence paid off.

I hope you find this post helpful.

Regards,

Mark

 
markdmac,

Weekly ASR is part of my disaster recovery plan, it only takes a few minutes to get started, that media goes home with me, and no, I don't change anything regarding the AD, except rarely adding/removing users. I'm a part-time sysadmin and quite paranoid about spending too much time recovering in the event of fire/theft/system disaster.

I got a good SBS backup last night so I have postponed my RAID 1 rebuilding until this evening.

Question: Could one night of high ambient temp & humidity on a dusty machine cause a hardware glitch like this? The room that the server sits in has its own wall A/C unit, which died. That's not too bad, as the double-wide trailer/office still has a chugging central A/C, BUT...some geniuses decided to remove the A/C for repair and left a gaping hole in the wall. It is 94 degrees here in Texas w/ 80% humidity. I patched it as best I could with cardboard and tape but I'm sure the server virtually spent the night outside. The next night this happened.

Thanks for your input.

Tony
 
With the ambient temperature OUTSIDE the case at 94 degrees, I would say yes, most certainly. The inside of the case would have overheated. Fans are supposed to pull cool air into the case; if they were bringing hot air in, the case probably got to at least 120 degrees.

You would need to check the specs on your CPU, but I would bet that is past the point of when it goes into over temp shutdown.
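CPU spec sheets normally quote limits in Celsius, so the Fahrenheit figures in this thread need converting before comparing. A minimal Python sketch of that arithmetic follows; the function name is made up for illustration, and the 120-degree value is only the estimate given above, not a measurement.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit temperature to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# 94 F is the ambient temperature reported in the thread;
# 120 F is the estimated in-case temperature (an assumption, not a reading).
for label, deg_f in [("ambient (reported)", 94.0), ("inside case (estimated)", 120.0)]:
    print(f"{label}: {deg_f:.0f} F = {fahrenheit_to_celsius(deg_f):.1f} C")
```

That works out to roughly 34 C ambient and 49 C for the in-case estimate, which are the numbers to hold against whatever the CPU documentation says.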

I hope you find this post helpful.

Regards,

Mark

 
UPDATE: RAID 1 not only refused to rebuild, it took the good drive with it during the rebuild. The server locked up, I did a hard reset, and both drives now show **ERROR** in the nVRAID BIOS. Lesson learned. I replaced both drives and recovered from ASR, but that only recovered System State and the C: drive; the Exchange store and data reside on an independent 3ware array.

So everything is restored, I just don't have Exchange talking to its store. I'm going to try again w/ SBSBackup and see how that goes. Glad I waited until the weekend!!!

Tony
 
It's also good that you have recent backups. So often I hear of people in your situation with very old backups.

I hope you find this post helpful.

Regards,

Mark

 
SUCCESS!!! SBSBackup came through, very simple and straightforward procedure. This time I recovered all drives and after pulling in the last backup I had made (after everyone had left yesterday) all functionality is restored.

markdmac said:
So often I hear of people in your situation with very old backups.

Well that's where my inexperience is a plus. I am overly-cautious and have little faith in hardware. Now I feel I should buy a spare mainboard, while they are still available, in the event of theft or MB failure. I did not have 2 spare raptors sitting around just for luck...it was for peace of mind...and I will be returning the two I removed for Warranty replacement. WD's excellent about that.

To anyone who finds this post trying to recover from SBS backup, it is critical that you install ALL your RAID controllers and re-assign ALL your custom drive letters before starting recovery from backup.
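One way to make that pre-restore check less error-prone is to list the drive letters the backup expects and confirm each one is present before starting. The Python sketch below is only an illustration under assumptions: the helper name is hypothetical, the letter set (C, L, B, R) is borrowed from the drives mentioned elsewhere in this thread, and it only tests that a drive root exists, not that it is the right volume or the right size.

```python
import os
from typing import Callable, Iterable, List


def missing_drive_letters(
    expected: Iterable[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Return the expected drive letters whose root path is not present."""
    missing = []
    for letter in expected:
        root = f"{letter.upper()}:\\"  # e.g. "L:\" on Windows
        if not exists(root):
            missing.append(letter.upper())
    return missing


# Hypothetical letter set for illustration; substitute your own layout.
expected_letters = ["C", "L", "B", "R"]
gaps = missing_drive_letters(expected_letters)
if gaps:
    print("Assign these drive letters before restoring:", ", ".join(gaps))
else:
    print("All expected drive letters are present.")
```

The `exists` parameter is there so the check can be dry-run with a fake list of drives on a machine that does not have the real layout.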

Thanks, Mark, for "being there". I know it sounds cheesy, but I felt better not being totally alone in this harrowing process.

Tony

 
Always happy to help. Glad your restore worked out.



I hope you find this post helpful.

Regards,

Mark

 
Thanks ShackDaddy. For the record, my disaster plan is:

1. IDE onboard 160GB hard drive, partitioned into (R:) ASR and (B:) Backup. SBSBackup runs nightly to B:.
2. External USB drives (2) that run a nightly incremental backup (SyncBackSE) of the L: RAID 5 data array & Exchange store
3. Weekly swaps of above USB drives; one stays with me
4. At end of day Friday or usually Saturday, run an ASR, alternating weekly between resident R: IDE drive and a third USB drive.
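The weekly alternation in step 4 can be pinned down so there is never any doubt which destination is due. This is a rough Python sketch, not part of the actual setup: the function name, the target labels, and the choice of which week maps to which destination are all assumptions. It counts whole Monday-to-Sunday weeks, so a Friday run and a Saturday run in the same week pick the same target.

```python
from datetime import date

# Illustrative labels for the two ASR destinations described in step 4.
ASR_TARGETS = ("R: (onboard IDE drive)", "third USB drive")


def asr_target_for(day: date) -> str:
    """Pick the ASR destination for the week containing `day`.

    Weeks are counted Monday to Sunday from a fixed origin, so the
    result flips every week and does not hiccup at year boundaries.
    """
    week_index = (day.toordinal() - 1) // 7
    return ASR_TARGETS[week_index % 2]


print("This week's ASR goes to:", asr_target_for(date.today()))
```

Counting weeks from a fixed origin rather than using the calendar week number avoids writing to the same destination twice in a row when a year has 53 weeks.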

I will change the ASR to include the data array (adds time to the run, but so what?). SBSBackup really impressed me as a reliable tool. Kudos to Microsoft, they got this one right!

Tony
 