Since most raid setups have a raid1 for the OS and a raid5 or raid10 for data, the following procedures can be used as a fast OS backup/restore in less complicated AD implementations or workgroup setups, as insurance against Microsoft patch gremlins, buggy software/driver installs and registry edits gone awry. The procedures below involve raid1 recovery using an image drive, obtained by removing one of the disks from a raid1 array.
Practice this on a lab machine before using it on a production server.
Prior to making changes you should have the raid setup documented, the configuration saved to a floppy, and, obviously, a full backup. Personally I have never had problems with this type of restoration.
Not all raid manufacturers use the same procedure. The procedures below were done on a late model LSI Logic raid adapter; the GUI is different from older adapter menus, but the functions are the same.
Lately, Dell PERC and Intel adapters are OEM LSI Logic, so they behave the same.
During this procedure you will need to turn off the raid adapter alarm, either from within the raid bios setup console or through a Windows raid software interface. Expect to do this frequently, or live with the noise.
For these procedures you MUST know which physical disks attached to your raid adapter make up your raid1 (channels and disk IDs) and which slots or cable connections they occupy. The raid bios console will give you the channels and disk IDs. To locate the raid1 disks in a drive rack, go into the raid bios console, choose the physical drive menu, and offline one of the raid1 disks, identified by its channel and disk ID. The disk will go into an alarm state, and the associated LED will show on its slot. Immediately go back into that physical disk’s properties and place it online, then do the same for the second disk of the array.
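One way to keep that disk map on record is a tiny script that holds it and prints labels for the drive trays. This is only a sketch: the channel, disk ID and slot values below are made-up examples, not read from any adapter, and the names `Raid1Member`, `label_for` and `check_map` are my own for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raid1Member:
    role: str      # "primary" or "secondary" (labels of convenience only)
    channel: int   # channel shown in the raid bios console
    disk_id: int   # disk ID shown in the raid bios console
    slot: str      # physical slot or cable connector, as labelled on the chassis


def label_for(member):
    """Text to write on the sticker for one disk."""
    return f"{member.role.upper()} ch{member.channel} id{member.disk_id} @ {member.slot}"


def check_map(members):
    """Raise ValueError if two disks share a channel/ID pair or a slot."""
    addresses = [(m.channel, m.disk_id) for m in members]
    slots = [m.slot for m in members]
    if len(set(addresses)) != len(addresses):
        raise ValueError("duplicate channel/disk ID pair")
    if len(set(slots)) != len(slots):
        raise ValueError("duplicate slot")
    return True


# Hypothetical example values; replace with what your raid bios console reports.
RAID1 = [
    Raid1Member("primary", channel=0, disk_id=0, slot="bay 0"),
    Raid1Member("secondary", channel=0, disk_id=1, slot="bay 1"),
]

if __name__ == "__main__":
    check_map(RAID1)
    for disk in RAID1:
        print(label_for(disk))
```

Keeping the map in one place makes it harder to put a disk back in the wrong slot later.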
As referenced below, the primary disk is the first disk of a raid1, and the secondary disk is the remaining disk in the array. Technically a hardware raid1 does not have a primary or secondary disk, but it is easier to explain this way. The procedures below could be streamlined, perhaps with fewer reboots between placing the drives offline and rebuilding, but I documented them for safety, not for speed. As is, they do not take long on a fairly fast server: under 30 minutes to complete, including the time for the drive rebuild.
In the following procedures, any disk pulled from a slot or removed from a cable connection must be returned to its original location. To repeat: if you remove a disk, put it back EXACTLY where you found it or you will have problems. It is always best to label your disks, slots and cables. Never offline/fail or “pull” a drive during bootup or from within Windows; the image will contain open files, though basically it is no worse than an illegal shutdown or power loss.
Procedure if no problems occur after MS patches, software/driver installs, registry changes etc.
1) Turn off the server, pull the secondary raid1 disk from the drive chassis, or disconnect it from the cable; you now have an image backup. On startup the alarm will sound.
2) Start the server, patch the server to kingdom come, do the major program install, or modify the registry.
3) Restart the server and verify all is well. Check the event logs, run all major programs.
4) After you are satisfied no problems exist, shut down the server, and place the secondary drive back in the drive chassis or attach it to the cable.
5) Turn on the server, go into the raid bios setup, go to the “physical drive” menu, select the secondary drive’s properties, then select rebuild; the raid1 will be “degraded” until the rebuild is complete. At this point, the server will again need to be warm rebooted or shut down and restarted.
6) The server will come up with the secondary drive rebuilding; the time required depends on the disk size and raid adapter speed. My server takes 10 minutes with 36 GB 15k drives. Once the rebuild has started, the server can be shut down at any time; the rebuild will resume from where it left off.
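For a rough idea of rebuild time on other drive sizes, the 36 GB in 10 minutes figure above can be turned into a throughput and scaled. This is a back-of-the-envelope sketch only: it assumes decimal gigabytes and a rebuild rate that stays constant regardless of disk size or server load, which a real adapter will not guarantee. The function names are mine.

```python
def rebuild_rate_mb_per_s(disk_gb, minutes):
    """Average rebuild throughput implied by one observed rebuild."""
    return disk_gb * 1000 / (minutes * 60)


def estimate_rebuild_minutes(disk_gb, rate_mb_per_s):
    """Estimated rebuild time, ASSUMING the same rate holds for this disk."""
    return disk_gb * 1000 / rate_mb_per_s / 60


if __name__ == "__main__":
    rate = rebuild_rate_mb_per_s(36, 10)  # 36 GB rebuilt in 10 minutes
    print(f"observed rate: {rate:.0f} MB/s")
    for size_gb in (36, 73, 146):  # example drive sizes, not from the post
        minutes = estimate_rebuild_minutes(size_gb, rate)
        print(f"{size_gb} GB: about {minutes:.0f} minutes")
```

Treat the result as a planning number, and time one rebuild on your own lab machine before relying on it.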
Procedure if problems occur …
You reboot after patching, a program install, or registry changes, and your server now has major problems: it never gets to the logon, or the BSOD comes up, and “last known good…” offers no hope.
1) Shut the nasty beast down. Place the “pulled” secondary disk in the chassis or attach it to the cable.
2) Start the server, immediately go into the raid bios setup. Under NO circumstances allow the server to go beyond the raid bios screen.
3) Go to the physical drive menu, go to the properties of the primary disk, and OFFLINE or FAIL the PRIMARY disk. It is imperative to fail this disk before proceeding.
4) Go to the secondary disk’s properties and select rebuild.
5) Exit from the setup and restart the server. At this point you can either go into the raid bios setup console, choose the properties of the primary disk, and start a rebuild, or you can let the server boot into Windows and start the rebuild of the primary disk from a Windows raid program interface (PowerConsole, OpenManage, etc.).
At this point your OS is restored to its state before any changes were made. Shut off the alarm.
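Why failing the primary first matters can be shown with a toy model of a two-disk mirror. This is a sketch of the reasoning only, not any adapter's firmware logic: `Raid1Model` and `restore` are made-up names, and the model ASSUMES that a rebuild copies from the other disk whenever that disk is online, and otherwise just brings the target online with its contents untouched.

```python
def other(disk):
    return "secondary" if disk == "primary" else "primary"


class Raid1Model:
    """Toy two-disk mirror; an illustration, not real adapter behaviour."""

    def __init__(self, image):
        self.data = {"primary": image, "secondary": image}
        self.online = {"primary": True, "secondary": True}

    def pull(self, disk):
        # A pulled disk keeps its contents but drops out of the array.
        self.online[disk] = False

    def fail(self, disk):
        self.online[disk] = False

    def write(self, image):
        # Patches, installs and registry edits land only on online disks.
        for disk, up in self.online.items():
            if up:
                self.data[disk] = image

    def rebuild(self, target):
        source = other(target)
        if self.online[source]:
            self.data[target] = self.data[source]
        # ASSUMPTION: with no online source the target keeps its own contents.
        self.online[target] = True


def restore(fail_primary_first):
    array = Raid1Model("clean OS")
    array.pull("secondary")       # image backup taken before patching
    array.write("broken OS")      # the patch goes wrong
    if fail_primary_first:
        array.fail("primary")     # step 3 of the procedure
    array.rebuild("secondary")    # step 4
    if fail_primary_first:
        array.rebuild("primary")  # step 5
    return array.data


if __name__ == "__main__":
    print(restore(fail_primary_first=True))
    print(restore(fail_primary_first=False))
```

In this model, skipping the fail step lets the patched primary overwrite the image disk, which is exactly the outcome the procedure is designed to avoid.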
A variation of this procedure….
This is a bit safer as there is no chance of the primary drive being used to rebuild the secondary drive.
1) Basically the same as the above. Start the server with the “pulled” secondary drive still removed from its slot or disconnected from the cable, go into the raid bios setup console, choose the physical drive menu, and offline/fail the primary drive.
2) At this point both drives in the raid1 are failed. Power off the server, “pull” the primary disk, reinstall the secondary drive, restart the server, go into the raid setup, go to the properties of the secondary drive, and place it online.
3) Restart the server and let it boot into Windows. Shut down the server, reinstall the primary disk, go into the raid setup, go to the primary disk’s properties, choose rebuild, exit the setup and restart the server.
For the greatest safety with the OS on raid1, the array should have a hotspare, which gives a large safety margin in case of a drive failure or when using the procedures above. With a hotspare enabled on the raid1, you could have two copies of the OS: if the secondary is manually offlined/failed as a backup, the hotspare would automatically kick in and rebuild. Upon completion of the rebuild, the hotspare could be offlined as a second spare copy. With a hotspare online, you would need to “pull” both the secondary and the hotspare to accomplish the above procedures, or pull the secondary and disable the hotspare. Cheap insurance for an FSMO or other critical server.
HPs paper raid 1 recovery
Windows It PRO…
Windows OS software mirror…
Been a while since I used OS based raid or recovery, and I always used separate SCSI/IDE channels for each disk, same SCSI ID for both disks.
If a mirror involves disks with different SCSI IDs (on a cable), or the mirrored disks are on the same SCSI or raid channel (on a cable), or the mirror disks involve a master/slave setup, the following would be more involved, as in needing a boot floppy or boot.ini modifications.
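For reference, a boot.ini modification of that kind usually means adding an entry whose ARC path points at the second disk. The sketch below only formats such a line using the common multi() syntax, where rdisk() is the disk's ordinal on the adapter and partition() counts from 1. The rdisk value, the WINDOWS folder name, the menu label and the /fastdetect switch are assumptions for illustration; check them against your own boot.ini, and note that setups needing the scsi() syntax are not covered here.

```python
def arc_path(rdisk, partition=1, windows_dir="WINDOWS", adapter=0):
    """Format a multi() style ARC path for a boot.ini entry."""
    if rdisk < 0 or adapter < 0:
        raise ValueError("rdisk and adapter count from 0")
    if partition < 1:
        raise ValueError("partition numbers count from 1")
    return f"multi({adapter})disk(0)rdisk({rdisk})partition({partition})\\{windows_dir}"


def boot_ini_entry(rdisk, label):
    """One line for the [operating systems] section of boot.ini."""
    return f'{arc_path(rdisk)}="{label}" /fastdetect'


if __name__ == "__main__":
    # Hypothetical: the mirror copy is the second disk on the first adapter.
    print(boot_ini_entry(1, "Boot from mirror copy"))
```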
Pretty much the same basic protection can be obtained with Windows mirroring as with hardware raid.
Procedure, in its simplest form….
Before patching or major server changes, “break” the mirror in the Disk Management console. Once the mirror is broken, copy the needed boot files from the primary disk (boot disk) to the secondary disk.
Do your patching, software install, etc. Restart the server, make sure everything is working correctly, then recreate the mirror if all is well.
Should problems arise with the patches etc., and you need to reverse the changes: shut down the server and pull both disks, place the secondary disk in the primary’s chassis slot or on its cable connector, and put the primary disk where the secondary was. Restart the server and re-mirror the disks.
As a useless note… in a real raid emergency, I use ear plugs and let the alarm rattle on, which stops EVERYONE from asking inane questions like “What’s wrong with the server?” or “When is the server going to be up?”. Also, it helps me resist answering with “How the F* would I know, with all these constant interruptions”, or other similarly socially unacceptable statements.
EVERYONE is defined as the entire office staff plus the US postal mail carrier, the UPS delivery guy, and John, from local