Scan hard disk for errors?

Status
Not open for further replies.

amaru96

MIS
May 12, 2006
40
AU
Hi guys, I've got an IBM x346.

Is there a utility which will scan the hard disks in the server for any errors/bad sectors? I'm getting some Windows crashes which seem to indicate a problem with a hard disk.

Any help would be appreciated.
 
If you boot into the Ctrl-I utility and go into the controller status view, what is the status of the drives on the channels?

ONL means a drive is online, and the letter next to it indicates the RAID array it is part of.
 
Hi, I only have remote access to the server when Windows is running. However, I do have ServeRAID manager installed and the status of all drives is Online.

Is there a utility which will scan through the drives for corrupt/bad sectors?
 
Before you actually go into ServeRAID, how do you know it's a disk problem? Have you checked the Windows event logs to see if there are issues with the disk (or with the server in general)? If you don't see anything in the event logs and you are still concerned, try running chkdsk first.
HTH
 
The first thing I would do is pop the top cover off. There are lights (can't think of the name of the card with the lights), and the info on the inside of the top cover will tell you what the lights mean. This can be done while the system is running (no interlock switch). The X345 and X346 boxes sometimes need the PCI riser and PCI cards reseated, and the VRMs for the procs can go bad.

Burt
 
Unfortunately I don't have physical access to the server. All work must be done remotely.

Is there no utility out there which will scan the HDs for bad sectors?
 
Amaru96, as I indicated, before checking h/w, start at the OS layer. You didn't indicate what OS you're using, but in Windows you would use chkdsk for this purpose (fsck in Linux). This should tell you if there are any errors on the actual disk. What it won't tell you is whether you are having mechanical problems, controller errors, etc. For those you would need to check your event logs and ServeRAID. HTH
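
For anyone who wants to script the chkdsk suggestion, here is a minimal Python sketch. It assumes Python is installed on the Windows server, which nobody in this thread has confirmed, and the function names chkdsk_command and run_read_only_check are made up for illustration. Run without /f or /r, chkdsk only reports what it finds; /r additionally locates bad sectors and recovers readable data, and on the system volume it normally has to be scheduled for the next reboot.

```python
import subprocess


def chkdsk_command(drive, repair=False):
    """Build a chkdsk command line for a drive such as 'C:'.

    Hypothetical helper. With repair=False, chkdsk runs in read-only mode
    and only reports problems.
    """
    if len(drive) != 2 or not drive[0].isalpha() or drive[1] != ":":
        raise ValueError("drive must look like 'C:'")
    cmd = ["chkdsk", drive.upper()]
    if repair:
        # /r locates bad sectors and recovers readable information
        # (it includes the functionality of /f).
        cmd.append("/r")
    return cmd


def run_read_only_check(drive):
    """Run chkdsk in read-only mode and return (exit code, report text).

    Only meaningful on Windows, from an administrator session.
    """
    result = subprocess.run(chkdsk_command(drive), capture_output=True, text=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Print the command rather than running it, so this is safe anywhere.
    print(" ".join(chkdsk_command("c:")))
```

On the server itself you would call run_read_only_check("C:") from an administrator session and read the report before deciding whether a /r pass is worth the downtime.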
 
What RAID config are they in? ServeRAID Manager should alert you to a predictive failure, but that's a hardware problem. A corrupt file is another matter. What does the minidump or Event Viewer (in Windows) say about the crashes?

Burt
 
The server is running Server 2003. Every so often it will crash with an error of "Error code 00000077, parameter1 c000000e, parameter2 c000000e, parameter3 00000000, parameter4 005da000."

I did some checking on it and it's a little vague. It could be a few things but points towards a hard disk problem.

ServeRAID 7.00.15 is not reporting any problems.
 
So Event Viewer doesn't get more specific than that? Is the OS on a mirror or RAID5? If so, this sounds like a file corruption to me...

Burt
 
In other words, an OS usually won't crash because of a physical problem...USUALLY, not "never".

Burt
 
Did you review the troubleshooting info regarding this stop error?


Could be disk, boot sector virus, pagefile issues.

If the event log doesn't show any possible disk/controller error message (I assume you have the ServeRAID mgmt tool for Windows installed), then check the other two first.
 
I should probably elaborate a little.

The server ONLY crashes during the backup process. I'm using Veritas Backup Exec, and when it reaches around the 200GB mark of backed-up data it crashes.

The exact error report in the event log is:
Error code 00000077, parameter1 c000000e, parameter2 c000000e, parameter3 00000001, parameter4 00a72000.

The information provided by Microsoft seems to indicate a drive problem, but it's still quite unclear.


Appreciate all the help/suggestions/info by everyone.
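
A minimal Python sketch for picking that event-log line apart. It assumes the parameter layout Microsoft documents for stop 0x77 (KERNEL_STACK_INPAGE_ERROR) when the first parameter is not 0, 1 or 2: status code, I/O status code, page file number, offset into the page file. The function names and the short lookup tables are illustrative, not part of any tool.

```python
import re

# Small lookup tables; only a few entries relevant here are included.
BUGCHECK_NAMES = {0x77: "KERNEL_STACK_INPAGE_ERROR"}
NTSTATUS_NAMES = {
    0xC000000E: "STATUS_NO_SUCH_DEVICE",
    0xC000009C: "STATUS_DEVICE_DATA_ERROR",
    0xC0000185: "STATUS_IO_DEVICE_ERROR",
}


def parse_bugcheck(line):
    """Extract the bug check code and its parameters from an event-log line."""
    code_match = re.search(r"Error code ([0-9a-fA-F]+)", line)
    if code_match is None:
        raise ValueError("no bug check code found")
    code = int(code_match.group(1), 16)
    params = [int(p, 16) for p in re.findall(r"parameter\d\s+([0-9a-fA-F]+)", line)]
    return code, params


def describe(code, params):
    """Name the fields, assuming the documented 0x77 layout described above."""
    info = {"bugcheck": BUGCHECK_NAMES.get(code, hex(code))}
    if code == 0x77 and len(params) == 4 and params[0] not in (0, 1, 2):
        info["status"] = NTSTATUS_NAMES.get(params[0], hex(params[0]))
        info["io_status"] = NTSTATUS_NAMES.get(params[1], hex(params[1]))
        info["pagefile_number"] = params[2]
        info["pagefile_offset"] = hex(params[3])
    return info


if __name__ == "__main__":
    line = ("Error code 00000077, parameter1 c000000e, parameter2 c000000e, "
            "parameter3 00000001, parameter4 00a72000.")
    print(describe(*parse_bugcheck(line)))
```

Read this way, the crash is a failed page-in from a page file with status c000000e (STATUS_NO_SUCH_DEVICE), which Microsoft's notes associate with a hardware failure or an incorrect drive configuration. That is a pointer, not proof that a disk is bad.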
 
Amaru, you need to perform a little bit of research before you can make claims that the disk is at fault. Have you:

- Checked the system event logs for events that occur before the blue screen?

You may have figured out a possible cause of your own problem by stating this occurs when you are running a backup with Backup Exec. A Google search indicates that there have been issues with certain Backup Exec versions.

Check these entries in the MS Knowledge Base

HTH
 
Hi, the event log is pretty clean until the crash.

I am using Veritas Backup Exec 11D rev 7170. I've noticed a new version of Backup Exec is out (v12). Might try updating and see what happens.

It may also be a problem with the tape drive, which I'll need to check.
 
What kind of tape drive?
I have actually found Backup Exec 9 and 10 to work best.

Reinstalling Veritas may work, instead of upgrading. A customer of mine had a Dell PowerVault 220T and I replaced a tape drive whose leader had come off. It never worked until they uninstalled and reinstalled Veritas (actually upgraded from 8 to 9). Version 8 never could load a driver, but 9 works just dandy.

Burt
 
It's an LTO2 tape drive.

I reduced the amount of data that needs to be backed up to see if it made a difference and it did. I only backed up a very small portion of the server but it did work. It seems to fail at around the 200GB mark every time. I only backed up 10GB this time.

I'll try increasing it slowly and see what happens.
 
Compressed or uncompressed? Uncompressed you're only gonna get 200GB...what model?

Burt
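
The capacity point can be checked with a little arithmetic. This is a minimal Python sketch, assuming the standard LTO-2 native capacity of 200 GB and a nominal 2:1 compression ratio (real ratios depend on the data); the function name tapes_needed is made up for illustration.

```python
import math

LTO2_NATIVE_GB = 200  # native (uncompressed) capacity of one LTO-2 cartridge


def tapes_needed(data_gb, compression_ratio=1.0, native_gb=LTO2_NATIVE_GB):
    """Estimate how many cartridges a backup of data_gb needs.

    compression_ratio=1.0 means uncompressed; 2.0 is the nominal vendor
    figure and is an assumption, not a guarantee.
    """
    if data_gb <= 0:
        return 0
    effective_gb = native_gb * compression_ratio
    return math.ceil(data_gb / effective_gb)


if __name__ == "__main__":
    for size_gb in (10, 200, 250):
        print(size_gb, "GB ->", tapes_needed(size_gb), "tape(s) uncompressed")
```

If the job is running uncompressed, anything past roughly 200GB needs a second tape, which lines up with where the backup keeps failing.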
 