Tape Backup Failure

Status
Not open for further replies.

bryndog

IS-IT--Management
Jun 1, 2002
7
US
We replaced a 4/8 GB Seagate Travan SCSI tape drive on a Netware 5.0sp6 server with a 10/20 GB Seagate Travan STT220000N-C drive. The controller is an Adaptec 2940OU. About 95% of the time the nightly BackupExec 8.5 job fails with an uncorrectable error "reading or writing" to the tape.

Certance tech support replaced the drive with a brand new one and we still get the same error running the backup. The folks at Certance wanted us to run their TapeRX diagnostic software (an NLM that you run from the console). We ran it on both drives. It failed "writing to the tape" each time. They said that it must be a write head failure.

I was suspicious that two drives would have the same problem, so I took one drive back to the shop and hooked it to my crash box on a 2940OU card. The crash box has a mobile rack that allows me to swap hard drives. With Netware 5.0sp4 on a drive (NWASPI.CDM and AHA2940.HAM load successfully as the first two lines in STARTUP.NCF), the diagnostic fails the same way as in my client's office.

When I replace the hard drive with one that runs Windows 98 (Microsoft Backup supplying the tape drive's drivers and the 2940 recognized by the "new hardware found" wizard), leaving all other hardware untouched, the Windows version of TapeRX runs perfectly. Therefore I conclude there is nothing wrong with the write head.

This has got to be a software problem - a Netware problem (I do not even load BackupExec on the crash box). I tried newer NWASPI and AHA2940 drivers from Netware 5.1 with the same result.
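The swap test described above is a one-variable-at-a-time isolation. As a rough sketch of that reasoning (not anything the poster actually ran), the TapeRX results reported so far can be tabulated and checked for a factor that cleanly separates passes from failures. The field names and the `separating_factors` helper are hypothetical, made up for illustration, and the result only shows which factor the failures track, not the root cause.

```python
from collections import defaultdict

# TapeRX outcomes as reported in the post; the field names are illustrative.
RUNS = [
    {"machine": "client server", "os": "NetWare 5.0", "result": "fail"},
    {"machine": "crash box", "os": "NetWare 5.0", "result": "fail"},
    {"machine": "crash box", "os": "Windows 98", "result": "pass"},
]


def separating_factors(runs, factors=("machine", "os")):
    """Return the factors whose every value always gives the same outcome,
    while the factor as a whole still covers more than one outcome."""
    found = []
    for factor in factors:
        outcomes = defaultdict(set)
        for run in runs:
            outcomes[run[factor]].add(run["result"])
        consistent = all(len(results) == 1 for results in outcomes.values())
        seen = set().union(*outcomes.values())
        if consistent and len(seen) > 1:
            found.append(factor)
    return found


if __name__ == "__main__":
    print(separating_factors(RUNS))
```

With these three observations only the operating system separates the outcomes: the crash box both passes and fails, so the machine itself is ruled out as the deciding factor.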

I am stumped. Anyone seen this before? Any ideas of what to try next? I'm not even sure that the diags not running has anything to do with the BackupExec failures, but they are supposed to run. Thanks for reading this long post.
 
Not trying to sound sarcastic but did you try re-installing BE? I've seen weird things happen on my BE 8.5 servers that were corrected with a reinstall. Are you able to do basic tape functions like quick erase? What shows in your nightly BE log? Lastly, can you do a simple backup of a couple files or directories?
 
Thanks for trying to help. You are not at all sarcastic.

We did not try reinstalling BE because the problem appeared to be hardware related. Basic functions such as quick erase work. The backup fails after writing a variable number of bytes that ranges from 2.4 megs to 748 megs. There is a different number each night and it fails on different tapes. The exact error message from the log is:

TapeAlert!

WARNING: The operation has stopped because an error has occurred while reading or writing data which the drive cannot correct.
CRITICAL: The tape is from a faulty batch or the tape drive is faulty:
1. Use a good tape to test the drive
2. If the problem persists, call the tape drive supplier helpline

Device HA:0 ID:6 LUN:0 SEAGATE STT20000 needs to be cleaned. Please clean the device and retry the operation.

The write failure with the diagnostics is completely independent of BE, and because it is a simpler environment (just the diag NLM and 2 drivers) I thought that solving this problem might help with the BE problem.
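Since the failure point and the tape differ every night, it may help to pull the same fields out of each night's log and compare them. The following is a minimal sketch that assumes the log text looks exactly like the excerpt quoted above; the regular expressions and the `summarize` function are illustrative and are not part of Backup Exec.

```python
import re

# Patterns based only on the log excerpt quoted in this thread (an assumption
# about the format, not a documented Backup Exec log specification).
SEVERITY_RE = re.compile(r"^(WARNING|CRITICAL): (.*)$")
DEVICE_RE = re.compile(r"Device HA:(\d+) ID:(\d+) LUN:(\d+) (.+?) needs to be cleaned")

SAMPLE = """TapeAlert!

WARNING: The operation has stopped because an error has occurred while reading or writing data which the drive cannot correct.
CRITICAL: The tape is from a faulty batch or the tape drive is faulty:
1. Use a good tape to test the drive
2. If the problem persists, call the tape drive supplier helpline

Device HA:0 ID:6 LUN:0 SEAGATE STT20000 needs to be cleaned. Please clean the device and retry the operation.
"""


def summarize(log_text):
    """Collect the severity lines and the SCSI address of the reported device."""
    summary = {"alerts": [], "device": None}
    for raw in log_text.splitlines():
        line = raw.strip()
        match = SEVERITY_RE.match(line)
        if match:
            summary["alerts"].append((match.group(1), match.group(2)))
            continue
        match = DEVICE_RE.search(line)
        if match:
            ha, scsi_id, lun, model = match.groups()
            summary["device"] = {
                "ha": int(ha),
                "id": int(scsi_id),
                "lun": int(lun),
                "model": model,
            }
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Running this over several nights of logs would show whether the reported device address and alert text stay constant while only the amount written changes.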
 
I've seen similar problems with various software and it doesn't necessarily mean the heads are bad. Sometimes something in the driver set is erroring or timing out, and it gets reported as a write head failure. You might check and make sure you have newer Adaptec drivers, and maybe see if the same thing happens with a different controller card in the same server. Also check for any updates to the backup software you're running.

Marvin Huffaker MCNE, CNE
Marvin Huffaker Consulting
 
Here's an update on this situation:

(By the way we have the latest drivers and tried different controller cards).

Certance has discovered that the shipping level firmware in the drive was not properly regression tested and that is why their own TapeRX diagnostic would not run under Netware.

They had us back level the firmware and now the drive can successfully run the diagnostic. However, we still get the same errors in Backup Exec. Certance said they will assign one of their senior support people to continue the investigation. I think it is still a firmware problem.
 
I have a customer with this same issue. They are using a Seagate 10/20 drive, an AHA-2940 and Backup Exec 8.0, and were experiencing the same Backup Exec errors. So far I've tried similar things: new tape, cable, controller, reloading Backup Exec, and Netware drivers. We have Netware 5 SP5. We have also tried two different drives. Have you had any luck resolving this issue?

Anthony Squire
A+/CNA/MCSA/MCSE
 
Also, this is very intermittent: one day it will work, the next it will not. We have tried 5 different tapes, 2 of them brand new, with the same result.

Anthony Squire
A+,CNA/MCSA/MCSE
 
Anthony, your problems add to a very interesting story.

The support people at Certance have been very diligent at trying to help us resolve the problem. After trying every one of their suggestions they agreed that I should ship the drive back directly to the support engineer along with our controller card, tapes, cable and active terminator.

They tested the drive on their Netware machine and concluded that the drive, NEW OUT OF THE BOX, was indeed defective. They are in the process of obtaining another drive and testing it before they ship it (and our parts) back to me.

I'm sure they will be looking into their testing procedures and performing a failure analysis on the drive I sent them.

By the way, the NEW drive was the third or fourth replacement for two or three previously replaced refurbished drives. You might want to contact Certance. You could tell them you have corresponded with someone with the same problem - I leave it to your discretion whether to reference our case #372105.

 