Tek-Tips is the largest IT community on the Internet today!


Networker device disabled

Status
Not open for further replies.

ntdiane

IS-IT--Management
Feb 2, 2001
8
US
We recently migrated our Legato server from NT to UNIX. We are using a quad NIC on a Solaris system. We are now getting these errors every night and the backups just hang. Only the cloning process works. Help ASAP would be greatly appreciated.

NetWorker Device Disabled: (warning) Device /dev/rmt/1cbn is automatically disabled.
consecutive errors (21) exceeded the maximum consecutive errors allowed.
Please fix the device or set a higher value for the Max consecutive errors
attribute in the device resource.
 
Sounds like you have a problem with your tape drive. Do you only have one tape drive? If you have more than one, are you getting the same errors on the other drives? I would cd /etc/LGTOuscsi and run the inquire command. This will give you an idea of the SCSI addresses of your tape drives. You can also run mt -f /dev/rmt/1cbn status to verify that the tape drive responds. To check it from the OS level, put a tape into the drive and tar a file to it.
From the nwadmin window you can go to media, devices and remove the drive and add it again. Let me know if you need some additional information.
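
A minimal sketch of the OS-level check described above, assuming the device path from the error message and an arbitrary small test file (/etc/hosts is only an illustration). Because /dev/rmt/1cbn is a no-rewind device, the sketch rewinds before reading the archive back. Set DRY_RUN=1 to print the commands without touching the drive, and use a scratch tape, since the write overwrites whatever is on it.

```shell
#!/bin/sh
# Sketch: verify a tape drive from the OS, outside NetWorker.
# TAPE and TESTFILE are illustrative defaults, not values read from NetWorker.
TAPE="${TAPE:-/dev/rmt/1cbn}"
TESTFILE="${TESTFILE:-/etc/hosts}"

# Print the command; also run it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

check_tape() {
    run mt -f "$TAPE" status        || return 1  # does the drive respond?
    run mt -f "$TAPE" rewind        || return 1
    run tar cf "$TAPE" "$TESTFILE"  || return 1  # write a small archive
    run mt -f "$TAPE" rewind        || return 1  # no-rewind device: go back
    run tar tf "$TAPE"              || return 1  # read the archive listing back
}
```

Call check_tape once the tape is loaded. If any step fails, the function stops there, which tells you whether the problem is visible without NetWorker in the picture.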

BK
 
One other thing: make sure that the /kernel/drv/st.conf file is correct for your type of tape drive.

Good luck!

-ag100
 
Don't panic just yet. There is a little quirk with I/O errors and labeling tapes that you need to take a peek at. Every time you mount a blank tape to label it, the device reports an I/O error. If you repeatedly mount blank tapes to label them, your consecutive error count climbs accordingly. Bump the max consecutive errors (default is 3) to something like 30 or 40. Mount a couple of blank tapes, label them, then immediately mount a labeled tape, and see if your errors continue to increase. We have this problem when we try to label 100+ tapes at a time on several of our L700 libraries using Legato. Mounting a known tape after a few blank tapes clears up the problem right away.
 
If you check and see that the consecutive errors for a device are too high, try mounting a tape that you haven't had any problems with, and the counter will be reset to zero. Please be cautious about setting this value too high: if you have a real problem, it will take a very long time before the drive is disabled and Legato can continue (in version 5.1 we have the value set to 5 consecutive errors).
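
To make the counter behavior from the last two replies concrete, here is a toy model in shell. It is not NetWorker code. It only assumes what the posts report: each failed operation increments a per-device counter, a successful mount resets it to zero, and the device is disabled once the count exceeds the maximum. The function name and the event words are made up for the illustration.

```shell
#!/bin/sh
# Toy model of the "max consecutive errors" logic as described in this thread.
# Usage: simulate_device MAX EVENT...   where each EVENT is "error" or "ok".
# Prints the device state after each event and stops once it is disabled.
simulate_device() {
    max="$1"; shift
    count=0
    for event in "$@"; do
        case "$event" in
            ok)    count=0 ;;                 # a good mount clears the counter
            error) count=$((count + 1)) ;;    # e.g. I/O error on a blank tape
        esac
        if [ "$count" -gt "$max" ]; then
            echo "disabled (consecutive errors: $count, max: $max)"
            return 0
        fi
        echo "enabled (consecutive errors: $count)"
    done
}
```

For example, simulate_device 3 error error ok error error never disables the device, because the good mount in the middle resets the count, while simulate_device 3 error error error error disables it on the fourth error.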
 
Thanks for the info. I would appreciate any information on StorageTek 9710's. Are they prone to errors? We use DLT4's
 
ag100 is correct; I had this same problem when I started at my current job. st.conf was broken due to faulty info given by Legato tech support (we have a StorageTek 9730). If st.conf is messed up, there might be some indication in /var/adm/messages or in the messages printed during boot. Here's what part of ours looks like (most of the lines are commented out; the ones that aren't probably need a semicolon at the end for proper syntax):

#
#
tape-config-list=
#
#
"QUANTUM DLT7000", "Quantum DLT7000", "DLT7k-data";
#
#
DLT7k-data = 1,0x38,0,0x1D639,4,0x82,0x83,0x84,0x85,2;
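
For reference, the value of a data-property line follows the layout documented in the Solaris st driver man page: version, type, block size, options, number of densities, the density codes, and the index of the default density. Below is a small sketch that labels the fields of the line above. The function name is made up for the illustration, and it only handles this version 1 layout.

```shell
#!/bin/sh
# Label the fields of an st.conf data-property value (version 1 layout):
#   version, type, bsize, options, number of densities, densities..., default
describe_st_property() {
    # Strip spaces and the trailing semicolon, then split on commas.
    old_ifs="$IFS"
    IFS=','
    set -- $(printf '%s' "$1" | tr -d ' ;')
    IFS="$old_ifs"

    echo "version=$1"
    echo "type=$2"
    echo "bsize=$3"
    echo "options=$4"
    ndens="$5"
    echo "densities=$ndens"
    shift 5
    i=0
    while [ "$i" -lt "$ndens" ]; do
        echo "density[$i]=$1"
        shift
        i=$((i + 1))
    done
    # Whatever remains is the index into the density list used by default.
    echo "default_density_index=$1"
}

describe_st_property "1,0x38,0,0x1D639,4,0x82,0x83,0x84,0x85,2;"
```

A mismatch between the declared number of densities and the number of codes that follow is one of the easier mistakes to spot this way.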
 
It might help to know the type of SCSI controller you are using (is it differential, etc) and the exact make & model of the tape drive and/or library/autochanger.
Have seen this error several times.
I ran into this problem once because the drive was not terminated properly. (Wrong type terminator)
What kind of Solaris box? Ultra 5, E450, etc.
What version of Solaris? What version of Legato?
Typically your first tape drive is 0cbn; since the error shows 1cbn do you have more than one drive running on the same Solaris box? If so, are they both on the same controller?
More info might help someone to help you solve your problem.
HTH
lesb - SUN SysAdmin
 
Saw that once on a SUN quad card. Turned out to be the driver. We had to get a new driver for the quad card; the one shipped with the card was bogus.
 
On second thought, I think it was the driver for the scsi card.
 
I have seen this error a number of times, and there are 2 basic possibilities. The first occurs after a backup or clone job abends, or when a tape label is unreadable. The second is that the tape drive is not functioning, which requires shutting down the jukebox and replacing the drive.
So here is what I start with:
1. Determine whether there is a tape in the drive. Depending on the model of jukebox, you may be able to see if the drive is loaded.
2. If there isn't a tape in the drive, review the logs in /nsr to determine the tape that caused the failure. I have seen this problem caused by bad tapes more than bad drives.
3. Run nsrjb -I -E to reinventory the jukebox. The results can be viewed by running nsrjb afterward.
The listing in the jukebox will appear at the bottom.
4. Enable the drive either from the GUI or from nsradmin.
5. If the listing shows a tape in the drive, run the command
nsrjb -u -f /dev/rmt/1mbn to unload the tape. If a tape unloads, you will probably want to eject the tape into the jukebox cap to eliminate the possibility of the tape being used by another drive. Use the nsrjb -w command to move the tape into the cap.
6. Now for the dirty part. If there isn't a tape in the drive, awk through the log, keying off the drive name, and determine which tape volume was mounted when the drive reported its first I/O error (this is also in the /logs/messages file). You may have to manually place this tape into the bad drive and power cycle the jukebox to recover the drive (after shutting down Legato). This has actually recovered more drives for me than replacing the hardware.
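
The log search in step 6 can be sketched roughly as follows. The log format is an assumption: the sketch only relies on each relevant line containing the device path and the text "I/O error", so check that against your own log under /nsr before trusting the result. The function name is made up for the illustration.

```shell
#!/bin/sh
# Print the first line in LOGFILE that mentions DEVICE together with an
# I/O error. Assumes one event per line; adjust the pattern to your log.
# On Solaris use nawk or /usr/xpg4/bin/awk: the old /usr/bin/awk lacks -v.
first_io_error() {
    device="$1"
    logfile="$2"
    awk -v dev="$device" '
        index($0, dev) && tolower($0) ~ /i\/o error/ { print; exit }
    ' "$logfile"
}
```

Call it with the device path and the path to your messages log, for example first_io_error /dev/rmt/1cbn followed by the log file. The matching line should name, or sit close to, the volume that was mounted when the trouble started.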
 