When tape is marked as full - block size is 32768 not 262144

Status
Not open for further replies.

PeteRock (Technical User)
Aug 21, 2002
08/20/02 03:10:47 nsrd: media notice: 9840 tape CW1024 on /dev/rmt/0cbn is full
08/20/02 03:10:47 nsrd: media notice: 9840 tape CW1024 used 21 GB of 20 GB capacity
08/20/02 03:11:01 nsrd: media notice: Volume "CW1024" on device "/dev/rmt/0cbn": configuration. Tape positioning by record is disabled.
08/20/02 03:12:53 nsrd: media warning: /dev/rmt/0cbn reading: fsr 57 read: I/O error Block size is 32768 bytes not 262144 bytes. Verify the device
08/20/02 03:12:54 nsrd: media emergency: could not position CW1024 to file 34, record 58
08/20/02 03:15:00 nsrd: media warning: /dev/rmt/0cbn reading: fsr 57 read: I/O error
08/20/02 03:15:00 nsrd: media emergency: could not position CW1024 to file 34, record 58
08/20/02 03:17:06 nsrd: media warning: /dev/rmt/0cbn reading: fsr 57 read: I/O error
08/20/02 03:17:06 nsrd: media emergency: could not position CW1024 to file 34, record 58
08/20/02 03:17:06 nsrd: media warning: /dev/rmt/0cbn moving: fsf 34: I/O error
08/20/02 03:19:07 nsrd: media warning: /dev/rmt/0cbn reading: fsr 57 read: I/O error
08/20/02 03:19:07 nsrd: media emergency: could not position CW1024 to file 34, record 58
08/20/02 03:19:07 nsrd: media warning: verification of volume "CW1024", volid 1633454593 failed, can not read record 58 of file 34 on 9840 tape CW1024
08/20/02 03:19:07 nsrd: media notice: verification of volume "CW1024", volid 1633454593 failed, volume is being marked as full.
08/20/02 03:19:07 nsrd: media notice: Save set (1636895489) s815038:C:\ volume CW1024 on /dev/rmt/0cbn is being terminated because: Media verification failed
08/20/02 03:19:07 nsrd: s815038:C:\ done saving to pool 'CMIG' (CW1024) 300 MB
08/20/02 03:19:07 nsrd: write completion notice: Writing to volume CW1024 complete
 
What type of tape drive? I've been having some similar issues with an HP Ultrium and am still playing with the st.conf file settings to see if I can get this cleaned up, since a quick Google search led me to check on this.
 
Does anybody have a solution for this problem? I am having the same problem.
 
It could be a 'simple' media problem. NW marks the media as full once it has reached the maximum retry count (20 by default).
 
But my media had used only 5727 MB of its 100 GB capacity when it was marked full.
I tried a few media, and all have the same problem.
Any ideas? Thanks.
 
Of course, from a distance I cannot determine whether this is a drive or a media problem. Make sure that the st.conf entries for the drives are correct:

"STK 9840", "STK 9840 1/2\" LINEAR 20G", "STK_9840",

STK_9840 = 1,0x24,0,0x1de39, 1, 0x00,0;
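
For anyone who wants to double-check an entry like the one above, here is a minimal Python sketch that splits the drive data line into named fields. The field order (version, type, bsize, options, number of densities, density codes, default density) is my reading of the Solaris st man page and should be treated as an assumption to verify there; the function name `parse_st_conf_entry` is made up for this example. The point of interest for this thread is the third field: a bsize of 0 is normally taken to mean variable block size.

```python
def parse_st_conf_entry(line: str) -> dict:
    """Split one st.conf drive data line into named fields.

    Assumed layout (check the st man page for your Solaris release):
    name = version, type, bsize, options, n_densities, densities..., default
    """
    name, _, rhs = line.partition("=")
    # int(x, 0) accepts both decimal ("1") and hex ("0x24") notation.
    fields = [int(f.strip(), 0) for f in rhs.strip().rstrip(";").split(",")]
    version, drive_type, bsize, options, n_dens = fields[:5]
    return {
        "name": name.strip(),
        "version": version,
        "type": drive_type,
        "bsize": bsize,                      # 0 is assumed to mean variable
        "options": options,
        "densities": fields[5:5 + n_dens],
        "default_density": fields[5 + n_dens],
    }


entry = parse_st_conf_entry("STK_9840 = 1,0x24,0,0x1de39, 1, 0x00,0;")
print(entry["name"], "bsize:", entry["bsize"], "options:", hex(entry["options"]))
```

This only checks that the line is well formed; whether the option bits are right for a 9840 still has to come from the drive vendor.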


However, your problem looks as if the drive detected the wrong block size, which should not change once the data area has been written. The cause could also be the tape drive firmware. NW uses 32 KB blocks only for the label, yet such blocks appear in the data area here. Could it be that the media has already been used by another application?

May I suggest that you erase the media and try using it again?
To make sure that it is not a NW problem, you could also use other (OS-native) tools to check whether your configuration works properly.
 
WINDOWS-Blocksize-HOWTO

Sample:
C:\>nsrjb -lnv -f \\.\Tape0 -S Slot_Number
C:\> scanner -vvv \\.\Tape0
scanner: Opened \\.\Tape0 for read
scanner: Rewinding...
scanner: Rewinding done
scanner: Reading the label...
scanner: Reading the label done
scanner: scanning dlt7000 tape Full.001 on \\.\Tape0
scanner: volume id 128186625 record size 65536
created 11/30/01 10:21:29 expires 11/30/03 10:21:29
...

CHANGING THE BLOCK SIZE:

1. Run REGEDIT (the Registry Editor)
2. For non-Ultra2 host adapters, open:
\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\aic78xx


Note: You can also get the driver information in Control Panel.

3. Create a key named Parameters (if it doesn't exist) by selecting from the menu:
Edit > New > Key and entering Parameters for Key Name.

4. Open the Parameters key and create a key named Device by selecting from the menu:
Edit > New > Key and entering Device for Key Name.

5. Open the Device key and create a DWORD named MaximumSGList by selecting from the menu:
Edit > New > DWORD and entering MaximumSGList for DWORD Name.

6. Open the MaximumSGList Value Name and replace the existing Value Data (in Decimal) with one of the values from Table 6.

Table 6. Block Sizes
Block Size    Decimal Value    Hex Value
64k           17               11
96k           25               19
128k          33               21
256k          65               41
512k          129              81
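
The values in Table 6 follow a simple pattern: each one is the block size divided by 4 KB, plus one. A minimal Python sketch of that arithmetic is below. The 4 KB page size is an assumption (it fits every row of the table), and the function names are made up for this example.

```python
PAGE_SIZE_KB = 4  # assumption: 4 KB memory pages, which matches Table 6


def sg_list_for_block_size(block_size_kb: int) -> int:
    """Return the MaximumSGList value needed for a block size given in KB."""
    if block_size_kb % PAGE_SIZE_KB:
        raise ValueError("block size must be a multiple of the page size")
    return block_size_kb // PAGE_SIZE_KB + 1


def max_block_size_kb(maximum_sg_list: int) -> int:
    """Return the largest block size in KB allowed by a MaximumSGList value."""
    return (maximum_sg_list - 1) * PAGE_SIZE_KB


for kb in (64, 96, 128, 256, 512):
    value = sg_list_for_block_size(kb)
    print(f"{kb}k -> {value} decimal, {value:x} hex")
```

Running it reproduces the five rows of the table, which is a quick way to work out a value for a block size that is not listed.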

7. Specify the write block size:
Set the environment variable for the default block size used by NetWorker to a value between the default and the maximum block size (see Sample 8.2 below).
For a list of block sizes for different devices, call Technical Support or go to legato10382
To do this, open "System" from "Control Panel" and add the environment variable in the "System Variables" section under the "Advanced" tab.
NetWorker will use the value of this environment variable when writing to tapes.

Sample 7.1
NSR_DEV_BLOCK_SIZE_DLT7000=64
In this case the tapes will be written with 64 KB blocks, and the system can read any tape written with these block sizes (64, 96 and 128 KB).

8. After rebooting the machine, run the inquire command to locate your drives, and run the mt command to check whether the maximum block size has changed.

Note: In Sample 8.2 the maximum block size has been changed to 128 KB. This is worked out by dividing 131072 by 1024 (likewise, 65536/1024 = 64 KB).

Sample 8.1
in this example the key is:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\aic78xx\Parameters\Device
\_ MaximumSGList = 33 decimal

C:\>inquire
scsidev@0.5.0:NEC CD-ROM DRIVE:4661.06|CD-ROM
scsidev@0.6.0:ARCHIVE Python 06408-XXX8071|Tape, \\.\Tape0
scsidev@1.0.0:HP C5173-7000 3.02|Autochanger (Jukebox)
scsidev@1.2.0:QUANTUM DLT7000 1732|Tape, \\.\Tape1
scsidev@1.3.0:QUANTUM DLT7000 1732|Tape, \\.\Tape2
scsidev@2.0.0:SEAGATE ST39204LC 0002|Disk, \\.\PHYSICALDRIVE0
scsidev@2.1.0:QUANTUM ATLAS V 9 SCA 0201|Disk, \\.\PHYSICALDRIVE1
scsidev@2.6.0:DELL 1x4 U2W SCSI BP 5.35|Processor

Sample 8.2
C:\>mt -f \\.\Tape1 status
\\.\Tape1:
Media Capacity = 15.20GByte
Media Remaining = 13.60GByte
Media Blocksize = 0
Media Partition Count = 0
Media is not write protected
default blocksize = 65536
maximum blocksize = 131072
minimum blocksize = 1
MaximumPartitionCount = 0
Partition = 0
Logical block position = 44815
EOTWarningZoneSize = 0
CompressionEnabled
Features: ...
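
As a rough illustration of the check in steps 7 and 8, the Python sketch below pulls the numeric values out of mt status output like Sample 8.2 and tests whether a planned block size lies between the default and the maximum. It assumes the output has the "name = number" layout shown in the sample; the function names are made up for this example.

```python
import re


def parse_mt_status(text: str) -> dict:
    """Collect the 'name = integer' lines of mt status output into a dict."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*=\s*(\d+)\s*$", line)
        if match:  # lines such as 'Media Capacity = 15.20GByte' are skipped
            values[match.group(1).lower()] = int(match.group(2))
    return values


def block_size_ok(status: dict, wanted_kb: int) -> bool:
    """True if wanted_kb lies between the default and maximum block size."""
    wanted = wanted_kb * 1024
    return status["default blocksize"] <= wanted <= status["maximum blocksize"]


SAMPLE = """\
Media Capacity = 15.20GByte
Media Blocksize = 0
default blocksize = 65536
maximum blocksize = 131072
minimum blocksize = 1
"""

status = parse_mt_status(SAMPLE)
print(status["maximum blocksize"] // 1024, "KB maximum")
print("128 KB usable:", block_size_ok(status, 128))
print("256 KB usable:", block_size_ok(status, 256))
```

With the values from Sample 8.2, 128 KB passes and 256 KB does not, so a 256 KB setting would first need a larger MaximumSGList.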
 
I do not think that this is a problem of SETTING the block size. That is done indirectly by selecting the device, and it seems to run fine for a while.

The error reports that a different block size has been found. This can only happen during a read, not during a write. In principle you could set another block size, but then NW would report the same problem from the very beginning.
 
PeteRock:

The log shows that the tape reached physical end. 9840 tapes have 20GB native capacity.

When NetWorker gets a signal that the physical end is reached, it attempts to rewind the tape a couple of records, then tries to verify the last bits of data written to tape before ejecting and then loading another tape to continue backing up.

(Actually, NetWorker doesn't know when the physical end has been reached. What happens is that it gets a signal from the device driver saying that there was an I/O error writing to the media. NetWorker then assumes that this is the physical end, even if it really isn't. This prevents NetWorker from continuing to write data to possibly bad media.)


The message:

nsrd: media warning: /dev/rmt/0cbn reading: fsr 57 read: I/O error Block size is 32768 bytes not 262144 bytes.

by itself is not necessarily bad. At end of tape, it can occur because the last bit of data that NetWorker wrote (before it hit the physical end) is smaller than the block size being used (in this case 256K for 9840 drives).

However the subsequent messages show that there was difficulty in that the tape drive could not position the tape as needed prior to data verification:

nsrd: media emergency: could not position CW1024 to file 34, record 58
nsrd: media warning: /dev/rmt/0cbn reading: fsr 57 read: I/O error
nsrd: media emergency: could not position CW1024 to file 34, record 58

This could be the result of a driver issue or a tape drive configuration issue.
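
If you want to see how often this happens across a large daemon log, a small Python sketch like the one below can pull out every block size mismatch. It assumes the message wording shown in this thread ("Block size is N bytes not M bytes"); the function name and the shortened sample log are made up for this example.

```python
import re

# Wording taken from the nsrd messages quoted in this thread.
MISMATCH = re.compile(r"Block size is (\d+) bytes not (\d+) bytes")


def find_mismatches(log_text: str):
    """Yield (found_kb, expected_kb) for every mismatch message in the log."""
    for line in log_text.splitlines():
        match = MISMATCH.search(line)
        if match:
            found, expected = (int(group) for group in match.groups())
            yield found // 1024, expected // 1024


# Shortened sample lines, for illustration only.
LOG = """\
nsrd: media warning: reading: I/O error Block size is 32768 bytes not 262144 bytes.
nsrd: media emergency: could not position CW1024 to file 34, record 58
"""

for found_kb, expected_kb in find_mismatches(LOG):
    print(f"found {found_kb} KB blocks, expected {expected_kb} KB")
```

A single hit right after an "is full" notice fits the end-of-tape case described above. Hits on tapes that are nowhere near full point more towards the drive, driver or configuration.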


I suggest:

- make sure that the drive has up-to-date firmware
- If you are using Solaris, update Solaris with the latest st driver patch
- If you are using Solaris, modify your /kernel/drv/st.conf with the correct settings for a 9840 drive (such as 605 has stated earlier). The best place to get the correct settings is the drive manufacturer (in this case StorageTek), but I'm sure you can also find them if you search the net or even Tek-Tips.

There are some tape drives that cannot position as described above. In that case, there is a NetWorker flag that you can use to turn off media verification. Turning off media verification is not usually a problem. I don't remember the flag, but if you want to try it, let me know and I'll try to remember what it is and post it here.
 
"auto media verify" is a parameter for the pool, not for the device.
 
Auto media verify is not what I was referring to.

Auto Media verify is a pool resource setting that tells NetWorker to periodically verify the data being written to tape while backups are being written.

Auto media verify has nothing to do with the tape verification when NetWorker thinks it's at end of physical tape. These are two separate events and processes.

If you look at the daemon log for events where the media reaches end of tape, you will see that it will try to rewind, and then verify before ejecting the tape... unless you deliberately turn this off.
 
I don't think so. Please read the following info from the nsr_pool manpages:

auto media verify (read/write, yes/no, choice)

If set to yes, NetWorker verifies data written to volumes from this pool. Data is verified by repositioning the volume to read a portion of the data previously written to the media and comparing the data read to the original data written. If the data read matches the data written, verification succeeds; otherwise it fails. Media is verified whenever a volume becomes full while saving and it is necessary to continue onto another volume, or when a volume goes idle because all save sets being written to the volume are complete. When a volume fails verification, it is marked full so NetWorker will not select the volume for future saves. The volume remains full until it is recycled or a user marks it not full. If a volume fails verification while attempting to switch volumes, all save sets writing to the volume are terminated.

 
yes? your point? what have i said that is not reiterated in your quotation from the man pages?


Perhaps the man page was written poorly. The section where it starts with:
"Media is verified whenever a volume becomes full
while saving and it is necessary to continue onto
another volume"
should have started on a separate paragraph.

If you need convincing, try it out. I have.
 
Another issue which I have totally forgotten so far:
If the drive is dynamically shared with a Windows device in a SAN, it is most likely that different block sizes are used under Windows. As this has not been mentioned so far, I do not know whether it applies.


wallace88: You said that you are NOT referring to the "auto media verify" parameter. So I wonder what you meant, because what you describe refers exactly to it, as you stated as well.
 
I have the same problem once in a while on Solaris 9, Networker 6.1.3. SDLT 220 tape drives in a STORTEK L40 library.

It seems to happen when we have media or SCSI errors in the OS log. Then Networker loops......

11/28/03 09:11:07 nsrd: media event cleared: Waiting for 1 writable volumes to backup pool 'GIS Default Clone' tape(s) on chi-backup
11/28/03 09:12:39 nsrd: chi-backup:cloning session saving to pool 'GIS Default Clone' (100177)
11/28/03 09:12:39 nsrd: cloning session:1 of 205 save set(s) reading from 100158 884 KB of 79 GB
11/28/03 09:20:53 nsrd: media notice: sdlt tape 100177 on /dev/rmt/0cbn is full
11/28/03 09:20:53 nsrd: media notice: sdlt tape 100177 used 61 GB of 100 GB capacity
11/28/03 09:21:08 nsrd: media notice: Volume "100177" on device "/dev/rmt/0cbn": Block size is 32768 bytes not 131072 bytes. Verify the device configuration. Tape positioning by record is disabled.
11/28/03 09:25:58 nsrd: media warning: /dev/rmt/0cbn reading: fsr 10706 read: I/O error
11/28/03 09:25:58 nsrd: media emergency: could not position 100177 to file 49, record 10708
11/28/03 09:25:58 nsrd: media warning: /dev/rmt/0cbn reading: I/O error
11/28/03 09:25:58 nsrd: media warning: /dev/rmt/0cbn reading: I/O error
11/28/03 09:25:58 nsrd: media warning: /dev/rmt/0cbn reading: I/O error
11/28/03 09:25:59 nsrd: media warning: /dev/rmt/0cbn reading: I/O error
11/28/03 09:25:59 nsrd: media warning: /dev/rmt/0cbn reading: I/O error
11/28/03 09:25:59 nsrd: media warning: /dev/rmt/0cbn reading: I/O error
11/28/03 09:26:01 nsrd: media warning: verification of volume "100177", volid 3319816193 failed, can not read record 10708 of file 49 on sdlt tape 100177
11/28/03 09:26:01 nsrd: media notice: verification of volume "100177", volid 3319816193 failed, volume is being marked as full.
11/28/03 09:26:01 nsrd: chi-backup:cloning session done saving to pool 'GIS Default Clone' (100177)
11/28/03 09:26:01 nsrd: media emergency: Unable to receive
11/28/03 09:26:01 nsrd: media notice: Cloning of save sets to volume 100177 on /dev/rmt/0cbn is being terminated because: Media verification failed
11/28/03 09:26:01 nsrd: write completion notice: Writing to volume 100177 complete
11/28/03 09:26:02 nsrd: media emergency: Unable to receive
11/28/03 09:26:07 nsrd: media emergency: Unable to send
11/28/03 09:26:07 nsrd: media emergency: Unable to send
11/28/03 09:26:22 nsrd: media emergency: Unable to send
11/28/03 09:26:22 nsrd: media emergency: Unable to send
11/28/03 09:26:48 nsrd: media emergency: Unable to send
11/28/03 09:26:48 nsrd: media emergency: Unable to send
11/28/03 09:27:23 nsrd: media emergency: Unable to send
 
I have the same problem.
Solaris 9 Server configured as Storage Node, Networker 6.2 W2K Server, SDLT 220 tape drives in a Quantum P4000 library.

/var/log/messages:
Jan 20 09:59:11 sun4 Error for Command: read Error Level: Fatal
Jan 20 09:59:11 sun4 scsi: [ID 107833 kern.notice] Requested Block: 0 Error Block: 0
Jan 20 09:59:11 sun4 scsi: [ID 107833 kern.notice] Vendor: QUANTUM Serial Number: . G
Jan 20 09:59:11 sun4 scsi: [ID 107833 kern.notice] Sense Key: Aborted Command
Jan 20 09:59:11 sun4 scsi: [ID 107833 kern.notice] ASC: 0x48 (initiator detected error message received), ASCQ: 0x0, FRU: 0x0
Jan 20 09:59:11 sun4 scsi: [ID 107833 kern.notice] Incorrect Length Indicator Set

The NW Daemon.log looks like the one from eskolnik.

What's going wrong?

Is it the tape itself?
Two of three tapes produce failures/warnings; one is working fine.

 
Do you share the devices dynamically between UNIX and Windows?
If so, you MUST ensure that both devices use the same block size.
If not, data written with a larger block size (as usually used on UNIX) cannot be read by a
device that has a smaller block size configured or is limited to one.
 
Yes, the drives are dynamically shared between Unix and Windows.
But on the W2K Server I set the env. var. NSR_DEV_BLOCK_SIZE=128, and in the registry I added MaximunSGList=41(h).

Everything worked very well for a long time, but since last Saturday I have had trouble. The W2K Server logs this:

media warning: \\.\Tape0 writing: unknown error 1117 (0x45d), at file 2 record 82
01/20/04 11:15:21 nsrd: media notice: sdlt tape AJF834S on \\.\Tape0 is full
01/20/04 11:15:21 nsrd: media notice: sdlt tape AJF834S used 10 MB of 101 GB capacity
01/20/04 11:15:36 nsrd: media notice: Volume "AJF834S" on device "\\.\Tape0": Block size is 32768 bytes not 131072 bytes. Verify the device configuration. Tape positioning by record is disabled.

net helpmsg 1117:
The request could not be performed because of an I/O device error.

 
With me, I am NOT sharing drives; it's all UNIX: Solaris 9, Networker 6.1.3.

Ed

Ed Skolnik
 
I had this kind of error with a Benchmark VS640 autochanger. I replaced it with an identical unit (now branded Quantum) and the errors went away.
 