Backups of cluster resources cause drives to go into DOWN mode

Status
Not open for further replies.

ulsj

Technical User
Jan 23, 2002
80
SE
Hi!

I have NBU 3.4.1 with patch 2 running on a Windows 2000 server.
My clients are Windows 2000 SP2 and run Microsoft Cluster Service. I use an ADIC Scalar 1000 with 2 AIT-2 drives.
When I try a backup of the cluster resources, the drives sometimes go into DOWN mode. When I "up" the drives again, other backups can continue.

Any hints?
 
Are the tape drives connected to the cluster? If so, how?
Are one or both of the cluster nodes media servers?
What client are you using in the class? (cluster node, virtual server)
 
Hi!

Thanks for your interest in this matter.
Yes, my drives are connected to the cluster via a Compaq MDR.
Both cluster nodes are media servers and I use the virtual server name in the class. I have also created a storage node for the virtual media server.
 
OK, more questions...
Have you configured the tape drives as Multihosted Drives? (Shared Storage Option)
Does the MDR have one or two fibre ports? If two, 1) Have you created maps? 2) How many tape drives does Device Manager see?
 
Hi!

Yes, both drives are configured as multihosted.
The MDR has only one fibre attached.
Device Manager on every media server sees the robotics and two drives.

Addition:
When I run an unscheduled backup of the cluster data, it seems to work. The problem seems to occur when backups overlap and when backups change media server.
 
So far, your configuration sounds OK. But something isn't set up correctly.

It sounds like this is your problem...
Only one media server should have robot control. The tape drives on the others should be set to "robot is handled by a remote host". I always configure the master to control the robot.

If not, let's dig some more. Is one of the cluster nodes the master server? Can the master server see the tape drives? How many media servers do you have? Did you configure NBU on the cluster as a failover server?
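
The rule above (exactly one robot control host, every other media server pointing at it) can be sketched as a quick sanity check. This is only an illustration in Python: the `MediaServer` record and its fields are hypothetical and are filled in by hand from what each server's device configuration shows. Nothing here calls NetBackup itself.

```python
from dataclasses import dataclass


@dataclass
class MediaServer:
    # Hypothetical record, typed in by hand from each server's device config.
    name: str
    controls_robot: bool  # True if robot control is local to this host
    robot_host: str       # remote robot control host ("" if control is local)


def check_robot_control(servers):
    """Return a list of problems; an empty list means the layout follows the rule."""
    problems = []
    controllers = [s.name for s in servers if s.controls_robot]
    if len(controllers) != 1:
        problems.append(
            f"expected exactly one robot control host, found {len(controllers)}: {controllers}"
        )
        return problems
    control_host = controllers[0]
    for s in servers:
        # Every non-controlling server must name the single control host.
        if not s.controls_robot and s.robot_host != control_host:
            problems.append(
                f"{s.name} points at '{s.robot_host}' instead of '{control_host}'"
            )
    return problems
```

For example, a master that controls the robot plus two cluster nodes that both name the master as remote host returns an empty list. Two hosts that both claim robot control return one problem.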
 
It sounds like I have the same problem. We even have COMPAQ and Veritas looking into it and they each blame the other. Here's my setup and what's happening.

1-Master
5- NT media servers
2- 2000 media servers
20 or so clients

All COMPAQs, 2 ML570(NT media servers, one a PDC), 8 DL380 G1(1 is master, 1 is media), 5 DL380 G2, 2 ML370 (1 media, SMS), 7000(media), 2 5500(1 media), some 2500 and 1600s (1 media).

The master and media are HBA(64bit, 33MHz) fibre connected to a 2 port MDR which connects to 2 SSL2020 with 2 AIT drives each.

The master has no problem backing up the clients or itself. The media servers on the other hand are always bringing DOWN the drives. The HBA cards are to firmware revision 3.82a1 and the tape drives and MDR are at the latest. The drivers on most are to 4.53a7 for the HBA and the tape drives are VERITAS 4.0.1381.1.

Not all are up to the latest because we were testing after each server and we got to one of the ML570s and it started DOWNING the drives again. Before we got to that one it seemed to be working after loading the right drivers.

The errors we get in Event Viewer are as follows:

TL8(0) [2792] Unload drive 2 (device 1) failed in io_open: The media in the drive may have changed.

TL8(0) drive 2 (device 1) is being DOWNED, status: Unable to SCSI unload drive

Check integrity of the drive, drive path, and media

One of my test runs came back with the same messages, except that the first one read: I/O bus has been reset

Yesterday in testing we found that if two media servers tried to go at the same time the one that started second would down a drive or 2.

We have already turned off Removable Storage service. Might be stuff I'm missing but that's a start. I hope some of this helps someone to start looking in the right place. I have been working on it for almost 2 months.
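
To see which drives get DOWNED most often, the messages quoted above can be counted once they are exported to plain text. This is a minimal Python sketch. It assumes the messages are saved one per line in exactly the format shown in this post, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   TL8(0) drive 2 (device 1) is being DOWNED, status: Unable to SCSI unload drive
DOWNED_RE = re.compile(
    r"^(?P<robot>\w+\(\d+\))\s+drive\s+(?P<drive>\d+)\s+\(device\s+(?P<device>\d+)\)"
    r"\s+is being DOWNED,\s+status:\s*(?P<status>.+)$"
)


def parse_downed(lines):
    """Return one dict per 'is being DOWNED' message; other lines are skipped."""
    events = []
    for line in lines:
        m = DOWNED_RE.match(line.strip())
        if m:
            events.append(
                {
                    "robot": m.group("robot"),
                    "drive": int(m.group("drive")),
                    "device": int(m.group("device")),
                    "status": m.group("status").strip(),
                }
            )
    return events


def count_by_drive(events):
    """Count DOWNED events per (robot, drive) pair."""
    return Counter((e["robot"], e["drive"]) for e in events)
```

Running this on the exported log from each media server and comparing the counts should show whether the same drive goes down every time or whether it follows whichever server started second.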
 
lfloden, your description does sound the same. Since you have an MDR with two fibre ports, you could have tape drive ghosting. How many tape drives do you see in Windows Device Manager? Sounds like you should see four.

It also sounds like Shared Storage Option isn't configured correctly. I'll ask the same question as above: are your media server tape drives set as multihosted, and are they set to "robot is handled by a remote host"?
 
They are multihosted and the robot is controlled by the master, all of the media are set up that way. Each server only sees 4 in device manager.

But when we ran the Veritas driver upgrade utility on the 2000 servers, it did show 8 drives: 4 with the new firmware and 4 with the old. I'll look into that. The NT ones didn't see the extra drives.
 
Hi!

To dankellen:
I have set up the config as you describe.
The master server is the control host, and so on...

FYI, both Veritas Sweden and UK support are working on this right now. I have very few error messages in the logs, and no one seems to know what the one error I did get means. A question about this error has gone out to US support. I will try to keep you up to date.
 
Veritas claims to have a patch for it. Patch 3 was released last night and I have been told it is supposed to fix the problem.

I will be giving it a try today.

Lars
 
Hi!

This problem finally got a resolution two weeks ago.
We changed the Compaq MDR to an ADIC MDR.
I've been told that the Scalar 1000 is not supported by the Compaq (CPQ) MDR.

/Ulf
 