
NetBackup failing on a few clients only. Please help ASAP

Status
Not open for further replies.

kris1681

IS-IT--Management
Nov 29, 2001
US
Hello,

My company just bought Veritas NetBackup and an HP tape library. This is my first experience with this software. After a month-long fight I got six clients going. After that I added another 27 clients. Now I am getting all sorts of errors, and 5 clients always fail. PLEASE HELP, as the production servers are not being backed up right now. This is the sequence of events and the steps taken so far.

I received the unit assembled and running from my boss. I hooked the unit up in my cube for testing. I had to rebuild the server so that it was clean and dedicated to the backups. I then reinstalled the NetBackup software.

I set up a class with about 3 clients in it and set up a production volume group called prod. After running my first test I noticed that it would not actually back up data. It would reach a point just before writing the data and stop. I would then get an 83 or 84 error.

This was corrected by upgrading the firmware on the drives as well as on the robotic unit itself. This process did not go 100% smoothly: we received an error while we were attempting the upgrade, but the firmware now reports the correct versions.

After the upgrade we ran another test, and it would actually begin to write data. I could then run a test on 6 clients. Before I did this, however, I wanted to upgrade the 100 Mbit NIC to a 1 gig NIC. After adding the driver for the gig NIC I did a reboot. The server could no longer see its own hard drives. It seemed to be a controller issue. I pulled all the PCI cards in case a conflict had arisen when I added the gig card, but with no luck.

Assuming I had lost the controller on that server, I found another server to use. I put the drives from the old server into the new one and rebuilt the controller's data from the drives. This allowed the server to come back up. I then ran a test backup of the 6 clients with the gig card plugged into a 100 Mbit port.

I had network timeouts (error 41) on some clients. I stopped that job and hooked the server to a 1 gig port so I could get a full 1 gig connection with the gig card. I reran the job and all 6 backed up without any timeouts.

I then added the rest of the clients, approx. 33. Since then, every time I run a backup I get 41 or 11 errors on some machines. Sometimes they will requeue and actually work on the second or third try, while there are five that consistently fail.

I also forgot to mention: after adding the 33 clients to the class and receiving the network timeouts and write errors, I tried different settings on the gig card, a new 100 Mbit card, and finally what I am using now, which is an internal 1 gig NIC.

Then I removed the five clients which always fail and ran the backup. This time two different machines failed, and most of the others backed up only on the second attempt. Last night I once again tried the test without the five machines.

This time I had only one failure. It was an 11 error, or "system call failed", and it was on a completely different server from the test the other day. The other day's test had failures on cagasefs1dgln01 and ase_dgil_db2. Last night those both backed up just fine, but aim_dbsrv failed with the 11 error.

Please help, as I have tried everything I could with my knowledge.

So, the bottom line: with the 33 clients in the backup, 5 always failed, and the rest got backed up only on the second or third attempt. After I removed the 5 machines that always failed and ran the backup, two others failed. I retried the backup and one failed.


PLEASE HELP ASAP. I APPRECIATE ALL YOUR HELP.

Thanking You,
Krishna Potluri

 
Hi there

This could be the client machine itself. Sometimes the NetBackup server has a problem connecting to it via the network or something similar. What error code are you seeing? Veritas has its own way of interpreting each error and recommends a fix for it. Here is a simple test: if you are running Solaris or HP-UX, just use this command to test the port connectivity:

<server> telnet remote_host bpcd

If telnet connects, communication with the client is working, and the media and volume group assigned to that particular class are probably set up incorrectly. If it does not connect, the communication between the server and the client is the problem.
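If telnet is not handy, or you want to check all the clients in one pass, the same port test can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: it assumes bpcd listens on its usual default TCP port 13782 (check the bpcd entry in /etc/services on your server), and the client names in the list are simply the ones mentioned in this thread, used as placeholders. The function name can_connect is made up for this example and is not part of NetBackup.

```python
import socket

# Assumption: bpcd is on its usual default port; confirm against the
# "bpcd" entry in /etc/services on the master server.
BPCD_PORT = 13782


def can_connect(host, port=BPCD_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    This is the same thing "telnet host bpcd" checks: whether the
    port is reachable at all. It says nothing about whether the
    NetBackup client software behind it is configured correctly.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


if __name__ == "__main__":
    # Placeholder client names taken from this thread; replace with your own.
    clients = ["cagasefs1dgln01", "ase_dgil_db2", "aim_dbsrv"]
    for name in clients:
        state = "reachable" if can_connect(name) else "NOT reachable"
        print(f"{name}: bpcd port {state}")
```

A client that shows as not reachable here points at the network or the client itself rather than at the media or volume group setup. Running it a few times may also help with the intermittent 41 errors, since a port that is only sometimes reachable suggests a network problem.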



Loc
 