Jobs continuing to run 2

Status
Not open for further replies.

andrewfr

MIS
May 22, 2003
66
0
0
GB
We have NetBackup 4.5 controlling a L20 robot via a Win2000 server. It backs up various servers most of which are Novell. This setup has been running for nearly two years.

I have just come back from holiday to find that the backups haven't been running at all. I have investigated and the cause of the problem is that two jobs (one in each drive) are still running in the morning which means the backup window for the other jobs closes.

The jobs have to be cancelled as the minutes to completion just keeps going up.

Can anyone help please?
 
Well, it seems you need to do the basic troubleshooting first.

Do the jobs have tapes loaded in the drives?
If so, are the jobs actually writing data?
Check the Device Monitor: are your drives up?
Is the robot functioning and running?
Does your /usr/openv file system have free space left?


This is what I would check first; it should take no more than 5-10 minutes.

If all of the above checks out, you'll need to review the
syslog and the other NetBackup logs (bprd, bptm, bpbkar, etc.), starting at the time the first job kicked off.

Also take a look at the "All Log Entries" report for any clues.
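To narrow a large log down to the period after the first job started, something like the following Python sketch could help. It is not a NetBackup tool. It assumes each log line begins with an HH:MM:SS timestamp, and the sample lines and the name lines_since are made up for illustration, so adjust the pattern to whatever your logs actually look like.

```python
import re
from datetime import time

# Assumption: every log line of interest starts with "HH:MM:SS".
STAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2})")

def lines_since(lines, start):
    """Return the lines whose leading timestamp is at or after `start`."""
    kept = []
    for line in lines:
        match = STAMP.match(line)
        if match is None:
            continue  # no leading timestamp, skip the line
        hour, minute, second = (int(g) for g in match.groups())
        try:
            stamp = time(hour, minute, second)
        except ValueError:
            continue  # digits that do not form a valid time of day
        if stamp >= start:
            kept.append(line)
    return kept

# Hypothetical sample lines, not real NetBackup output.
sample = [
    "17:59:58.120 [412] <2> bptm: drive ready",
    "18:00:03.455 [412] <2> bptm: job started",
    "18:04:41.002 [412] <16> bptm: write slowed",
]
recent = lines_since(sample, time(18, 0, 0))
for line in recent:
    print(line)
```

This only compares time of day, so a job that runs past midnight would need the date taken into account as well.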

Has to be something obvious and simple...

FYI, if you let the /usr/openv file system get 100% full, the NetBackup database won't be able to write any more due to lack of space. NetBackup will stop working but not shut down, and you also run the risk of database corruption. That said, I've had /usr/openv fill up before I knew about this, and we were able to bring NetBackup back up just fine.
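A free-space check like this is easy to script. Below is a minimal Python sketch using the standard library's shutil.disk_usage. The function names and the 10% threshold are illustrative assumptions, and the path is a parameter because the install location differs by platform (/usr/openv applies to Unix installs).

```python
import shutil

def percent_free(path):
    """Return the percentage of free space on the file system holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total

def is_low(path, threshold_pct=10.0):
    """True when free space is below the (arbitrary) threshold."""
    return percent_free(path) < threshold_pct

# "." is used so the sketch runs anywhere; point it at the
# NetBackup install directory on the backup server instead.
free_now = percent_free(".")
print(f"{free_now:.1f}% free, low={is_low('.')}")
```

Run from a scheduled task, a check like this could warn before the file system fills rather than after.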

Good luck...

Ryan
 
Hi Ryan

Yes to all your questions apart from the /usr/openv file system. What and where is this? Also, how do I clear it?

If I could just explain. The server and robot are on a remote site and I administer using the GUI on a workstation several miles away.

Thanks in advance for any assistance you may be able to give me.

Regards

Andrew.
 
This is a problem that some people have seen with Windows-based server backups. There is no real cure apart from rebooting the servers and hoping it does not happen again. It seems to be a weakness of Windows and Veritas working together. We can get the same thing on Unix, but it's easy to fix.
 
Thanks for the replies. I have restarted the server several times now. One out of the five jobs runs OK but the others are still running in the morning. As I said before, it has been working fine for nearly two years, so why should there be problems all of a sudden?

I have even tried a fresh cable into the switch. At the weekend one of the jobs did eventually finish after 15 hours!!! This job used to take about two hours. What's going on here???
 
If other jobs are working I would rule out the /usr/openv file system space running out.

Did the servers recently increase in size?

Try re-installing the software. Windows and Veritas sometimes part ways, so to speak...

Ryan


 
Hi Ryan

No, they are exactly the same. One day they were working; the next, only one completes and the others fail, one with "System Call Failed (11)" while the rest just run for too long.

Arrrggghh!!!

I will try a reinstall of the software. Will I have to back up all the files it has created or do I only have to worry about the catalog?

Andrew.
 
For the one failing with 11 system call failed...

Is OTM or VSP enabled on this box?
If it's not, enable it
If it is already enabled make sure it is "set to retry"

Ryan
 
It was enabled, but yesterday I disabled it to see if that made any difference; it didn't. I have just re-enabled it with the "set to retry" option. Not holding my breath though, as nothing seems to work. :-(
 
Right. Redid the backup jobs from scratch. Still the same. [sad]
Reinstalled the server software. No luck there either. [sad] I installed over the existing installation - Repair installation - rather than uninstalling.

I'm at a loss as to what to do next. What files/logs get written to the backup server in a Windows installation? I'm thinking that something has reached a maximum capacity and is slowing the jobs right down. As an example, the jobs did complete over the weekend because they had enough time. One job backed up 33 GB and it took 48 hours at 182 KB/sec.
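As a rough sanity check on those figures, the average rate can be worked out from the size and the elapsed time. This is only a Python sketch: the function name is made up, and whether the GUI counts a gigabyte as 1024 x 1024 KB or as 1,000,000 KB is an assumption, so both are shown.

```python
def throughput_kb_per_sec(gigabytes, hours, kb_per_gb=1024 * 1024):
    """Average transfer rate in KB/sec for a job of the given size and duration."""
    return gigabytes * kb_per_gb / (hours * 3600)

binary_rate = throughput_kb_per_sec(33, 48)              # 1 GB = 1024 * 1024 KB
decimal_rate = throughput_kb_per_sec(33, 48, 1_000_000)  # 1 GB = 1,000,000 KB
print(round(binary_rate), round(decimal_rate))  # prints: 200 191
```

Both results land in the same range as the reported 182 KB/sec, which is what you would expect given that 33 GB and 48 hours are rounded figures.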

Any ideas gratefully received. :-D
 
There's a tool the Veritas techs can let you run to determine where your bottlenecks are. Seems to me that at 182 KB/sec, you might have those NICs on the client side set to 10 Mbps.
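To put that rate in context, here is a small Python sketch comparing it with the raw line rate of a 10 Mbps and a 100 Mbps link. It is simple arithmetic, not a measurement: protocol overhead is ignored, 1 KB is taken as 1000 bytes, and the function name is made up for illustration.

```python
def link_capacity_kb_per_sec(megabits_per_sec):
    """Raw line rate in KB/sec (1 Mbps = 1,000,000 bits/s, 1 KB = 1000 bytes)."""
    return megabits_per_sec * 1_000_000 / 8 / 1000

observed = 182  # KB/sec, the rate reported for the slow job

for mbps in (10, 100):
    capacity = link_capacity_kb_per_sec(mbps)
    share = 100 * observed / capacity
    print(f"{mbps} Mbps link: {capacity:.0f} KB/sec raw, job used {share:.1f}% of it")
```

On these assumptions 182 KB/sec is well below what even a 10 Mbps link can carry, so the NIC speed setting is worth checking but may not be the whole story.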


Ryan
 
Hi Ryan

Thanks for the reply. I knew we hadn't changed anything but just to make sure I checked and all NICs on the clients are set to 100 full duplex.

I had a look on the Veritas (Symantec) site but couldn't find any tools like the one you mentioned. Do you know the name of it and where I can obtain it please?


Andrew.
 
As long as you're checking NICs, best to check the routers too.

Bob Stump
Just because the VERITAS documentation states a certain thing does not make it a fact, and that's the truth
 
Routers fine too Bob. It's a total mystery. One minute working fine then it all goes pear shaped without anything being done. That's why I'm wondering if some database or log is full, but I don't know what to look for.

Andrew.
 
I don't think anything is full if other backups go on working.

You need to call Veritas and ask for someone in the network group. They should know what the tool is called and be able to send it to you. You run the tool and send them the output, and they run it through a website that takes the data and shows you any bottlenecks or issues.


Ryan
 
Hi Guys

An update for you. Yesterday I checked the jobs and noticed that they didn't have the option set for compression. I turned this on and the jobs (apart from one) ran okay last night. :-D

As the original jobs were set up nearly two years ago, I can't remember whether compression was set or not. I can't understand why the jobs were working and then stopped; it would seem that compression was on and then got turned off, but how?

Anyway, hopefully that will be the situation sorted. Thanks for your suggestions. :)

Andrew.
 