File server W2K3 aborted due to inactivity

Status
Not open for further replies.

Iago77

IS-IT--Management
Jun 9, 2003
125
ES
Hello,

this is the layout:

FileServer (W2K3) <-> LAN <-> Legato Server (Solaris)

I have to do a full backup and there's a lot of data to save to the LTO-3 tapes.

I've split the Z: save set into several smaller ones:

R:\WINDOWS
Z:\APLICACIONES
Z:\Grabacion
Z:\grupos
Z:\usuarios
Z:\Volapli
Z:\VOLDAT1
Z:\BACKUP GC

Last night I scheduled a backup, but it only copied some of them: the lighter save sets. The heavier ones weren't saved. This error was shown:

* sefsh007.oepm.local:Z:\grupos 1 retry attempted
* sefsh007.oepm.local:Z:\grupos aborted due to inactivity
* sefsh007.oepm.local:Z:\usuarios 1 retry attempted
* sefsh007.oepm.local:Z:\usuarios save: RPC error: RPC send operation failed. A network connection could not be established with the host.
* sefsh007.oepm.local:Z:\BACKUP GC 1 retry attempted
* sefsh007.oepm.local:Z:\BACKUP GC aborted due to inactivity

I have tested the network settings: both interfaces are set to full duplex at 1 Gigabit. It only happens with this kind of backup. This server also has an LTO-2 drive as a storage node and that worked fine, so I suspect the problem is not in the data stream from the FileServer. Maybe the LAN?

Many thanks in advance,

Iago
 
Hi,

this can be due to a too short Inactivity timeout. save.exe walks your filesystem, and if the filesystem is very large, this can take a long time. That could be why the smaller save sets work but the larger ones don't. Try increasing the Inactivity timeout on the group this client belongs to and see if that helps. However, this is usually a problem when running incrementals, not a full backup like the one you are doing.
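To illustrate the idea behind an inactivity timeout, here is a minimal Python sketch. It is not NetWorker's implementation: the class name, the `simulate` helper, and the gap values are all hypothetical. It only shows why one long silent stretch (for example a long filesystem walk) can abort a job even though the job is still working.

```python
import time


class InactivityWatchdog:
    """Tracks how long it has been since a job last reported progress."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def report_activity(self):
        # Called whenever the client sends anything back to the server.
        self._last_activity = self._clock()

    def is_inactive(self):
        # True once the silence has lasted longer than the timeout.
        return self._clock() - self._last_activity > self.timeout_seconds


def simulate(timeout_minutes, silent_gaps_minutes):
    """Replay a list of silent gaps (in minutes) against a timeout.

    Uses a fake clock so the simulation runs instantly.
    """
    now = [0.0]
    dog = InactivityWatchdog(timeout_minutes * 60, clock=lambda: now[0])
    for gap in silent_gaps_minutes:
        now[0] += gap * 60          # the client stays silent for `gap` minutes
        if dog.is_inactive():
            return "aborted due to inactivity"
        dog.report_activity()       # the client finally reports progress
    return "completed"


# Hypothetical numbers: a 45-minute silent stretch against two timeouts.
print(simulate(30, [5, 45, 5]))
print(simulate(60, [5, 45, 5]))
```

With a 30-minute timeout the 45-minute gap aborts the job; raising the timeout to 60 minutes lets the same job finish.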

Are there any firewalls between client and server that could need some tweaking?

What parallelism is set up on your client?
 
I set up an ad-hoc group with a single save set: Z:\BACKUP GC.

I started it and it seemed to be copying fine, but suddenly I received an "aborted due to inactivity" error. On the client I found this error in daemon.log:

07/11/07 12:52:07 nsrmmd #16: RPC error: RPC send operation failed. A network connection could not be established with the host.
device resource lookup fails

It's hard to pin down a workaround because the same backup works fine with LTO-2 (via the storage node). I ran rpcinfo -p <LegatoServer>, but I don't know how to interpret the output.

LTO-3 tapes: 3, target sessions: 4 per tape, client parallelism: 4. I counted 25 save sets in total; are those settings too low for this task?
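As a rough way to reason about those numbers, the arithmetic can be sketched in Python. This is a simplified model, not how NetWorker schedules sessions: it assumes all 25 save sets belong to one client, treats target sessions as a hard cap per device (for estimation only), and uses hypothetical function names.

```python
import math


def concurrent_streams(devices, target_sessions, client_parallelism):
    """Estimate how many save sets of one client can run at the same time."""
    device_capacity = devices * target_sessions   # sessions the devices accept
    return min(device_capacity, client_parallelism)


def rounds_needed(save_sets, streams):
    """Estimate how many batches are needed to work through all save sets."""
    return math.ceil(save_sets / streams)


# Values from the post: 3 tapes, 4 target sessions each, client parallelism 4.
streams = concurrent_streams(devices=3, target_sessions=4, client_parallelism=4)
rounds = rounds_needed(save_sets=25, streams=streams)
print(streams, rounds)
```

Under these assumptions the devices could take 12 sessions, but the client parallelism of 4 is the limiting factor, so the 25 save sets would be processed in roughly 7 batches.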

Thanks again,
 
Hi,

Since it is an nsrmmd error: is this client also a storage node, or did you find this message in the daemon.log on the backup server?

rpcinfo lists the registered RPC programs, i.e. nsrd, nsrexecd, nsrmmd, etc. Each has a corresponding program number.

The parallelism should be fine; I just wanted to make sure you weren't using something like 15.
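For reading that output, a small Python sketch like the following may help. It assumes the usual five-column layout of `rpcinfo -p` (program, version, protocol, port, service). The sample text is made up for illustration and is not output from the server in this thread; the NetWorker daemons would appear in the same way under their own program numbers.

```python
from collections import namedtuple

RpcEntry = namedtuple("RpcEntry", "program version protocol port service")

# Illustrative sample only, using the well-known portmapper and NFS entries.
SAMPLE = """\
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   tcp   2049  nfs
"""


def parse_rpcinfo(text):
    """Turn `rpcinfo -p` output into a list of RpcEntry tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a data row.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        program, version, protocol, port = fields[:4]
        service = fields[4] if len(fields) > 4 else ""  # name may be missing
        entries.append(
            RpcEntry(int(program), int(version), protocol, int(port), service)
        )
    return entries


def is_registered(entries, program):
    """Check whether a given RPC program number shows up at all."""
    return any(entry.program == program for entry in entries)


entries = parse_rpcinfo(SAMPLE)
for entry in entries:
    print(entry.program, entry.protocol, entry.port, entry.service)
```

If a daemon you expect is missing from the list, it is either not running or not registered with the portmapper on that host.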

 
The error message was found on the client. Although it's a storage node, I don't use the dedicated device. That's why the backup goes over the LAN.

Thanks,
 
If for some reason the nsrmmd on the storage node can't communicate with the backup server, there is a good chance that nsrexecd and the save programs have the same problem, hence the error.

So it looks to be a network related problem after all.
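A quick way to check basic reachability from the client is a plain TCP connect test, sketched here in Python. The host name and port in the usage comment are assumptions (7938 is commonly the NetWorker portmapper port, but verify it for your installation), and a successful connect only shows the port is reachable, not that a long-running session will stay up.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and connects within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (hypothetical host name, assumed port):
#   can_connect("legato-server.example", 7938)
```

Running the check in both directions (client to server and server to client) helps show whether a firewall or routing problem affects only one side.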

It has been a long time since I did networking, but as I remember it, gigabit interfaces should not be forced to a fixed speed and full duplex but left on auto/auto. That might not apply anymore, though. Maybe someone else can fill in on that?
 
Finally the group finished with no error.

I changed the Interval expiration according to the Legato NetWorker Administrator manual:

---

Inactivity Timeout: This attribute specifies the maximum time, in minutes, that a client is given to fail to communicate back to the server. If a client fails to respond longer than the Inactivity Timeout value, the server will consider the client as stopped responding. If a client fails due to any reason, a retry is initiated immediately. This ensures that no time is lost during the scheduled backup due to any failures.

Note: For large save sets or for save sets with large sparse files, increase the timeout value if the backup consistently aborts due to an inactive job.

---

I've now configured the group to save the rest of the heavier save sets. I'll keep you posted.

Thanks,
 
Hi,

I am experiencing a similar issue with a host where two folders take so long that the backup times out.

I tried to find the reference to interval expiration in the Admin guide but couldn't.

Can you give a page in the guide to go to?

Thanks,

Dave
 
You'd better look for 'inactivity timeout'.
 