Legato backups failing all of a sudden

Status
Not open for further replies.

nyck

Technical User
Mar 10, 2004
447
GB
Hello,

My backups have been working fine for years, but for the last two days they have all been failing with the error below:

savegrp -nvp -c loncons4
loncons4:All level=9
08/28/13 08:05:43 savegrp: Run up to 32 clients in parallel
08/28/13 08:05:43 savegrp: loncons4:probe started
savefs -s lonadm2 -c loncons4 -g Default -p -n -l full -R -v
08/28/13 08:05:43 savegrp: command 'savefs -s lonadm2 -c loncons4 -g Default -p -n -l full -R -v ' for client loncons4 exited with return code 9.
08/28/13 08:05:43 savegrp: loncons4:probe succeeded.
* loncons4:All rcmd loncons4, user root: `savefs -s lonadm2 -c loncons4 -g Default -p -n -l full -R -v'
* loncons4:All permission denied
* loncons4:All 08/28/13 08:05:43 nsrexec: savefs -s lonadm2 -c loncons4 -g Default -p -n -l full -R -v
* loncons4:All Cannot connect to nsrexecd on client loncons4 and .rhosts permissions
* loncons4:All do not allow rsh.
* loncons4:All Permission denied
* loncons4:All 08/28/13 08:05:43 nsrexec: nsrexecd on loncons4 is unavailable. Using rsh instead.
--- Probe Summary ---

loncons4:All level=full, dn=-1, mx=0, vers=unknown, p=1
loncons4:All level=full, pool=Default, save as of Wed Aug 28 08:05:43 GMT+0100 AM 2
loncons4:index level=9, dn=-1, mx=0, vers=unknown, p=1
loncons4:index level=9, pool=Default, save as of Sat Aug 24 06:47:20 GMT+0100 AM 2

Any suggestions as to what could be causing this kind of issue?

I have made no changes whatsoever on the backup server for ages, so I am a bit confused as to what is going on here!

Cheers

Nick
 
My guess is that RPC is not available on the client, so the server tries to access it via rsh instead.
Evidently the NW client listener (nsrexecd) has frozen or is not responding correctly.

I assume the issue can be fixed simply by restarting the NW daemon on the client.
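
Before restarting anything, it may help to confirm whether the client listener accepts connections at all. The following is a minimal sketch, not a NetWorker tool: it assumes nsrexecd listens on TCP port 7937 (the port that rpcinfo reports for nsrexec elsewhere in this thread), and the function name `listener_reachable` is made up for illustration.

```python
import socket


def listener_reachable(host, port=7937, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 7937 is an assumption taken from the rpcinfo output in this thread.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Run from the NW server, `listener_reachable("loncons4")` returning False would be consistent with a stopped or hung listener, or with a name resolution problem. A True result only shows that something accepts connections on that port, not that nsrexecd authenticates the server correctly.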

 
Hello,

All seems to be fine with the client, as rpcinfo -p <client> run from the NW server comes back OK:

rpcinfo -p lonibm5
program vers proto port service
100000 4 udp 111 rpcbind
100000 3 udp 111 rpcbind
100000 2 udp 111 rpcbind
100000 4 tcp 111 rpcbind
100000 3 tcp 111 rpcbind
100000 2 tcp 111 rpcbind
100083 1 tcp 32773
100068 2 udp 32809
100068 3 udp 32809
100068 4 udp 32809
100068 5 udp 32809
100021 1 udp 32816 nlockmgr
100021 2 udp 32816 nlockmgr
100021 3 udp 32816 nlockmgr
100021 4 udp 32816 nlockmgr
100021 1 tcp 32775 nlockmgr
100021 2 tcp 32775 nlockmgr
100021 3 tcp 32775 nlockmgr
100021 4 tcp 32775 nlockmgr
100024 1 tcp 32776 status
100024 1 udp 32814 status
100133 1 tcp 32776
100133 1 udp 32814
200001 1 tcp 32776
200001 1 udp 32814
200001 2 tcp 32776
200001 2 udp 32814
390113 1 tcp 7937 nsrexec
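
With many clients to check, reading listings like the one above by eye gets tedious. As a rough sketch, the text printed by rpcinfo -p can be parsed to confirm that nsrexec is registered. The program number 390113 is taken from the listing above; the function names are hypothetical.

```python
NSREXEC_PROGRAM = 390113  # program number shown for nsrexec in the listing


def registered_programs(rpcinfo_text):
    """Parse `rpcinfo -p` text into (program, version, proto, port, service) tuples."""
    entries = []
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a data row.
        if len(fields) < 4:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit() and fields[3].isdigit()):
            continue
        service = fields[4] if len(fields) > 4 else ""  # some rows have no name
        entries.append((int(fields[0]), int(fields[1]), fields[2],
                        int(fields[3]), service))
    return entries


def nsrexec_ports(rpcinfo_text):
    """Return the sorted TCP ports on which the nsrexec program is registered."""
    return sorted({port
                   for program, _version, proto, port, _service
                   in registered_programs(rpcinfo_text)
                   if program == NSREXEC_PROGRAM and proto == "tcp"})
```

An empty list from `nsrexec_ports` would suggest the listener is not registered with rpcbind on that client. A non-empty list, as in the listing above, points away from the daemon itself and towards something else, such as name resolution.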

Also what do you make of the below output?

savegrp -nvp -c lonibm5
override match on 8/30/2013 for level full
lonibm5:All level=full
08/30/13 08:03:01 savegrp: Run up to 32 clients in parallel
08/30/13 08:03:01 savegrp: lonibm5:probe started
savefs -s lonadm2 -c lonibm5 -g Default -p -n -l full -R -v
08/30/13 08:03:01 savegrp: command 'savefs -s lonadm2 -c lonibm5 -g Default -p -n -l full -R -v ' for client lonibm5 exited with return code 9.
08/30/13 08:03:01 savegrp: lonibm5:probe succeeded.
* lonibm5:All rcmd lonibm5, user root: `savefs -s lonadm2 -c lonibm5 -g Default -p -n -l full -R -v'
* lonibm5:All rshd: 0826-823 Cannot look up the address for your host.08/30/13 08:03:01 nsrexec: rshd: 0826-823 Cannot look up the address for your host.Permission denied
* lonibm5:All 08/30/13 08:03:01 nsrexec: nsrexecd on lonibm5 is unavailable. Using rsh instead.
--- Probe Summary ---

lonibm5:All level=full, dn=-1, mx=0, vers=unknown, p=1
lonibm5:All level=full, pool=Default, save as of Fri Aug 30 08:03:01 GMT+0100 AM 2
lonibm5:index level=full, dn=-1, mx=0, vers=unknown, p=1
lonibm5:index level=full, pool=Default, save as of Fri Aug 30 08:03:01 GMT+0100 AM 2

I have now managed to get 80% working by adding the NW backup server to each client's /etc/hosts file and adding the client details to /etc/hosts on the NW backup server. A few clients still show the above issue, though. I have rebooted the server shown above many times with no difference; I even removed the NetWorker client and re-installed it, and that made no difference either.

I have noticed that some of the reverse lookups are wrong, and I'm getting our IT guys to fix this. I would have thought that adding everything locally would get round the issue until it is fixed, though!
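
To see which hosts still have mismatched forward and reverse lookups, a small check along these lines could be run on the NW server and on each client. It is only a sketch: it asks the system resolver (so whatever order that host uses for /etc/hosts and DNS applies), compares short host names only, and the function name and result layout are invented for illustration.

```python
import socket


def check_name_resolution(hostname, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back, and compare the names.

    `forward` and `reverse` default to the system resolver; they are
    parameters only so the check can be tried without touching the network.
    """
    result = {"host": hostname, "ip": None, "names": [],
              "consistent": False, "error": None}
    try:
        result["ip"] = forward(hostname)
    except OSError as exc:
        result["error"] = f"forward lookup failed: {exc}"
        return result
    try:
        primary, aliases, _addresses = reverse(result["ip"])
    except OSError as exc:
        result["error"] = f"reverse lookup failed: {exc}"
        return result
    result["names"] = [primary] + list(aliases)
    # Compare short host names only, ignoring case and any domain suffix.
    wanted = hostname.split(".")[0].lower()
    result["consistent"] = any(
        name.split(".")[0].lower() == wanted for name in result["names"])
    return result
```

For example, `check_name_resolution("lonibm5")` on the NW server and `check_name_resolution("lonadm2")` on the client should both report `consistent` as True. If either side reports a failed or mismatched reverse lookup, that would fit the rshd "Cannot look up the address for your host" message in the output above.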

Any suggestions on this as it's driving me up the wall at the moment :)

Cheers

Nick
 
Well, what do you think yourself?

- savefs already fails (NW cannot even build the internal work list)
- client name/IP resolutions are failing

So as long as the fundamentals are not working properly, why should NW work as expected?
Locally, I do not see anything that could be done.
You may try to run a manual backup from the client, but depending on the version this may not work either.
 