Networker 7.3.3 - Backups timing out - retrying

Status
Not open for further replies.

woodings

Technical User
Apr 25, 2003
22
GB
Currently on NetWorker 7.3.3. The NetWorker server runs Solaris 10 and is attached via fibre to a Scalar i2000 robot with 6 LTO-3 tape drives; Dynamic Drive Sharing is in use. There are 3 storage nodes, 2 of them dedicated.
We have a Windows 2000 client with Oracle 9. The backup job runs incrementally Mon-Sat and full on Sunday; it backs up over the network to the NetWorker server/storage node, then via fibre to the robot.
The F: drive (76 GB incremental, 114 GB full backup) keeps timing out and eventually backs up after four retries.
Client retries is set to four, inactivity timeout is set to 90.
Enclosed is the daemon.log file:
dmbcnt44:All level=incr
10/16/07 00:15:00 savegrp: Group will not limit job parallelism
10/16/07 00:15:00 nsrd: savegroup info: starting DMBCNT44 (with 1 client(s))
10/16/07 00:15:00 savegrp: dmbcnt44:probe started
10/16/07 00:15:00 savegrp: build_ss_job: savefs -s dmbcs8 -c dmbcnt44 -g DMBCNT4
4 -p -l full -R -v
10/16/07 00:15:00 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
dmbcnt44:C:\ level=incr, dn=6, mx=1, vers=pools, p=4
dmbcnt44:D:\ level=incr, dn=5, mx=1, vers=pools, p=4
dmbcnt44:E:\ level=incr, dn=4, mx=1, vers=pools, p=4
dmbcnt44:F:\ level=incr, dn=3, mx=1, vers=pools, p=4
dmbcnt44:SYSTEM STATE:\ level=incr, dn=2, mx=1, vers=pools, p=4
dmbcnt44:SYSTEM DB:\ level=incr, dn=1, mx=1, vers=pools, p=4
dmbcnt44:SYSTEM FILES:\ level=incr, dn=0, mx=1, vers=pools, p=4
* dmbcnt44:All savefs dmbcnt44: succeeded.
10/16/07 00:15:03 savegrp: dmbcnt44:probe succeeded.
10/16/07 00:15:03 savegrp: dmbcnt44:SYSTEM FILES:\ started
10/16/07 00:15:03 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192317334 -l incr -W 78 -N "SYSTEM FILES:\\" "SYSTEM FILES:\\"
10/16/07 00:15:03 savegrp: dmbcnt44:SYSTEM DB:\ started
10/16/07 00:15:03 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403738 -l incr -W 78 -N "SYSTEM DB:\\" "SYSTEM DB:\\"
10/16/07 00:15:03 savegrp: dmbcnt44:SYSTEM STATE:\ started
10/16/07 00:15:03 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403732 -l incr -W 78 -N "SYSTEM STATE:\\" "SYSTEM STATE:\\"
10/16/07 00:15:03 savegrp: dmbcnt44:F:\ started
10/16/07 00:15:03 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403733 -l incr -W 78 -N "F:\\" "F:\\"
10/16/07 00:15:03 savegrp: dmbcnt44:E:\ started
10/16/07 00:15:03 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403866 -l incr -W 78 -N "E:\\" "E:\\"
10/16/07 00:15:03 savegrp: dmbcnt44:D:\ started
10/16/07 00:15:03 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403921 -l incr -W 78 -N "D:\\" "D:\\"
10/16/07 00:15:03 savegrp: dmbcnt44:C:\ started
10/16/07 00:15:03 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403920 -l incr -W 78 -N "C:\\" "C:\\"
10/16/07 00:15:04 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:15:05 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:16:27 nsrd: dmbcnt44:SYSTEM DB:\ saving to pool 'DMBCADIC2000' (NET1
60)
10/16/07 00:16:28 nsrd: dmbcnt44:F:\ saving to pool 'DMBCADIC2000' (NET160)
10/16/07 00:16:31 nsrd: dmbcnt44:SYSTEM STATE:\ saving to pool 'DMBCADIC2000' (N
ET160)
10/16/07 00:16:34 nsrd: dmbcnt44:SYSTEM DB:\ done saving to pool 'DMBCADIC2000'
(NET160) 1014 KB
10/16/07 00:16:41 nsrd: dmbcnt44:SYSTEM STATE:\ done saving to pool 'DMBCADIC200
0' (NET160) 17 MB
* dmbcnt44:SYSTEM FILES:\ A system error has occurred.
* dmbcnt44:SYSTEM FILES:\
* dmbcnt44:SYSTEM FILES:\ System error 1067 has occurred.
* dmbcnt44:SYSTEM FILES:\
* dmbcnt44:SYSTEM FILES:\ The process terminated unexpectedly.
* dmbcnt44:SYSTEM FILES:\
* dmbcnt44:SYSTEM FILES:\ A system error has occurred.
* dmbcnt44:SYSTEM FILES:\
* dmbcnt44:SYSTEM FILES:\ System error 1067 has occurred.
* dmbcnt44:SYSTEM FILES:\
* dmbcnt44:SYSTEM FILES:\ The process terminated unexpectedly.
* dmbcnt44:SYSTEM FILES:\
10/16/07 00:18:29 savegrp: dmbcnt44:SYSTEM FILES:\ succeeded.
10/16/07 00:18:29 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:18:34 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:18:34 savegrp: dmbcnt44:SYSTEM DB:\ succeeded.
10/16/07 00:18:39 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:18:41 nsrd: dmbcnt44:E:\ saving to pool 'DMBCADIC2000' (NET160)
10/16/07 00:18:42 savegrp: dmbcnt44:SYSTEM STATE:\ succeeded.
10/16/07 00:18:42 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:18:42 nsrd: dmbcnt44:D:\ saving to pool 'DMBCADIC2000' (NET160)
10/16/07 00:18:44 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:18:47 nsrd: dmbcnt44:C:\ saving to pool 'DMBCADIC2000' (NET160)
10/16/07 00:23:36 nsrd: dmbcnt44:C:\ done saving to pool 'DMBCADIC2000' (NET160)
398 MB
10/16/07 00:25:36 savegrp: dmbcnt44:C:\ succeeded.
10/16/07 00:25:36 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 00:29:45 nsrd: nndrtest:G:\ done saving to pool 'DMBCADIC2000' (NET160)
13 GB
10/16/07 00:31:45 savegrp: nndrtest:G:\ succeeded.
10/16/07 00:31:45 nsrd: savegroup info: nndr running on dmbcnt64, nndrtest
10/16/07 00:45:49 nsrexecd: GSS Legato authentication user session entry (warnin
g): "User authentication session timed out and is now invalid.". Session number
= 0:7b8, domain = , user name = root, NetWorker Instance Name = dmbcs8
10/16/07 00:45:58 nsrd: dmbcnt44:D:\ done saving to pool 'DMBCADIC2000' (NET160)
3073 MB
10/16/07 00:47:58 savegrp: dmbcnt44:D:\ succeeded.
10/16/07 00:47:58 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 01:06:47 nsrd: dmbcnt64:D:\ done saving to pool 'DMBCADIC2000' (NET160)
14 GB
10/16/07 01:08:47 savegrp: dmbcnt64:D:\ succeeded.
10/16/07 02:44:49 nsrd: dmbcnt44:E:\ done saving to pool 'DMBCADIC2000' (NET160)
28 GB
10/16/07 02:46:50 savegrp: dmbcnt44:E:\ succeeded.
10/16/07 02:46:50 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
dmbcnt44 -t 1192403733 -l incr -W 78 -N "F:\\" "F:\\"' for client dmbcnt44 exite
d with return code -1
10/16/07 04:28:13 savegrp: job (62915) host: dmbcnt44 savepoint: F:\ had ERROR i
ndication(s) at completion.
10/16/07 04:28:13 savegrp: dmbcnt44:F:\ failed.
10/16/07 04:28:13 savegrp: dmbcnt44:F:\ will retry 4 more time(s)
10/16/07 04:28:13 savegrp: dmbcnt44:F:\ started
10/16/07 04:28:13 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403733 -l incr -W 78 -N "F:\\" "F:\\"
10/16/07 04:28:13 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 04:28:46 nsrd: dmbcnt44:F:\ saving to pool 'DMBCADIC2000' (NET160)
10/16/07 04:39:24 nsrd: dmbcs7:/u09 done saving to pool 'DMBCADIC2000' (NET160)
5137 MB
dmbcnt44 -t 1192403733 -l incr -W 78 -N "F:\\" "F:\\"' for client dmbcnt44 exite
d with return code -1
10/16/07 04:47:25 savegrp: job (62942) host: dmbcnt44 savepoint: F:\ had ERROR i
ndication(s) at completion.
* dmbcnt44:F:\ 1 retry attempted
10/16/07 04:47:25 savegrp: dmbcnt44:F:\ failed.
10/16/07 04:47:25 savegrp: dmbcnt44:F:\ will retry 3 more time(s)
10/16/07 04:47:25 savegrp: dmbcnt44:F:\ started
10/16/07 04:47:25 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403733 -l incr -W 78 -N "F:\\" "F:\\"
10/16/07 05:53:30 nsrd: media notice: Save set (2685662412) dmbcnt44:F:\ volume
NET160 on /dev/rmt/3cbn is being terminated because: inactivity timeout
10/16/07 05:53:30 nsrd: dmbcnt44:F:\ done saving to pool 'DMBCADIC2000' (NET160)
60 GB
10/16/07 05:53:30 nsrd: write completion notice: Writing to volume NET160 comple
te
10/16/07 05:53:30 nsrd: media info: Drive /dev/rmt/3cbn released (persistent)
10/16/07 05:53:31 nsrd: dmbcnt44:F:\ done saving to pool 'pool_none' (NET160)
10/16/07 05:53:31 nsrd: dmbcnt44:F:\ done saving to pool 'pool_none' (NET160)
10/16/07 05:53:31 nsrd: dmbcnt44:F:\ done saving to pool 'pool_none' (NET160)
10/16/07 05:53:31 nsrd: media info: restarting nsrmmd #1 on dmbcs8 in 2 minute(s
)
10/16/07 05:53:36 nsrd: media info: restart of nsrmmd #1 on dmbcs8 cancelled
10/16/07 05:55:19 nsrd: Operation 26 started : Unload jukebox device `/dev/rmt/3
cbn'.
.
10/16/07 05:55:21 nsrd: /dev/rmt/3cbn 1:Eject operation in progress
10/16/07 05:55:21 nsrd: media info: Drive /dev/rmt/3cbn reserved (persistent key
NetWorkr)
10/16/07 05:57:00 nsrd: media info: Drive /dev/rmt/3cbn released (persistent)
10/16/07 05:57:23 nsrd: [Jukebox `ADIC@5.1.7', operation # 26]. Finished with st
atus: succeeded
10/16/07 06:00:26 savegrp: command ' savepnpc -s dmbcs8 -g DMBCNT44 -LL -f - -m
dmbcnt44 -t 1192403733 -l incr -W 78 -N "F:\\" "F:\\"' for client dmbcnt44 exite
d with return code -1
10/16/07 06:00:26 savegrp: job (62943) host: dmbcnt44 savepoint: F:\ had ERROR i
ndication(s) at completion.
* dmbcnt44:F:\ 2 retries attempted
10/16/07 06:00:26 savegrp: dmbcnt44:F:\ failed.
10/16/07 06:00:26 savegrp: dmbcnt44:F:\ will retry 2 more time(s)
10/16/07 06:00:26 savegrp: dmbcnt44:F:\ started
10/16/07 06:00:26 savegrp: build_ss_job: savepnpc -s dmbcs8 -g DMBCNT44 -LL -f -
-m dmbcnt44 -t 1192403733 -l incr -W 78 -N "F:\\" "F:\\"
10/16/07 06:00:26 nsrd: savegroup info: DMBCNT44 running on dmbcnt44
10/16/07 06:00:33 nsrd: Operation 27 started : Load volume `NET160'
10/16/07 06:09:46 nsrd: dmbcnt44:F:\ saving to pool 'DMBCADIC2000' (NET160)
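
With a log this long it can help to tally the savegrp events per save set before reading it line by line. The following is a minimal Python sketch, not a NetWorker tool: `summarize` and `EVENT` are hypothetical names, and the pattern assumes each "started", "succeeded." or "failed." message sits on one unwrapped line, as it does in the excerpt above. The sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Matches timestamped savegrp lines such as:
#   10/16/07 04:28:13 savegrp: dmbcnt44:F:\ failed.
EVENT = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) savegrp: "
    r"(?P<saveset>\S+:.*?) (?P<event>started|succeeded\.|failed\.)$"
)


def summarize(lines):
    """Count started/succeeded/failed events per save set."""
    counts = {}
    for line in lines:
        match = EVENT.match(line.rstrip())
        if match:
            event = match.group("event").rstrip(".")
            counts.setdefault(match.group("saveset"), Counter())[event] += 1
    return counts


# A few lines taken from the daemon.log excerpt in this post.
sample = [
    r"10/16/07 00:15:03 savegrp: dmbcnt44:F:\ started",
    r"10/16/07 00:25:36 savegrp: dmbcnt44:C:\ succeeded.",
    r"10/16/07 04:28:13 savegrp: dmbcnt44:F:\ failed.",
    r"10/16/07 04:28:13 savegrp: dmbcnt44:F:\ will retry 4 more time(s)",
    r"10/16/07 04:28:13 savegrp: dmbcnt44:F:\ started",
    r"10/16/07 04:47:25 savegrp: dmbcnt44:F:\ failed.",
    r"10/16/07 04:47:25 savegrp: dmbcnt44:F:\ started",
    r"10/16/07 06:00:26 savegrp: dmbcnt44:F:\ failed.",
    r"10/16/07 06:00:26 savegrp: dmbcnt44:F:\ started",
]

for saveset, events in summarize(sample).items():
    print(saveset, dict(events))
```

On these sample lines the F:\ save set is counted as started four times and failed three times, which matches the retry sequence in the log. The "probe started" and "probe succeeded." lines of a full log would also match the pattern and show up as their own entry.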

 
My guess is that the access rights for the F: drive are different. Why else would it behave differently?

As the manual backup is a prerequisite for the automatic one, just run the manual backup at the client from the command line with added verbosity (save -vvv...). Most likely you will receive more info which might help.
 
This drive has full access rights. The drive on this server is not local but SAN-attached storage (Clariion CX500); I'm not sure what difference this would make.
 
Well - what does the manual backup report?
 
I will let you know as soon as this manual backup has been done. There is an Oracle database on this drive, and as we do not have the NetWorker RMAN module I will need downtime on this box to shut Oracle down first.
 
Apologies for the delay; the manual backup kept timing out. What we have since found is that our network card had been configured at 100 Mbit/s rather than 1 Gbit/s. We have reconfigured the card and also broken our backups down to folder level rather than just drive level. The improvement has been remarkable: not only have the timeout errors stopped, but the backups are completing much faster.
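
To put the link speed into perspective, here is a rough back-of-the-envelope sketch in Python. It uses the F: drive sizes from the first post (76 GB incremental, 114 GB full) and assumes decimal gigabytes and a link that is fully available to this one save set. `transfer_hours` and its `efficiency` parameter are hypothetical names used only for illustration, so the results are best-case lower bounds, not measurements.

```python
def transfer_hours(size_gb, link_mbit_per_s, efficiency=1.0):
    """Best-case hours to move size_gb (decimal GB) over the given link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable (1.0 = no protocol overhead, no competing traffic).
    """
    bits = size_gb * 8 * 1000**3
    seconds = bits / (link_mbit_per_s * 1_000_000 * efficiency)
    return seconds / 3600


for label, size_gb in (("incremental", 76), ("full", 114)):
    for speed in (100, 1000):
        hours = transfer_hours(size_gb, speed)
        print(f"{label:<11} {size_gb:>3} GB at {speed:>4} Mbit/s: {hours:.2f} h")
```

In the group shown in the log, C:, D:, E: and F: were all saving over the same card at the same time, so the real figures at 100 Mbit/s would have been worse than this estimate. The arithmetic does not by itself explain the inactivity timeout, but it shows how little headroom the slower setting left.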
 
Good job - such problems are not easy to discover.
 