ChrisEsser
IS-IT--Management
I had a problem last night that may or may not be associated with backups, but it's odd. I'd like to know if anyone has seen this before.
Server:
Sun 6500 Solaris 8
Networker 6.1.2
JB is an STK L700 w/9840 drives in a tape SAN using DDS
Client:
SUN 6800 Solaris 8
Networker 6.1.2
Oracle Module 3.5
Oracle ver. 8.1.7.4
SAP brtools 6.10 patch 56
Database is > 2TB
32 channels for RMAN backup
Problem:
The system was running fine. RMAN had been newly configured 48 hours earlier, and a backup with 24 channels configured completed without any problems. The channels were then increased to 32. A backup was started, and 15 minutes later the system started to become unresponsive. The CPUs were showing 1% user time and nearly 99% kernel time. The RMAN backup was stopped from within Oracle, but the problem continued. nfsd was using an inordinate amount of CPU, so it was killed and restarted. The system immediately came back and worked fine.
From this we concluded that the problem was caused by nfsd, so we let the system run for a while to make sure things were OK and then restarted the backup. Within 10 minutes the problem started again.
Since then we haven't run any backups and have had no problems. I'm not sure the backups caused the problems, but it's too much of a coincidence to ignore. Anyone ever see this before?
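In case it helps anyone trying to reproduce this, here is a rough sketch of how the user/kernel split could be logged during a backup window. It assumes the output of Solaris `vmstat` is being captured, with the last three columns of each data line being `us`, `sy` and `id`. The function names and the 90% threshold are made up for illustration; this is not something we ran at the time.

```python
def parse_cpu(line):
    """Return (us, sy, idle) from one vmstat data line, or None for header lines."""
    fields = line.split()
    if len(fields) < 3:
        return None
    try:
        # Assumption: the last three columns are us, sy, id.
        us, sy, idle = (int(f) for f in fields[-3:])
    except ValueError:
        # Header lines end in non-numeric column names.
        return None
    return us, sy, idle


def kernel_bound(lines, sy_threshold=90):
    """Return the samples whose sys (kernel) time is at or above sy_threshold."""
    hits = []
    for line in lines:
        cpu = parse_cpu(line)
        if cpu is not None and cpu[1] >= sy_threshold:
            hits.append(cpu)
    return hits
```

Feeding it the lines from something like `vmstat 5` saved to a file would show exactly when the kernel time climbs relative to the backup start.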