What is making my SCO box reboot on Saturdays?

Status
Not open for further replies.

jlightner

Technical User
Dec 27, 2005
US
Hi all,

We have a legacy system that runs SCO 3.2v5.0.4 (SCO OpenServer Release 5). Every Saturday around 5 AM (ET) it reboots. The reboot happens in the middle of a backup that is running at that time. (The backup resumes after the reboot and completes successfully.)

Examining cron, I found no entry that does this. I did see entries for "cronsched", but from what I found on the web these only tell it to reread calendar files, which I couldn't locate and so presume do not exist.
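
A minimal sketch of how the cron check described above could be widened to every user's crontab and to pending at jobs, not just the ones already inspected. The spool directories and the list of shutdown-related strings are assumptions, not something confirmed for this system, and `scan_dir` is a hypothetical helper name. On a system whose grep lacks -E, egrep takes the same pattern.

```shell
#!/bin/sh
# Strings that suggest a deliberate shutdown (an assumed list; extend as needed).
PATTERN='shutdown|reboot|haltsys|init +[06]'

# Print file name, line number and text of every matching line in a directory.
scan_dir() {
    # /dev/null as an extra file forces grep to print file names;
    # "|| :" keeps "no match" from counting as a failure.
    grep -n -E "$PATTERN" "$1"/* /dev/null 2>/dev/null || :
}

# Assumed System V spool locations for crontabs and at jobs.
for d in /usr/spool/cron/crontabs /usr/spool/cron/atjobs; do
    if [ -d "$d" ]; then
        scan_dir "$d"
    fi
done
```

A crontab entry usually just names a script, so pointing `scan_dir` at the directory holding the scripts that cron calls would cover a shutdown request buried one level down.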

There doesn't seem to be any sign of system panics and it seems unlikely those would occur on a regular schedule in any event. (The backup is running for quite a while before this reboot so doesn't seem a likely cause.)

Is there something other than cron that would cause a server to reboot regularly like this?

Would the cpuonoff command cause the system to show as if it had been booted? I did find one cron script that runs that during the backup but my read of its man page doesn't make it seem likely it would do this.
 
(The backup resumes after the reboot and completes successfully.)

Which backup utility are you using? If the system truly reboots, there will be several entries in /usr/adm/messages.
If the system is running on a UPS, make sure it isn't configured to reboot (or run some kind of test) at that time.
 
To eliminate that cron job as a possible cause, why not put some echo commands around the cpuonoff call (for example echo "before cpuonoff" >> /somelogfile and echo "after cpuonoff" >> /somelogfile) and see whether the script gets to its end?
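
Something along these lines might do it. This is only a sketch: `log_step` is a hypothetical helper, the log path is a placeholder, and the cpuonoff arguments are deliberately left out because they should be whatever the existing script already passes.

```shell
#!/bin/sh
# Placeholder log location; pick a filesystem that is not cleared at boot.
LOGFILE=${LOGFILE:-/tmp/backup_trace.log}

# Run one command, writing a timestamped line before and after it.
log_step() {
    echo "`date` START: $*" >> "$LOGFILE"
    "$@"
    rc=$?
    echo "`date` END (exit $rc): $*" >> "$LOGFILE"
    sync    # flush buffers so the entries are on disk if a reboot follows
    return $rc
}

# In the real script each suspect command would be wrapped, for example:
#   log_step cpuonoff <the arguments the script already uses>
# "true" stands in here so the sketch runs anywhere.
log_step true
echo "`date` script reached its end" >> "$LOGFILE"
```

After the next Saturday run, the last START line without a matching END line would point at the command that was running when the system went down.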

Annihilannic.
 
Sounds utility-related, since it does an 'automatic re-boot', is then able to restart and complete the backup, and does so ONLY on Saturday. You need to check the backup utility itself. Chances are there is something there (either in the shell script that calls the backup or in the backup software itself) that forces the re-boot.

JP
 
If you have the machine plugged into an APC UPS, move it to another UPS or plug it straight into the wall power. I had a similar situation in which for months I was going crazy trying to figure out what was wrong. It turned out to be the UPS. The UPS would run its weekly self-test and momentarily cut off the power to the UNIX box.

Good Luck!
-Jeff
 

This may sound dumb and non-technical, but I have experienced a similar problem and feel it worth mentioning because it took us AGES to diagnose this one, especially since you say it's happening 'around' 5am...
You don't have, for example, a new cleaner coming in and unplugging the server to plug a vacuum cleaner in do you?
(don't laugh - it happens!)

Good luck,
T
 
Not silly. I once worked at a place where, for some reason, they had put an outlet for the protected circuit in the hall outside the room where the main system was. Every night around 3 AM the system would go down because the breaker tripped. Eventually we found that the night cleaner was running his floor buffer on that outlet. (Not only that: once we figured it out, we finally had to go to the guy's boss and threaten to have him fired, as he kept doing it every night even though his buffer wouldn't work when the breaker tripped either. Some people don't learn.)

Anyway that's not the issue here. The system is in a data center along with dozens of other machines none of which experience this.

There is no APC UPS on it either, though the upsd daemon is running. After the reboot it simply comes up and says it can't communicate with the UPS.

The backup utility is NetBackup. There is nothing NetBackup itself is doing to reboot the server. (In fact the OS policy doesn't exhibit this problem, only the DB policy for Ingres.)

It seems to me it must be the script that kicks off the NetBackup job, since it does other things too, such as shutting down the Ingres database and running cpuonoff to stop and then restart the second CPU. The syslog and messages files show the boot-up information and, before that, even a couple of messages about killing processes, which leads me to believe it is an orderly shutdown rather than a panic.

Despite all that I can't see anything actually requesting a shutdown. Since the above cpuonoff is disabling then reenabling the second CPU (it implies that's where Ingres runs) I could understand a panic if there were a problem with the first CPU but the documentation clearly indicates one can't disable the only CPU running.

I guess the only thing to do is be up at 5 AM when it does this watching the console to see what happens.
 