Why does cron keep restarting

Status
Not open for further replies.

desbo

IS-IT--Management
Oct 24, 2002
64
GB
Every day the "/usr/sbin/cron" process on our p630 running AIX 5.2 restarts itself. It mostly happens at the same time of day, 11:45, and any processes that are running under the cron process just disappear.
It is not always at 11:45; sometimes it is overnight, and it doesn't always stop the child processes.
Anyone have any ideas?
 
By itself cron should not restart.

When it is restarted, it is most probably someone's intent. Maybe someone or something (a script) is killing cron and a new one is started (check /etc/inittab; you should have respawn in the cron line).
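
For anyone wanting to check the respawn setting quickly, here is a minimal sketch. It assumes the usual id:runlevels:action:command layout of inittab entries; the function name inittab_action is made up for this example. On AIX, "lsitab cron" should show the same entry.

```shell
#!/bin/sh
# Print the action field of one entry in an inittab-style file.
# Assumes entries look like id:runlevels:action:command.
# inittab_action is a hypothetical helper name, not a system command.
inittab_action() {
    # $1 = entry id, $2 = file to read (defaults to /etc/inittab)
    awk -F: -v id="$1" '$1 == id { print $3 }' "${2:-/etc/inittab}"
}

# Example: inittab_action cron
# If this prints "respawn", init starts a new cron whenever the old one exits.
```

If the action is respawn, a cron that keeps reappearing with a fresh start time means something is terminating the old one.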
 
I am a one man technical department for this system and as far as I know there is nothing that I am doing or have written that would kill the process. Interestingly I have just looked at the other machine I support in the USA and their cron process has a start time in the last 24 hours so it must be doing the same. Perhaps it is one of my processes as you say.
 
Is there anything in the error-report at the time your cron process restarts?
 
Hmm, strange thing to happen! I've never thought of tracing this, as the cron process would restart by itself if it got killed!

I don't have access to an AIX machine for now as I'm on vacation, but my understanding is that it shouldn't be killed on a regular basis!

Why don't you monitor the processes that run around that time? For example, run ps aux around the time you expect cron to be killed. You might see something unusual!

What kind of applications run on this box?
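
The monitoring idea could be sketched roughly like this: take a timestamped process listing every few seconds across the window in which cron usually dies, then look at what was running just before the restart. The function names, log path, interval and sample count are all assumptions for illustration, and ps -ef is used here rather than ps aux.

```shell
#!/bin/sh
# Sketch: record timestamped process snapshots around the time cron restarts.
# snapshot and watch_cron are hypothetical names chosen for this example.

snapshot() {
    # $1 = log file to append to
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # Keep sampling even if a single ps call fails.
        ps -ef 2>&1 || echo "ps failed"
    } >> "$1"
}

watch_cron() {
    # $1 = log file, $2 = seconds between samples, $3 = number of samples
    i=0
    while [ "$i" -lt "$3" ]; do
        snapshot "$1"
        i=$((i + 1))
        if [ "$i" -lt "$3" ]; then
            sleep "$2"
        fi
    done
    return 0
}

# Example (started by hand shortly before the expected restart):
# watch_cron /tmp/cronwatch.log 5 120
```

Afterwards, searching the log for the cron lines shows the sample where its PID changes, and the snapshot just before that one lists what else was running at the time.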
 
I have some info about this problem.

The cron restarted itself as described above on 19 December and did not restart again until last week. This is really surprising because as far as I know I changed nothing on the system that could have affected this and it was restarting every day previously.

It is now restarting again every day. Today it restarted 3 times at 11:15, 14:45 & 16:15. When it restarted all the jobs running under that process disappeared. It seems to have also tried to restart at 16:45 and this time I managed to do a 'ps' command at exactly the right time and the following was displayed.

root 96000 169454 0 16:45:02 0:00 /usr/sbin/cron
root 169454 1 0 16:15:05 0:00 /usr/sbin/cron

It is as if the running cron process is trying to launch another cron process.

This time, however, the original cron process 169454 kept running, as did all the child processes. There is no sign of process 96000.

Can anyone suggest anything now?
 
Make sure there are no cron/at jobs that restart cron.

As root:
Code:
crontab -l | grep -v "^#"
at -l

I am thinking it's a script of some sort that's causing this. This is happening at :15 and :45, which is very suspicious.


 
Thank you for pointing me in the right direction.

I have found the problem.

I have a "clearup" job that runs in the root cron at these minutes 15,25,35,45,55,05. Its function is to kill certain user logins if they have been active for more than 12 hours. Normally it would never find any. Any processes it does find it writes into a temporary work file called /tmp/$PPID then it kills each of the processes in this file one by one.

I have another job that runs in the root cron at these minutes 00,15,30,45. It also uses a temporary work file called /tmp/$PPID.

At minutes 15 and 45 these jobs run together. They write their respective work files using their $PPID, which (and it was a surprise to me) is the same process number for both. If the timing is just wrong, then my clearup job, which has actually found no records, tests the /tmp/$PPID file and finds that there are records in it (created by my second job). The clearup job loops round this work file thinking it is finding processes to kill and tries to kill them. I have found things in the log like:
kill 11:01:01 - which fails
kill 00:00 - which fails
kill 0
kill 1

Where there are "kill 0" or "kill 1" commands, these tie in exactly with when the crons were restarted.

So there you go - problem sorted - dumb shell programming on my part.

Thanks again.
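
For anyone hitting the same collision, here is one possible way to guard against it, as a sketch rather than the actual fix used above. It names the work file after the job and the script's own PID ($$) instead of $PPID, and it refuses to pass anything to kill that is not a plain number greater than 1. Only the job name "clearup" comes from the post; the function names are made up, and the kill itself is replaced by an echo so the sketch is harmless to run.

```shell
#!/bin/sh
# Sketch only: avoid sharing /tmp/$PPID between cron jobs and never
# hand junk to kill. All function names here are hypothetical.

# Build a work-file name unique to this job and this run.
# $$ is the script's own PID; $PPID can be the same for sibling cron jobs.
workfile_for() {
    # $1 = job name
    echo "${TMPDIR:-/tmp}/$1.$$"
}

# Succeed only for a plain decimal PID greater than 1, so values such as
# "0", "1", "00:00" or "11:01:01" can never reach kill.
is_safe_pid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 1 ]
}

kill_listed() {
    # $1 = work file with one candidate PID per line
    while read -r pid; do
        if is_safe_pid "$pid"; then
            echo "would kill $pid"    # in a real job: kill "$pid"
        else
            echo "skipping '$pid'" >&2
        fi
    done < "$1"
}

# Example:
# work=$(workfile_for clearup)
# ... write candidate PIDs to "$work" ...
# kill_listed "$work"; rm -f "$work"
```

The numeric check matters because kill 0 signals every process in the caller's process group, which for a job started by cron may well include cron itself and its other children.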
 