
Check for 2 processes and kill them 2

Status
Not open for further replies.

yowza

Technical User
Nov 28, 2001
121
US
Thanks to this forum and scripting Gurus, I have written my first script. The script basically runs as a cronjob every 15 minutes, checks to see if syslogd is running and logging messages to the message file. If the file size doesn't change, it assumes syslogd is not running and stops and starts it. If after 75 seconds it can't start it, it emails a message. Below is the script. What I need help with is:

1. How can I check to see if syslogd process is running twice and if so, kill those processes? I have seen where syslogd will not start when 2 processes are spawned. I know the ps command etc but don't know how to put it in this script.

2. The mail message doesn't work. I looked at the man page but still can't figure it out. I would like to put a lot of text in the message, basically a msg to tell them to call 2 people and provide the phone numbers etc.

3. I would appreciate any suggestions on how to improve this script. I intend to start writing a lot of them and do not know the "better" or correct way of doing things.

4. When this script runs, is it hogging all the cpu time? When a process sleeps, does the system run other jobs?

Any help with the above questions is greatly appreciated!

#!/bin/ksh
#
# Script to check if the syslogd process is running and logging
# messages to the /var/adm/message file. A change in the file size of the
# message file indicates that logging is being performed.
# This script runs as a cron job every 15 minutes.
#
count=0
# Get initial filesize
first_time=`ls -al /var/adm/messages | awk '{print $5}'`
# Wait 15 secs. If file size hasn't changed,
# assume syslogd is not running. Stop and start it.
#
while [ "$count" -le 4 ]
do
    sleep 15 # wait 15 secs
    next_time=`ls -al /var/adm/messages | awk '{print $5}'`
    if [ "$first_time" = "$next_time" ]
    then
        count=$((count+1))
        /etc/init.d/syslog stop # stop syslogd
        /etc/init.d/syslog start # start syslogd
    else
        count=0 # reset count for next time.
        exit 0 # get out of this script. Everything is ok.
    fi
done
#
# If we get here, we have problems. Can't start syslogd.
if [ $count -gt "4" ]
then
mailx -s "SYSLOG SERVER NEEDS IMMEDIATE ATTENTION!!!!" someone@this-don't-work.com
exit 77
else
exit 88
fi


 
Q1:
Use ps with 'grep -c' (-c prints a count of the matching lines):
Code:
ps ax | grep -c syslogd
2
ps ax | grep syslogd
  195 ?        S      0:00 /sbin/syslogd -m 0
  748 ttyp4    R      0:00 grep syslogd
As you can see, grep syslogd also matches itself. Therefore, if you are looking for 2 syslogd processes, expect a result of 3.

To store the number in a variable use 'let':
Code:
let NUM_OF_SYSLOGS=`ps ax | grep -c syslogd`

Q2: Can you mail a file?
You might create a template-mail like:
Code:
Dear $0,

Today, $1 the server $2 has the $3 - problem.
...
And generate the values at runtime.
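One way the template idea could look in practice, as a minimal sketch: a small function prints the message body from its arguments, and the output is piped to mailx. The function name render_alert, the wording, and the address in the usage comment are placeholders, not anything from the original script.

```shell
#!/bin/sh
# render_alert RECIPIENT DATE HOST PROBLEM: print the message body.
# The function name and the wording are placeholders for illustration.
render_alert() {
    cat <<EOF
Dear $1,

Today, $2, the server $3 has the $4 problem.
EOF
}

# Typical use (address and subject are placeholders, not run here):
#   render_alert "operator" "`date`" "`hostname`" "syslogd" |
#       mailx -s "SYSLOG SERVER NEEDS ATTENTION" someone@example.com
```

Because the body arrives on standard input, mailx does not wait for interactive input, which matters when the script runs from cron.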

Q4: You may test this with 'top'.

Q3: A very informative guide to using bash is:

But its focus is more on how to use the features of bash than on script design.
 
If your OS is Solaris, you could use the pgrep
and/or pkill utilities to simplify your script.
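A rough sketch of how that might look, assuming a pgrep and pkill that support the -x (exact name match) option. The helper name count_exact is made up for illustration.

```shell
#!/bin/sh
# count_exact NAME: print the number of processes named exactly NAME.
# Assumes a pgrep that supports -x (exact match); prints 0 if none match.
count_exact() {
    pgrep -x "$1" 2>/dev/null | wc -l | tr -d ' '
}

# Possible use in the monitoring script (not run here):
#   if [ "`count_exact syslogd`" -gt 1 ]; then
#       pkill -x syslogd          # sends SIGTERM to every exact match
#       /etc/init.d/syslog start
#   fi
```

Unlike ps piped into grep, pgrep never reports itself, so no extra filtering is needed.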
 
1.
[ $(ps -e | awk '
$NF=="syslogd"{p[n++]=$1}
END{if(n==1){print 1;exit}
cmd="kill -15 ";for(i=0;i<n;++i)cmd=cmd" "p[i]
system(cmd " 2>/dev/null");print 0}
') -eq 0 ] && /etc/init.d/syslog start
2.
echo "
You have to call people1 and/or people2
blah blah
" | mailx -s "SYSLOG SERVER NEEDS IMMEDIATE ATTENTION!!!!" someone@this-does-work.com
3.
Unless you're in debug (or verbose) mode the assumption about syslog file size change within 15 minutes seems wrong to me.
4.
When a process sleeps, it is inactive for the requested period and uses no CPU time, so the system is free to run other jobs.

Hope This Help, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884
 
Stefanwagner,
Thanks for the response! If you do a
ps ax | grep -v grep | grep -c syslogd
you get rid of the other grep, and it will return 1. I tried the URL and can't get to it. I'll poke around and see if I can find it.
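Another way to avoid the extra grep, as a minimal sketch: let awk compare the command column exactly, so nothing but the real process name can match. The helper name count_procs is made up, and the sketch assumes the command name is the last field of each line (as in ps -e output) and an awk that accepts -v (nawk on older Solaris).

```shell
#!/bin/sh
# count_procs NAME: read ps-style lines on stdin and print how many
# have NAME as their last field (hypothetical helper, not a standard tool).
count_procs() {
    awk -v name="$1" '$NF == name { n++ } END { print n + 0 }'
}

# Typical use (assumes `ps -e` puts the command name in the last column):
#   NUM_OF_SYSLOGS=`ps -e | count_procs syslogd`
```

The "n + 0" forces a numeric 0 when nothing matches, so the result can always be used in a numeric test.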

yowza
 
PHV
Many thanks for the response. I don't understand exactly what #1 is doing (I mean step by step, not functionally), but I have to use the Bourne shell now, so I will have to rewrite it if I can. I didn't realize there was that much of a difference between the shells:)) My Linux box doesn't have ksh for some reason.
We have everything (routers, switches, 29 other servers) logging to the one syslog host. The file gets written to every second, so checking for a change in file size after 15 seconds is not a problem. I was concerned about eating up processing time while the cron job runs, and about scheduling it every 15 minutes.
Very helpful post PHV. Thanks again.
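For the Bourne shell rewrite, the two ksh-only pieces of the original script are the $((...)) arithmetic and the "==" comparison. A minimal sketch of portable replacements follows; the helper names file_size and next_count are made up for illustration.

```shell
#!/bin/sh
# Bourne-compatible pieces of the original ksh script (a sketch).

# file_size FILE: print the size in bytes (hypothetical helper);
# wc -c avoids depending on the column layout of ls -al.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# next_count N: print N+1 using expr, since the classic Bourne shell
# has no $((...)) arithmetic.
next_count() {
    expr "$1" + 1
}

# Strings are compared with a single "=" inside [ ]:
#   if [ "$first_time" = "$next_time" ]; then
#       count=`next_count "$count"`
#   fi
```

On a very large messages file, some wc implementations read the whole file to count bytes, so keeping the ls and awk approach may be preferable there.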

Yowza
 
yowza, here is the #1 snippet in Bourne shell with a step-by-step explanation:
Code:
[ `ps -e | awk '
 $NF=="syslogd"{p[n++]=$1}
 END{if(n==1){print 1;exit}
  cmd="kill -15 ";for(i=0;i<n;++i)cmd=cmd" "p[i]
  system(cmd " 2>/dev/null");print 0}
'` -eq 0 ] && /etc/init.d/syslog start
The basic idea is to have awk count the syslogd processes. If exactly one is found, all is OK (print 1). Otherwise awk kills any that were found and tells the shell to restart the service (print 0); this covers both zero and more than one syslogd.
The skeleton is:
[ `ps -e | awk '...'` -eq 0 ] && /etc/init.d/syslog start
If the awk program terminates with 'print 0' then restart the syslog service.
Now, the awk program:
$NF=="syslogd"{p[n++]=$1}
If the last field of the ps -e command output is "syslogd" then store the PID ($1) in array p and add 1 to a counter (n++).
END{if(n==1){print 1;exit}
When all lines have been processed, if only one (n==1) syslogd process found then say OK (print 1) and exit.
cmd="kill -15 ";for(i=0;i<n;++i)cmd=cmd" "p[i]
Prepare a shell command (cmd) to kill all the PIDs stored in array p.
system(cmd " 2>/dev/null");print 0}
Execute the prepared shell command with no error reporting (in case of zero syslogd running) and then signal (print 0) the parent shell to restart the service.
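To try the logic safely before letting it kill anything, the same awk program can be turned into a dry run. This is only a sketch: the function name plan_syslogd and the "ok" / "restart" words are made up, and it prints the kill command instead of running it.

```shell
#!/bin/sh
# plan_syslogd: read `ps -e` style lines on stdin and print what the
# snippet above would do, without killing anything (hypothetical helper).
plan_syslogd() {
    awk '
    $NF == "syslogd" { p[n++] = $1 }          # remember each syslogd PID
    END {
        if (n == 1) { print "ok"; exit }      # exactly one: nothing to do
        cmd = "kill -15"
        for (i = 0; i < n; ++i) cmd = cmd " " p[i]
        if (n > 1) print cmd                  # command that would be run
        print "restart"
    }'
}

# Typical use:
#   ps -e | plan_syslogd
```

Feeding it a saved copy of ps -e output is an easy way to check the zero, one, and many cases by hand.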

Hope This Help, PH.
 
Wow PHV, nobody could ask for more than that! Thanks very much for the detailed explanation. I really appreciate it. I have been looking on the internet for about an hour trying to find something that would help me decipher it.

Thanks again!!

Yowza
 