ksh script challenge 1

Status
Not open for further replies.

squash

MIS
May 21, 2001
99
US
OK, I know I have asked for simple solutions in the past to more than simple problems, but here goes.

I have a simple script. Here are the basics of it.

while :
do
/usr/local/bin/selftest |grep fail
date
sleep 120
done

Now the selftest part is a compiled program written long ago by a developer who is no longer around. It basically tests connections to many other boxes (xcom, mq, sna, mainframe, etc.) and returns either an ok or a fail for each test.
While running, it quite often hangs and stays hung until I <ctrl>c out of it, so my script hangs too. My actual script is a bit more complex in that it beeps if a fail is received. It works great. However, if the selftest hangs I do not know until the next time I look at the screen. That is why I have the date command, so I can easily see whether the output is current.

Now for the question. I would like some way to automatically tell if the selftest is hung, then kill that pid and restart my script. I am not sure which avenue to explore.

My best thinking is some sort of timer in my script that counts up to maybe 2 minutes and then resets; if the counter does not reset, the script gets restarted.

As always, we thank you for your support.

Doug
 
Hi squash,

I hope that there is a solution simpler than mine ... X-)


#
# KillProcesses - Kill a list of processes and all their children
# $* = Pid list
#

KillProcesses () {
typeset ppid Childs
for ppid in $*
do
# Children are the processes whose PPID (column 3 of ps -f) is this pid
Childs=$(ps -f | awk -v ppid=$ppid '$3==ppid {print $2}')
[ -n "$Childs" ] && KillProcesses $Childs
kill $ppid
done
}

#
# ExecWithTimeout - Execute a command with timeout
# $1 = Timeout value (seconds)
# $2- = Command
# If the command doesn't complete within the timeout interval,
# the process is killed and a status of 255 is returned
#

ExecWithTimeout () {
# set -vx # uncomment to trace execution while debugging
Fifo=/tmp/fifo.$$
rm -f $Fifo
mkfifo $Fifo
Timeout=$1
shift
Command="$*"
# \$? is escaped so it expands when the command finishes, not when the eval string is built
eval "( $Command ; echo \$? > $Fifo ) &"
CPid=$!
eval "( sleep $Timeout ; echo 255 > $Fifo ) &"
TPid=$!
# Whichever subshell finishes first writes its status to the fifo
read Status < $Fifo
KillProcesses $CPid $TPid > /dev/null 2>&1
rm -f $Fifo
return $Status
}

#
# The original loop
# selftest must terminate within 3 minutes
#

while :
do
ExecWithTimeout 180 "/usr/local/bin/selftest | grep fail"
[ $? -eq 255 ] && echo "selftest failed !!!"
date
sleep 120
done
Jean Pierre.
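
For comparison, here is a minimal sketch of the same timeout idea without the fifo: the command runs in the background, a watchdog subshell kills it when the limit expires, and wait collects the status. The name run_with_timeout, the rwt_ variables and the /tmp/rwt.$$ marker file are assumptions made for this illustration, not part of the script above. Unlike KillProcesses, it only kills the command itself, not any children the command started.

```shell
# run_with_timeout SECONDS COMMAND [ARGS...]
# Hypothetical helper (not from the post above). Returns the command's
# own exit status, or 255 if the watchdog had to kill it.
run_with_timeout () {
    rwt_limit=$1
    shift
    rwt_flag=/tmp/rwt.$$             # marker file: created only on timeout
    rm -f "$rwt_flag"

    "$@" &                           # start the command in the background
    rwt_cmd=$!

    (                                # watchdog: count seconds, then kill
        rwt_i=0
        while [ "$rwt_i" -lt "$rwt_limit" ]
        do
            sleep 1
            rwt_i=$((rwt_i + 1))
        done
        : > "$rwt_flag"              # record that the limit was reached
        kill "$rwt_cmd" 2>/dev/null
    ) &
    rwt_dog=$!

    rwt_status=0
    wait "$rwt_cmd" || rwt_status=$? # returns when the command exits or is killed

    kill "$rwt_dog" 2>/dev/null || : # fine if the watchdog already exited
    wait "$rwt_dog" 2>/dev/null || :

    if [ -f "$rwt_flag" ]
    then
        rm -f "$rwt_flag"
        return 255
    fi
    return "$rwt_status"
}
```

A call such as run_with_timeout 180 /usr/local/bin/selftest would then return 255 on a hang. Because the helper runs a single command, a pipeline like selftest | grep fail would need its output captured first or be wrapped in sh -c, and in that case the children of the wrapper are not killed.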
 
Jean Pierre,
I read your post last night and it was a bit much to digest. Today, however, a new light is upon me and I see some magic here.
I will play with this on my qa box and see what I can make of it.
I have a couple of questions, if I may be so bold, and probably a few more later.

1. In the last stanza, for the original loop, line 4 is [ $? -eq 255 ] && echo "selftest failed !!!". Could you explain this to me? I am having trouble digesting it.
Thanks,
Doug
 
Hi Doug,

The statement :

[ $? -eq 255 ] && echo "selftest failed !!!"

is equivalent to:

if [ $? -eq 255 ]
then
echo "selftest failed !!!"
fi

The function ExecWithTimeout returns a status ($?) of 255 if the command it runs times out.


In the same way,

[ $? -ne 255 ] || echo "selftest failed !!!"

is equivalent to:

if [ $? -ne 255 ]
then
: # NOP
else
echo "selftest failed !!!"
fi
Jean Pierre.
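
To see the three forms side by side, here is a small sketch that can be pasted into a shell. The function names and_form, if_form and or_form are made up for this illustration, and the status is passed in as $1 as a stand-in for the $? left by ExecWithTimeout.

```shell
# Each function takes an exit status as $1 and prints a message
# only when that status is 255. All names here are hypothetical.

and_form () {
    [ "$1" -eq 255 ] && echo "timed out"
    return 0    # the && list itself yields 1 when the test fails; hide that
}

if_form () {
    if [ "$1" -eq 255 ]
    then
        echo "timed out"
    fi
}

or_form () {
    [ "$1" -ne 255 ] || echo "timed out"
}

for status in 0 1 255
do
    echo "status $status: and=[$(and_form $status)] if=[$(if_form $status)] or=[$(or_form $status)]"
done
# prints:
# status 0: and=[] if=[] or=[]
# status 1: and=[] if=[] or=[]
# status 255: and=[timed out] if=[timed out] or=[timed out]
```

One small difference: when the test fails, the && form leaves a non-zero $? behind, while the if form leaves 0. That does not matter in the loop above, since the next command is date.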
 
You can write your own Korn shell script to
see if various servers are up, with no hangs,
using the following:


TMP_FILE=/tmp/ping.$$   # scratch file for the ping output
for REMOTE_HOST in (list your servers here)
do
ping ${REMOTE_HOST} -n 5 > ${TMP_FILE}
ERR=`grep -c " 0% packet loss" $TMP_FILE`
if [[ $ERR != 1 ]] ; then
(action if no connection)
fi
done


The command ping (host) -n 5 pings the host
only 5 times and then reports 0%, 20%, 40%, 60%, 80% or 100% packet
loss.
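
The leading space in the grep pattern matters, because "100% packet loss" also contains "0% packet loss". A minimal sketch of that check follows. The helper name link_ok is made up, and the three summary lines are typed by hand for illustration rather than captured from a real ping run; the exact wording differs between ping implementations, so check what yours prints.

```shell
# link_ok: hypothetical helper. Reads ping's summary text on stdin and
# succeeds only if it reports exactly 0% packet loss.
link_ok () {
    grep " 0% packet loss" > /dev/null
}

# Hand-written sample summaries (assumed format, not real ping output).
for summary in \
    "5 packets transmitted, 5 packets received, 0% packet loss" \
    "5 packets transmitted, 4 packets received, 20% packet loss" \
    "5 packets transmitted, 0 packets received, 100% packet loss"
do
    if echo "$summary" | link_ok
    then
        echo "ok:   $summary"
    else
        echo "FAIL: $summary"
    fi
done
```

Only the first sample line is reported as ok; the 20% and 100% lines fail because the character before "0%" is a digit, not a space.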
 