Strange restart behavior 2

Status
Not open for further replies.

Tardigrade

Programmer
Aug 17, 2000
53
GB
Hello, here's a weird problem...

When I restart almost any service (daemon) I get the following message, e.g. with sendmail:

# /etc/rc.d/init.d/sendmail restart
Shutting down sendmail: kill: (466) - No such pid [FAILED]
Starting sendmail: [ OK ]

Sendmail was indeed process 466.

When I reboot, almost all services exit with the same 'no such pid' message. There don't seem to be any adverse effects from this behavior, i.e. everything works fine, but it still worries me.

Any ideas?

PS. I'm running Red Hat 6.2
 
Do you have two instances of sendmail running after this happens?



Jay
 
I do get the same messages when shutting down. I don't get 2 instances of sendmail running.

If I try to stop a service I get the same message, but the service disappears from the process table as if the action has worked.

It does seem to be a purely 'cosmetic' error and doesn't seem to affect the operation of the machine but it still worries me severely.

Cheers
 
Tardigrade,

The reason I asked about shutdown is that the shutdown process is guaranteed to run as root -- so as you're getting the same problem there it's not likely to be a permissions problem.

If, as root, you do a 'kill 466' (466 being the sendmail PID, obviously) what happens then?
 
kill <sendmail pid> does as you would expect. I've had a look through the scripts in /etc/rc.d/init.d and instead of kill they use something called 'killproc', which I can find very little information on (there are no man pages for it on my machine or on any machines I can access).

Cheers
 
"killproc" is a shell function defined in the file /etc/rc.d/init.d/functions. This file is sourced into the various init.d scripts with a line like:
[tt]
. /etc/rc.d/init.d/functions
[/tt]
From what I can see, it gets a list of processes that are running for a named program and then tries to kill each process. It looks like it is picking up an extra PID from somewhere that doesn't exist after sendmail has been killed.
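
Because "killproc" only exists once that file has been sourced, it has no man page and no binary of its own. The sketch below illustrates the mechanism with a throwaway file standing in for /etc/rc.d/init.d/functions; the file contents and the name killproc_demo are made up for illustration and are not the real Red Hat code.

```shell
#!/bin/sh
# Stand-in for /etc/rc.d/init.d/functions (hypothetical contents).
lib=$(mktemp)
cat > "$lib" <<'EOF'
killproc_demo() {
    echo "would stop: $1"
}
EOF

# Source the file into the current shell, as the init.d scripts do.
. "$lib"
rm -f "$lib"

# The function is now defined in this shell, even though no
# command of that name exists anywhere on the PATH.
killproc_demo sendmail
```

On the real machine, something like grep -n killproc /etc/rc.d/init.d/functions should show where the actual definition starts.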

I don't have sendmail running on any of the Linux servers I have access to. However, one thing you could do is edit the "functions" file and turn on shell debugging. This will give you an idea of how it is getting the process list, and what could be done to fix things. (Or, at least, make sure that nothing horrible is going to happen if you leave things as they are.)

Before starting, copy "functions" to "functions.orig", or something similar.

Look for the following line in the "functions" file:
[tt]
killproc() {
RC=0
[/tt]
This is at line 110 in the file on my Mandrake 7.1 box.

Immediately following this line add a new line:
[tt]
killproc() {
set -x # Turn on shell debugging.
RC=0
[/tt]

Now go to the end of the function. Look for the line that contains a single "}" character before the "pidofproc" function is defined.

Immediately before the "return $RC" line that precedes it, add a new line (anything placed after "return" would never run):
[tt]
set +x # Turn off shell debugging.
return $RC
}
[/tt]

Now run the sendmail script with "stop" and you will get some extra information sent to your screen. Hopefully, this will help diagnose what the problem is.
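
As a rough sketch of the same idea without touching the "functions" file, tracing can also be switched on around a single call. The names trace_call and demo_stop are hypothetical, and demo_stop is only a stand-in for a function such as killproc; this has not been tried against the real init scripts.

```shell
#!/bin/sh
# Hypothetical stand-in for a function such as killproc.
demo_stop() {
    pid=$1
    RC=0
    [ -n "$pid" ] || RC=1   # fail if no PID was supplied
    return $RC
}

# Run any command with shell tracing on, then turn tracing off
# again while preserving the command's exit status.
trace_call() {
    set -x
    "$@"
    status=$?
    { set +x; } 2>/dev/null   # hide the trace line for "set +x" itself
    return $status
}

trace_call demo_stop 513
echo "exit status: $?"
```

The trace lines go to standard error, prefixed with "+", in the same style as debugging output from set -x anywhere else. Running the whole script under sh -x should give a similar trace for everything it sources, at the cost of much more output.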

As I said, I've not got sendmail on any Linux servers that I have access to, so I've not been able to try this out. If you need any help deciphering the output that you get, could you copy and paste the output into a new post?

Hope this helps.
 
OK we're getting somewhere....

I've tried the above; here's the relevant section of the output:

[tt]+ pidlist=513
+ pid=
+ [ -d /proc/513 ]
+ pid= 513
+ [ 513 != ]
+ [ color = verbose ]
+ [ 1 = 1 ]
+ ps h 513
+ kill -TERM 513
+ usleep 100000
+ ps h 513
+ sleep 1
+ ps h 513
+ sleep 3
+ ps h 513
+ kill -KILL 513
kill: (513) - No such pid
+ ps h 513
+ RC=0
+ [ 0 -eq 0 ]
+ failure sendmail shutdown
+ rc=0
+ [ -z ]
+ initlog -q -n /etc/rc.d/init.d/sendmail -s sendmail shutdown -e 2
+ [ color != verbose ]
+ echo_failure[/tt]

and here's the relevant section of the init.d script:

[tt]if [ "$notset" = "1" ] ; then
    if ps h $pid>/dev/null 2>&1; then
        # TERM first, then KILL if not dead
        kill -TERM $pid
        usleep 100000
        if ps h $pid >/dev/null 2>&1 ; then
            sleep 1
            if ps h $pid >/dev/null 2>&1 ; then
                sleep 3
                if ps h $pid >/dev/null 2>&1 ; then
                    kill -KILL $pid
                fi
            fi
        fi
    fi
    ps h $pid >/dev/null 2>&1
    RC=$?
    [ $RC -eq 0 ] && failure "$base shutdown" || success "$base shutdown"
    RC=$((! $RC))[/tt]

It seems that my version of ps is not working. When I type 'ps h <pid>' at the command line I get:

[tt]ERROR: Unsupported option (BSD syntax)[/tt]

This fools the script into thinking that the process is still there! The next problem is: how do I fix ps?
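
The trace fits this reading: every 'ps h 513' call appears to have returned exit status 0 even after the process had gone, so the script went on to 'kill -KILL' and then reported a failure. As a minimal sketch only (this is not what the Red Hat script does), a liveness check can avoid ps altogether by using 'kill -0', which sends no signal and only reports whether the PID can be signalled. The helper name is_alive is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the PID exists and we are allowed
# to signal it. Signal 0 performs the checks without sending anything.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Demonstrate on a throwaway background process.
sleep 30 &
pid=$!

is_alive "$pid" && echo "running"

kill -TERM "$pid"
# Reap the child so that the PID really disappears from the table.
wait "$pid" 2>/dev/null || true

is_alive "$pid" || echo "gone"
```

One caveat: for a non-root user, 'kill -0' also fails on processes owned by someone else, so this check is only a fair substitute when run as root, as the init scripts are.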

Thanks for all the help that's got me this far.
 
OK, I'm not sure which version of 'ps' you will be running. I'm working on a Mandrake version of Linux, which is based on RedHat, but it looks like we've got different versions of 'ps'. My 'ps' understands the "h" option... In fact, I thought that the standard 'ps' command shipped with most Linux distributions was the BSD-like version that was part of the "procps" package.

Can you run a couple of commands for me? First, what do you get if you enter "which ps" at the command line? Next, as "root", run "rpm --query -f /bin/ps". (Change the path to "ps" to be the one that was provided by the "which" command.) This will tell you which package is providing the "ps" command. If it's procps, then there is something wrong, and you might want to try upgrading to the latest version of procps. If it isn't procps, can you post back with the name of the package?

If you want to install/reinstall procps, you should be able to find it on the RedHat 6.2 CD. If you can't find it, you can download the latest procps package from
Let us know how you get on.
 
Thank you, thank you, thank you.

I did as you said and it seemed that I already had the latest version of procps. So after a bit of fun and games reinstalling the procps package (/bin/ps had an 'immutable' flag set on it), it worked! Everything restarts without a peep.

I can only assume that ps got corrupted somehow although e2fsck didn't show up any problems.

Thanks again
 
I'm glad it's all working again :) Just out of interest, how did you remove the "immutable" flag from /bin/ps? Did you use chattr?
 
I had the same problem and re-installing the procps-xx.rpm worked perfectly.

My problem occurred while recompiling the kernel to 2.2.16, in case that helps anyone else with the same problem.

Thanks guys :)
 