tcpip network services not starting after reboot

Status
Not open for further replies.

LeeHagen

IS-IT--Management
Sep 20, 2002
US
I modified the /etc/rc.tcpip file to remove the comment character in front of the named service, then rebooted the server expecting named to start. Not only did named not start, none of the other services that come after the named line in /etc/rc.tcpip started either. I had made a copy of /etc/rc.tcpip before changing it, so I copied the original back over /etc/rc.tcpip and rebooted, expecting all the services that were running previously to start, but still nothing past named starts. SMIT has no problem reading /etc/rc.tcpip: when I run "Start configured tcpip daemons", it starts all the services that don't get started during the reboot. What could possibly be wrong here? We are running AIX 5.1 on Regattas.
 
Here is some additional information that I discovered. First, the rc.tcpip script runs OK from the command line, and all the services that are supposed to start are then running. Second, the rc.tcpip script appears to be run from /etc/inittab. Here is the inittab line that starts it:

rctcpip:a:wait:/etc/rc.tcpip > /dev/console 2>&1 # Start TCP/IP daemons

Third, the System Resource Controller also gets started by /etc/inittab. This is the line that starts that process:

srcmstr:23456789:respawn:/usr/sbin/srcmstr # System Resource Controller

For some reason the /etc/rc.tcpip script is not completing during bootup but works just fine from the command line.

Regards,

Lee Hagen
 
The a in the runlevel field of that inittab entry implies that this is an HACMP cluster. Entries at runlevel a will not be run until runlevel a is reached (spooky that).

You have to start HACMP to get to runlevel a.

Also, if this is the standby node in a rotating cluster, runlevel a is not reached until this node starts running a resource group (i.e., a failover takes place to this node).
If this is not a true 'idle' standby (i.e., it's doing some other non-HA stuff until you want it to handle a failover), then you will have to start these services yourself, which is usually done as part of your non-HA application startup script.
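
To see which inittab records init would skip at a given runlevel, something like the following sketch may help. It assumes the usual colon-separated record layout (identifier:runlevels:action:command) and that commented-out records begin with a colon. The function name entries_not_at_runlevel is made up for illustration and is not an AIX command. The file is passed as an argument so the check can be tried on a copy.

```shell
#!/bin/sh
# List inittab identifiers whose runlevel field does NOT include the
# given runlevel. Hypothetical helper, not an AIX-supplied command.
# usage: entries_not_at_runlevel RUNLEVEL [INITTAB_FILE]
entries_not_at_runlevel() {
    level=$1
    file=${2:-/etc/inittab}
    awk -F: -v lvl="$level" '
        NF < 4     { next }   # blank or malformed line
        $1 == ""   { next }   # assumed: a leading colon comments out a record
        # Assumption: an empty runlevel field means "all runlevels",
        # so only report records with a non-empty field lacking lvl.
        $2 != "" && index($2, lvl) == 0 { print $1 }
    ' "$file"
}
```

Running entries_not_at_runlevel 2 /etc/inittab on a box with the rctcpip record shown earlier in this thread would list rctcpip, since its runlevel field is a rather than 2. The current runlevel can usually be checked with who -r.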
 
Thanks sectorseven, for the runlevel tip, I don't "do" HACMP, just SP so never came across that before.
 
AIXSPadmin - No worries.

LeeHagen - Before you start doing things to HA boxes you really should get some HACMP training & documentation. Start with:

You probably want to start with the certification study guides, as they give good overviews of the stuff you need to know for certification and are useful in the real world too. The other redbooks will be even more useful once you've got the basics down.

Try to get your company to punt up some money for the HACMP training courses, they are worth it.
 
I've already read just about every document on HA including the documentation that came with HA and gets loaded during the installation. I've also read thru all the documentation on HMC. I'm scheduled to get outside training some time this year. It's really not all that difficult though. The real problem is that there are several thousand pages of documentation and it took me almost a month to get thru it and fully understand how it applies to our systems. We have it running quite well now and the servers including the apps fail over with no problems. Our next step will be creating multiple profiles so we can manage resources better during peak and off peak hours for each app.
 
Now that the discussion on HA is out of the way, does anyone out there have any idea why the /etc/rc.tcpip script is not running to completion during bootup? Like I said, it runs just fine from the command line. I also ran it in error checking mode, and no errors were reported except for the daemons that were already running, which is to be expected anyway. Someone else certainly must have run into this same problem. All the experts at IBM seem to be baffled by it. It's getting frustrating, and I'll gladly take educated guesses at this point.


Regards,

Lee Hagen
 
LeeHagen - The discussion on HA is not out of the way because that is the cause of your perceived problem. I've said this before and I will say it again. Your rc.tcpip entry in inittab is set to run at runlevel a. You will not be at runlevel a until HACMP is started and a service address is acquired. If you do a:
grep ':a:' /etc/inittab

you should see the following (note: this is on a system running HA classic; if you are running HA/ES there will be an es between the //):
rctcpip:a:wait:/etc/rc.tcpip > /dev/console 2>&1 # Start TCP/IP daemons
rcnfs:a:wait:/etc/rc.nfs > /dev/console 2>&1 # Start NFS Daemons
qdaemon:a:wait:/usr/bin/startsrc -sqdaemon
writesrv:a:wait:/usr/bin/startsrc -swritesrv
clinit:a:wait:/bin/touch /usr//sbin/cluster/.telinit # HACMP for AIX These must be the last entries in inittab!
pst_clinit:a:wait:/bin/echo Created /usr//sbin/cluster/.telinit > /dev/console # HACMP for AIX These must be the last entries in inittab!

HACMP itself will do a 'telinit a' during its startup and will wait until it can see the file /usr//sbin/cluster/.telinit; that is how it knows that runlevel a has been reached. You can see this yourself by examining the hacmp.out files on a running cluster, i.e.:
grep 'telinit a' /tmp/hacmp.out*

You should see that this command is run from acquire_service_addr.

The other possibility is that this box is not meant to be running HA but the rootvg image was cloned from an HA box. In this case you need to clean up the inittab: remove the clinit & pst_clinit entries, then change the other entries at runlevel a to runlevel 2.
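
For a box that really is not meant to run HA, that inittab cleanup could be sketched roughly as below. This is only an illustration: it reads a copy of the inittab and prints the cleaned version to standard output without changing anything. The function name demote_ha_entries is hypothetical. On a live AIX system the rmitab and chitab commands are the usual way to remove or change inittab records, so treat the awk filter as a way to preview the result rather than as the method.

```shell
#!/bin/sh
# Print a cleaned-up inittab: drop the clinit and pst_clinit records and
# move every record whose runlevel field is exactly "a" to runlevel 2.
# Hypothetical helper; reads a file, writes to stdout, changes nothing.
# usage: demote_ha_entries INITTAB_FILE
demote_ha_entries() {
    awk '
        {
            n = split($0, f, ":")
            # drop the two HACMP marker records
            if (f[1] == "clinit" || f[1] == "pst_clinit") next
            # rewrite only the runlevel field; the command text,
            # including any trailing # comment, is left untouched
            if (n >= 4 && f[1] != "" && f[2] == "a") {
                sub(/^[^:]*:a:/, f[1] ":2:")
            }
            print
        }
    ' "$1"
}
```

One way to use it is to copy /etc/inittab somewhere safe, run demote_ha_entries on the copy, and diff the output against the original before installing anything.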
 
sectorseveng,

That was exactly the problem. I should have read your previous response more carefully. This server is a test server that HA has been temporarily disabled on. I made the fixes to the /etc/inittab file and life is now great. Thanks much for your help.

Regards,

Lee Hagen
 
LeeHagen - No worries, sorry if my earlier replies came off as condescending but I take HA seriously.

I've spent this week writing some documentation on HACMP for use by the company's trainee/junior sysadmins. The first line of this documentation says "Do not even think of going anywhere near one of my production cluster builds unless you have read and understood the design and formulated a plan of action". :)
 