aix+hacmp: clstrmgr crashes on start

cettolox · Aug 29, 2008

Hi,

the clstrmgrES daemon crashes as soon as it is started.

I see it from the log:

lpar6# cat /tmp/clstrmgr.debug
Fri Aug 29 12:09:50 HACMP/ES Cluster Manager Version 5.3
Using ODMDIR=/etc/es/objrepos
Fri Aug 29 12:09:50 HA_DOMAIN_TYPE=HACMP
Fri Aug 29 12:09:50 ReadTopsvcs: called.
Fri Aug 29 12:09:50 GetObjects: Called with criteria:
Fri Aug 29 12:09:50 ReadTopsvcs: hbInterval = 1, fibrillateCount = 4, fixedPriLevel = 38, runFixedPri = 1 instanceNum = 20
Fri Aug 29 12:09:50 ReadTopsvcs: Calculated fixed priority is 39
Fri Aug 29 12:09:50 /usr/es/sbin/cluster/clstrmgr: Unrecognized argument '?'.
Fri Aug 29 12:09:50 die: clstrmgr on node 0 is exiting with code 2

The problem is the "Unrecognized argument" message: I cannot tell which argument it is talking about.
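To isolate the lines that explain the exit, the debug log can be filtered. This is only a minimal sketch: the helper name show_exit_reason is hypothetical, and the sample lines are copied from the log above so the snippet can run without a cluster node.

```shell
# Hypothetical helper: print only the log lines that explain why clstrmgr exited.
show_exit_reason() {
    grep -E 'Unrecognized argument|die:' "$1"
}

# Sample lines copied from /tmp/clstrmgr.debug in this post.
log=$(mktemp)
cat > "$log" <<'EOF'
Fri Aug 29 12:09:50 ReadTopsvcs: Calculated fixed priority is 39
Fri Aug 29 12:09:50 /usr/es/sbin/cluster/clstrmgr: Unrecognized argument '?'.
Fri Aug 29 12:09:50 die: clstrmgr on node 0 is exiting with code 2
EOF

show_exit_reason "$log"
rm -f "$log"
```

On the node itself the call would be show_exit_reason /tmp/clstrmgr.debug.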

I read it could be caused by a name resolution issue, but I cannot investigate any deeper.

Any hint?

Thanks in advance,

/Stefano

khalidaaa · Aug 29, 2008

What's in /tmp/hacmp.out? Anything useful?

You should be able to narrow down the problem by searching backward in /tmp/hacmp.out for the event that failed.

Is that a new cluster? Was it working before? What changes have you made lately?

Regards,
Khalid

cettolox · Sep 1, 2008

In hacmp.out there is nothing useful. The cluster is new: it was installed on the 1st of August and worked. I do not know what happened since then. What I see now is that the clstrmgrES daemon (which should always be running) cannot start: it crashes one second after being started, with the "Unrecognized argument" message that I described in the first post.

I found a post that seems related to my problem, but I cannot understand the answer. This is the post, maybe someone gets a hint.

Thanks,
/Stefano

Question
When I analyse the different messages, I think it is a DNS problem
Recorded using libct_ffdc.a
/usr/es/sbin/cluster/clstrmgr: Unrecognized argument '?'.
Unexpected termination of clstrmgrES
Halting system immediately!!!
Any idea?

Answer
I advise you to update your /etc/hosts first:
node1 @IP
node1_boot @IP
node1_stdby @IP
This way you will bypass DNS.

DSMARWAY · Sep 1, 2008

Hi

Have you created your cluster, and does it sync without any errors?

What does lssrc -g cluster report?

Does errpt -a report any errors for clstrmgrES?

Are you using DNS? If so, and you think it is a DNS problem, can you temporarily stop using it and then start your cluster?

Have you got the latest fixes for HACMP 5.3?

cettolox · Sep 1, 2008

It seems I have solved the problem... in a few minutes I'll describe what I did.

/Stefano

cettolox · Sep 1, 2008

The problem was that somehow the clstrmgrES subsystem was configured to start with an incorrect parameter: the startup arguments contained the "-d" switch, but it was not followed by a number (the debug level).

I performed the following steps.

1 - check the parameters of the subsystem

lpar6# odmget -q "subsysname like clstrmgrES" SRCsubsys
SRCsubsys:
subsysname = "clstrmgrES"
synonym = ""
cmdargs = "-d"
path = "/usr/es/sbin/cluster/clstrmgr"
uid = 0
auditid = 0
standin = "/dev/null"
standout = "/dev/null"
standerr = "/dev/null"
action = 2
multi = 0
contact = 3
svrkey = 0
svrmtype = 0
priority = 20
signorm = 0
sigforce = 0
display = 0
waittime = 15
grpname = "cluster"

2 - redirect the output and error streams to files to see what happens

chssys -s clstrmgrES -o /tmp/output.log -e /tmp/error.log

3 - read the logs to understand (and to find that the problem was related to the "debug" switch)

lpar6# more /tmp/error.log
/usr/es/sbin/cluster/clstrmgr: A flag requires a parameter: d

lpar6# more /tmp/output.log
/usr/es/sbin/cluster/clstrmgr: Unrecognized argument '?'.
Usage: clstrmgrES [-d debug_level]
-d debug_level Set the debugging level
-f log length Set the max log length
-p priority Set the process priority
-v version Set the cluster version
-w wait Set the stabilization wait time

4 - delete the "-d" from the argument list

chssys -s clstrmgrES -a ""

5 - restart all: it works now.
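The check in steps 1 and 3 can be sketched as a small script. This is only an illustration under stated assumptions: the helper names get_cmdargs and check_clstrmgr_args are hypothetical, the flag list (-d -f -p -v -w) is taken from the usage text in step 3, and the script only checks that each flag is followed by a value, not that the value is valid.

```shell
# Hypothetical helper: pull the cmdargs value out of odmget stanza text on stdin.
get_cmdargs() {
    awk -F'"' '/^[[:space:]]*cmdargs[[:space:]]*=/ { print $2 }'
}

# Hypothetical helper: report the first flag in the argument string that has no value.
# Assumption: every flag in the usage text (-d -f -p -v -w) requires a value.
check_clstrmgr_args() {
    # Intentionally unquoted: split the argument string into words.
    set -- $1
    while [ $# -gt 0 ]; do
        case $1 in
            -d|-f|-p|-v|-w)
                # A value is missing if nothing follows the flag...
                if [ $# -lt 2 ]; then
                    echo "flag $1 has no value"
                    return 1
                fi
                # ...or if the next word is itself a flag.
                case $2 in
                    -*) echo "flag $1 has no value"; return 1 ;;
                esac
                shift 2
                ;;
            *)
                echo "unexpected word: $1"
                return 1
                ;;
        esac
    done
    return 0
}

# On the cluster node the input would come from:
#   odmget -q "subsysname=clstrmgrES" SRCsubsys | get_cmdargs
# Here a stanza fragment copied from step 1 stands in for it.
args=$(printf 'subsysname = "clstrmgrES"\ncmdargs = "-d"\n' | get_cmdargs)
check_clstrmgr_args "$args" || echo "cmdargs need fixing: '$args'"
```

With the stanza from step 1 the check flags "-d" as having no value; after step 4 the cmdargs string is empty and the check passes.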

khalidaaa · Sep 1, 2008

Have a star for updating the problem!

Regards,
Khalid
