
HACMP Network Problem 1

Status
Not open for further replies.

khalidaaa

Technical User
Jan 19, 2006
2,323
BH
Hi All,

I was getting strange error messages from a two-node HACMP cluster. It seems that the standby link goes down for a while and then comes up again! (Each node has two links: one is the service link and the other is the standby.)

I was not able to telnet to the machine either!

I was getting these messages in the terminal window on the HMC:

[code]
Boot communication interface with address 129.14.1.150 is no longer available for use, due to either a boot communication interface failure or IP address takeover.

Boot communication interface 129.14.1.150 is now available.
[/code]


I rebooted both nodes and started the cluster again. I haven't had any more messages about the standby link yet, but I still can't telnet to the machine!

Any idea why this is happening?

Regards
Khalid
 
Hi,

Are you getting any errors in the error report (errpt -a | more) regarding the network interface experiencing the problem?

In /hacmp.out, are there messages indicating swap adapter events?

Are the speeds of the network cards and, if they are connected to a switch, of the switch ports set the same? Check on the server with entstat -d <interface>.

Are the link lights on, on both the network interface and the switch/hub?

Could it be that you have a dodgy RJ45 cable?


Hope this gives you a few areas to look at, if you haven't looked at them already.
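
The link and speed check suggested above can be narrowed to the few lines that matter. The following is only a sketch: `link_summary` is a made-up helper name, and it assumes the `entstat -d` output labels its fields "Link Status", "Media Speed Selected" and "Media Speed Running", as in the listings posted later in this thread.

```shell
# Reads `entstat -d <interface>` output on stdin and prints only the
# link state and media speed lines as "label=value".
# Hypothetical helper; the field labels are assumed from the entstat
# listings in this thread and may differ on other adapter types.
link_summary() {
    awk '
    /^Link Status/ || /^Media Speed (Selected|Running)/ {
        key = $0; sub(/ *:.*/, "", key)       # text before the colon
        val = $0; sub(/^[^:]*: */, "", val)   # text after the colon
        print key "=" val
    }'
}

# Intended use on the AIX host (left commented out here):
#   entstat -d en0 | link_summary
#   entstat -d en1 | link_summary
```

Running it against both adapters and comparing the result with the switch port settings covers the speed question without reading the full statistics dump.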
 
Oh yeah,

I've been getting this error message continuously in the errpt:

Code:
LABEL:          TS_UNS_SIN_TR
IDENTIFIER:     E7A89C7D

Date/Time:       Wed May 17 15:18:24 SAUST 2006
Sequence Number: 1188
Machine Id:      00CF359F4C00
Node Id:         s1cdbp
Class:           U
Type:            UNKN
Resource Name:   topsvcs         
Resource Class:  NONE
Resource Type:   NONE
Location:        
VPD:             

Description
Local adapter disabled after unstable singleton for long time

Probable Causes
Adapter misconfigured
Adapter can receive broadcast but not unicast messages
Adapter's IP address used by another adapter

Failure Causes
Network problem

        Recommended Actions
        Verify adapter configuration
        Verify network connectivity

Detail Data
DETECTING MODULE
rsct,threephs.C,1.191,5602                    
ERROR ID 
6xl7eb1EGlO2/UH1.K4U1/0...................
REFERENCE CODE
                                          
Adapter interface name
en0
Adapter offset
           1
Adapter IP address
129.14.1.150

And in /tmp/hacmp.out I was getting these messages:

Code:
May 17 15:31:50 EVENT START: fail_standby s1cdbp 129.14.1.150

:fail_standby[109] [[ high = high ]]
:fail_standby[109] version=1.8.1.5
:fail_standby[110] :fail_standby[110] cl_get_path
HA_DIR=es
:fail_standby[112] NODENAME=s1cdbp
:fail_standby[113] ADDR=129.14.1.150
:fail_standby[115] [ 2 -ne 2 ]
:fail_standby[121] set -u
:fail_standby[123] :fail_standby[123] dspmsg scripts.cat 335 Boot communication interface with address 129.14.1.150 is no longer available for use,\n due to either a boot communication interface failure or IP address takeover.\n 129.14.1.150
MSG=Boot communication interface with address 129.14.1.150 is no longer available for use,
 due to either a boot communication interface failure or IP address takeover.
:fail_standby[124] echo Boot communication interface with address 129.14.1.150 is no longer available for use, due to either a boot communication interface failure or IP address takeover.
:fail_standby[124] 1> /dev/console
:fail_standby[126] [[ s1cdbp = s1cdbp ]]
:fail_standby[129] saveNSORDER=UNDEFINED
:fail_standby[130] NSORDER=local
:fail_standby[130] export NSORDER
:fail_standby[134] delete_standby_route 129.14.1.150
:fail_standby[3] set -u
:fail_standby[5] down_if=129.14.1.150
:fail_standby[7] read a b NETWORK c
:fail_standby[7] cllsif -Sn 129.14.1.150
:fail_standby[11] IFS=:
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[12] cllsif -Sc
:fail_standby[12] [[ /dev/hdisk2 = 129.14.1.150 ]]
:fail_standby[12] [[ service != standby ]]
:fail_standby[16] continue
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[12] [[ 10.1.1.167 = 129.14.1.150 ]]
:fail_standby[12] [[ boot != standby ]]
:fail_standby[16] continue
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[12] [[ 10.1.1.150 = 129.14.1.150 ]]
:fail_standby[12] [[ service != standby ]]
:fail_standby[16] continue
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[12] [[ 129.14.1.150 = 129.14.1.150 ]]
:fail_standby[14] continue
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[12] [[ /dev/hdisk2 = 129.14.1.150 ]]
:fail_standby[12] [[ service != standby ]]
:fail_standby[16] continue
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[12] [[ 10.1.1.151 = 129.14.1.150 ]]
:fail_standby[12] [[ boot != standby ]]
:fail_standby[16] continue
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[12] [[ 129.14.1.151 = 129.14.1.150 ]]
:fail_standby[12] [[ standby != standby ]]
:fail_standby[12] [[ net_ether_01 != net_ether_01 ]]
:fail_standby[12] [[ s2cdbs != s1cdbp ]]
:fail_standby[20] continue
:fail_standby[12] read adapt type network net_type attrib node ip_addr hw_addr
:fail_standby[37] unset IFS
:fail_standby[39] return 0
:fail_standby[138] :fail_standby[138] cllsif -Scn 129.14.1.150
:fail_standby[138] cut -d: -f3
NETWORK=net_ether_01
:fail_standby[141] :fail_standby[141] odmget -qname=net_ether_01 HACMPnetwork
:fail_standby[141] sed s/"//g
:fail_standby[141] awk $1 == "alias" {print $3}
ALIASING=2
:fail_standby[141] [[ 2 = 1 ]]
:fail_standby[141] [[ UNDEFINED != UNDEFINED ]]
:fail_standby[149] export NSORDER=
:fail_standby[153] exit 0
May 17 15:31:50 EVENT COMPLETED: fail_standby s1cdbp 129.14.1.150 0

                        HACMP Event Summary
Event: fail_standby s1cdbp 129.14.1.150 
Start time: Wed May 17 15:31:50 2006

End time: Wed May 17 15:31:50 2006

Action:         Resource:                       Script Name:
----------------------------------------------------------------------------
No resources changed as a result of this event
----------------------------------------------------------------------------

May 17 15:33:51 EVENT START: join_standby s1cdbp 129.14.1.150

:join_standby[111] [[ high = high ]]
:join_standby[111] version=1.8.1.3
:join_standby[112] :join_standby[112] cl_get_path
HA_DIR=es
:join_standby[114] NODENAME=s1cdbp
:join_standby[115] ADDR=129.14.1.150
:join_standby[117] [ 2 -ne 2 ]
:join_standby[123] set -u
:join_standby[125] :join_standby[125] dspmsg scripts.cat 330 Boot communication interface 129.14.1.150 is now available.\n 129.14.1.150
MSG=Boot communication interface 129.14.1.150 is now available.
:join_standby[126] echo Boot communication interface 129.14.1.150 is now available.
:join_standby[126] 1> /dev/console
:join_standby[128] [[ s1cdbp = s1cdbp ]]
:join_standby[130] addback_route 129.14.1.150
:join_standby[3] :join_standby[3] cllsif -cSn 129.14.1.150
:join_standby[3] cut -d: -f3
:join_standby[3] uniq
NETWORK=net_ether_01
:join_standby[5] standby=129.14.1.150
:join_standby[10] clgetif -n 129.14.1.150
:join_standby[10] 1> /dev/null 2>& 1
:join_standby[11] [ 0 != 0 ]
:join_standby[16] :join_standby[16] clgetif -n 129.14.1.150
:join_standby[16] LANG=C
NETMASK=255.255.255.0 
:join_standby[17] :join_standby[17] clgetif -a 129.14.1.150
:join_standby[17] LANG=C
INTERFACE=en0 
:join_standby[22] addr=i129.14.1.150_s1cdbp
:join_standby[23] :join_standby[23] echo i129.14.1.150_s1cdbp
:join_standby[23] sed -e s/[./]/x/g
addr=i129x14x1x150_s1cdbp
:join_standby[24] VAR=$i129x14x1x150_s1cdbp
:join_standby[25] set +u
:join_standby[26] :join_standby[26] eval echo $i129x14x1x150_s1cdbp
:join_standby[26] echo UP
VAL=UP
:join_standby[27] set -u
:join_standby[29] [ UP != UP ]
:join_standby[37] ifconfig en0 129.14.1.150 netmask 255.255.255.0 up
:join_standby[133] exit 0
May 17 15:33:51 EVENT COMPLETED: join_standby s1cdbp 129.14.1.150 0

                        HACMP Event Summary
Event: join_standby s1cdbp 129.14.1.150 
Start time: Wed May 17 15:33:51 2006
End time: Wed May 17 15:33:51 2006

Action:         Resource:                       Script Name:
----------------------------------------------------------------------------
No resources changed as a result of this event
----------------------------------------------------------------------------
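
To see how often the standby interface flaps, the event start lines can be pulled out of the log on their own. This is a rough sketch, not an HACMP tool: `standby_events` is a hypothetical helper, and it assumes the "EVENT START:" line layout shown in the excerpt above.

```shell
# Reads hacmp.out text on stdin and prints one line per standby event:
# "<month> <day> <time> <event> <node> <address>".
# Hypothetical helper; it only matches the EVENT START lines for
# fail_standby and join_standby in the layout shown in this thread.
standby_events() {
    awk '/EVENT START: (fail|join)_standby / {
        print $1, $2, $3, $6, $7, $8
    }'
}

# Intended use on the cluster node (left commented out here):
#   standby_events < /tmp/hacmp.out
```

For the excerpt above this would list a fail_standby at 15:31:50 followed by a join_standby at 15:33:51, i.e. the interface was reported unavailable for about two minutes.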

Code:
# entstat -d en1
-------------------------------------------------------------
ETHERNET STATISTICS (en1) :
Device Type: 10/100/1000 Base-TX PCI-X Adapter (14106902)
Hardware Address: 00:11:25:bf:3a:08
Elapsed Time: 0 days 1 hours 19 minutes 49 seconds

Transmit Statistics:                          Receive Statistics:
--------------------                          -------------------
Packets: 14977                                Packets: 49082
Bytes: 3320807                                Bytes: 13634852
Interrupts: 0                                 Interrupts: 44618
Transmit Errors: 0                            Receive Errors: 0
Packets Dropped: 1                            Packets Dropped: 0
                                              Bad Packets: 0
Max Packets on S/W Transmit Queue: 12        
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 1

Broadcast Packets: 74                         Broadcast Packets: 31219
Multicast Packets: 0                          Multicast Packets: 478
No Carrier Sense: 0                           CRC Errors: 0
DMA Underrun: 0                               DMA Overrun: 0
Lost CTS Errors: 0                            Alignment Errors: 0
Max Collision Errors: 0                       No Resource Errors: 0
Late Collision Errors: 0                      Receive Collision Errors: 0
Deferred: 0                                   Packet Too Short Errors: 0
SQE Test: 0                                   Packet Too Long Errors: 0
Timeout Errors: 0                             Packets Discarded by Adapter: 0
Single Collision Count: 0                     Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 1

General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Adapter Data Rate: 2000
Driver Flags: Up Broadcast Running 
        Simplex 64BitSupport ChecksumOffload 
        PrivateSegment LargeSend DataRateSet 

10/100/1000 Base-TX PCI-X Adapter (14106902) Specific Statistics:
-----------------------------------------------------------------
Link Status : Up
Media Speed Selected: Auto negotiation
Media Speed Running: 1000 Mbps Full Duplex
PCI Mode: PCI-X (100-133)
PCI Bus Width: 64-bit
Latency Timer: 144
Cache Line Size: 128
Jumbo Frames: Disabled
TCP Segmentation Offload: Enabled
TCP Segmentation Offload Packets Transmitted: 229
TCP Segmentation Offload Packet Errors: 0
Transmit and Receive Flow Control Status: Disabled
Transmit and Receive Flow Control Threshold (High): 45056
Transmit and Receive Flow Control Threshold (Low): 24576
Transmit and Receive Storage Allocation (TX/RX): 16/48

Code:
# entstat -d en0
-------------------------------------------------------------
ETHERNET STATISTICS (en0) :
Device Type: 10/100/1000 Base-TX PCI-X Adapter (14106902)
Hardware Address: 00:11:25:bf:36:b9
Elapsed Time: 0 days 1 hours 28 minutes 58 seconds

Transmit Statistics:                          Receive Statistics:
--------------------                          -------------------
Packets: 6715                                 Packets: 43651
Bytes: 1128542                                Bytes: 9232013
Interrupts: 0                                 Interrupts: 41267
Transmit Errors: 0                            Receive Errors: 0
Packets Dropped: 0                            Packets Dropped: 0
                                              Bad Packets: 0
Max Packets on S/W Transmit Queue: 10        
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 0

Broadcast Packets: 13                         Broadcast Packets: 36165
Multicast Packets: 0                          Multicast Packets: 533
No Carrier Sense: 0                           CRC Errors: 0
DMA Underrun: 0                               DMA Overrun: 0
Lost CTS Errors: 0                            Alignment Errors: 0
Max Collision Errors: 0                       No Resource Errors: 0
Late Collision Errors: 0                      Receive Collision Errors: 0
Deferred: 0                                   Packet Too Short Errors: 0
SQE Test: 0                                   Packet Too Long Errors: 0
Timeout Errors: 0                             Packets Discarded by Adapter: 0
Single Collision Count: 0                     Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 0

General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Adapter Data Rate: 2000
Driver Flags: Up Broadcast Running 
        Simplex 64BitSupport ChecksumOffload 
        PrivateSegment LargeSend DataRateSet 

10/100/1000 Base-TX PCI-X Adapter (14106902) Specific Statistics:
-----------------------------------------------------------------
Link Status : Up
Media Speed Selected: Auto negotiation
Media Speed Running: 1000 Mbps Full Duplex
PCI Mode: PCI-X (100-133)
PCI Bus Width: 64-bit
Latency Timer: 144
Cache Line Size: 128
Jumbo Frames: Disabled
TCP Segmentation Offload: Enabled
TCP Segmentation Offload Packets Transmitted: 24
TCP Segmentation Offload Packet Errors: 0
Transmit and Receive Flow Control Status: Disabled
Transmit and Receive Flow Control Threshold (High): 45056
Transmit and Receive Flow Control Threshold (Low): 24576
Transmit and Receive Storage Allocation (TX/RX): 16/48

and yes the lights are on!!!

Regards,
Khalid
 
I see you have the cards set to auto-negotiate (1 Gb)? Are they connected to a switch? If so, what speed is the port set to on the switch?
If possible, don't leave the speed on the card set to auto-negotiate; force the speed, i.e. if it is 1 Gb then set the port to 1 Gb. Unfortunately, to change this you have to down the interface.
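
Before forcing a speed it is worth checking that the adapter actually accepts the value, since the legal `media_speed` settings differ between adapter types. A minimal sketch, assuming the list of legal values arrives one per line (as `lsattr -R` prints it); `speed_allowed`, `ent0` and the example value are placeholders, not taken from this cluster.

```shell
# Succeeds (exit 0) if the requested media_speed value ($1) appears as a
# whole line in the list of legal values read from stdin.
# Hypothetical helper; surrounding whitespace on each line is ignored.
speed_allowed() {
    awk -v want="$1" '
        { gsub(/^[ \t]+|[ \t]+$/, "") }
        $0 == want { found = 1 }
        END { exit !found }'
}

# Intended use on the AIX host (left commented out here):
#   lsattr -El ent0 -a media_speed        # show the current setting
#   lsattr -Rl ent0 -a media_speed | speed_allowed 100_Full_Duplex &&
#       chdev -l ent0 -a media_speed=100_Full_Duplex -P
# With -P the change is written to the ODM only and takes effect at the
# next restart, which avoids taking the interface down by hand.
```

The switch port then has to be set to the same speed and duplex, otherwise the mismatch itself can cause link problems.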
 
Thank you DSMARWAY :)

After long hours of troubleshooting I found out that the problem was with the IP address of the interface itself!

It was supposed to have the service address, but for some reason it was set up with the boot address! When I changed that, it worked.

I had to reboot the system as well to get rid of the annoying down and up of the standby interface.

Thanks a lot
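
An address mix-up like this one can be spotted by listing what the cluster definition expects for a node and comparing it with what is configured on the interfaces. The sketch below is an assumption-laden illustration: `node_addresses` is a hypothetical helper, and the field order (adapt:type:network:net_type:attrib:node:ip_addr:hw_addr) is inferred from the `read` statement in the fail_standby trace earlier in this thread, not from documentation.

```shell
# Reads colon-separated `cllsif -Sc` style records on stdin and prints
# "<attrib> <ip_addr> <adapter label>" for the node named in $1.
# Assumed field order, taken from the fail_standby trace:
#   adapt:type:network:net_type:attrib:node:ip_addr:hw_addr
node_addresses() {
    awk -F: -v node="$1" '$6 == node { print $5, $7, $1 }'
}

# Intended use on the cluster node (left commented out here):
#   cllsif -Sc | node_addresses s1cdbp
#   ifconfig -a        # compare with the addresses actually configured
```

If an interface carries an address that the cluster lists under a different role (boot, service or standby), that is the kind of mismatch described above.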
 