Tek-Tips is the largest IT community on the Internet today!

8300 lsp up, but GW not responding

Status
Not open for further replies.

wpetilli

Technical User
May 17, 2011
1,877
US
I have an S8300/G450 LSP. Some network gear was replaced and the system is currently in standalone. I can ping and get to the 8300, but the GW is not pingable. It's pingable from the 8300 but not from the network switch, and the interfaces are up/up. I tried SSHing from the 8300 CLI to the GW, but I'm only getting prompted for a password and not the username. Any ideas what this could be? Going to try rebooting it again.
 
IP address conflict? Maybe someone reused the G450 IP on some of this new network gear, which would also explain the unexpected prompt/response.

-CL
 
Bounced the unit and it came back fine. One thing that didn't happen: the H.323 phones didn't automatically re-register to the core and sat in discovery. These are 9611Gs, and we use DHCP to give each phone its list of registration IPs. I had to bounce the switch ports to force them to reboot. So the phones are not re-registering right away after a network reconvergence. Not sure where to go with that one. This is the first time we've had this issue at any site.
 
Sorry to pester on this one, but I'm having big issues at this site. My GW follows normal recovery rules and re-registers to the core, but the phones do not. They go into discovery and cycle through a few of the IPs, but never complete the registration. They are no longer registered to the local 8300, so it's like they are floating in space until they are fully rebooted.
 
They aren't registered anywhere after the convergence. They get the registration IPs from DHCP, and the list contains only the core CM and then the local LSP. They immediately de-register from the local LSP after the GW recovers, and they don't register to the core.
 
The network team ran captures and is seeing the phones reach a CLAN IP in the core and get a reply saying Request in Progress, but no registration is completed. Also, this CLAN IP isn't the first IP in the MCIPADD list in the DHCP options, so I have no idea why the phones are trying it. This CLAN has no alarms and has other registrations on it. The team put in an ACL that blocked the local site from connecting to this CLAN, and after a few minutes all the phones re-registered to the PE. No idea why this is happening.
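To confirm where a given CLAN address actually sits in the list the phones receive, the DHCP option string can be checked with a few lines of Python. This is only a rough sketch: it assumes the usual comma-separated `NAME=value` layout of the option 242 string, and the addresses in `EXAMPLE` are made up for illustration since the real option string isn't shown in this thread.

```python
from typing import Optional


def parse_option_242(value: str) -> dict:
    """Split a comma-separated NAME=value option string into {NAME: [values]}.

    A token without '=' is treated as a further value of the previous NAME,
    which is how a multi-address MCIPADD list is written.
    """
    settings = {}
    current = None
    for token in value.split(","):
        token = token.strip()
        if "=" in token:
            name, first = token.split("=", 1)
            current = name.strip().upper()
            settings[current] = [first.strip()] if first.strip() else []
        elif current is not None and token:
            settings[current].append(token)
    return settings


def gatekeeper_position(value: str, ip: str) -> Optional[int]:
    """Return the 1-based position of ip in the MCIPADD list, or None."""
    servers = parse_option_242(value).get("MCIPADD", [])
    return servers.index(ip) + 1 if ip in servers else None


# Hypothetical option string, NOT taken from the site in this thread.
EXAMPLE = "MCIPADD=10.1.1.10,10.1.1.11,10.22.0.5,MCPORT=1719,HTTPSRVR=10.1.1.20"

if __name__ == "__main__":
    print(parse_option_242(EXAMPLE)["MCIPADD"])
    print(gatekeeper_position(EXAMPLE, "10.1.1.11"))
```

If the address the phones keep hitting comes back as `None`, it isn't coming from MCIPADD at all, and the phones are presumably learning it some other way.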
 
The packet capture shows an H.225 RAS message saying gatekeeperReject (2) with reject reason neededFeatureNotSupported. This comes from a CLAN as well as the PE. Pull the plug and reboot the phone and it comes up with no problem.
 
Having zero luck gaining any ground with this. This weekend a different remote site off the same core went survivable due to a brief WAN outage. After the network recovery the GW registered back to the core with no issues, but the phones did not. They needed a full reboot to register back. Since the phones aren't registered anywhere for me to see the extensions/IPs, the only way I can reboot them remotely is to turn the inline power off and back on at the network switch.

Does busying/releasing/resetting the CLAN boards make sense to try? I'm thinking maybe they've been online for so long that they need a refresh.
 
Code:
status nr-registration all-regions                              Page   1 of  16

    NR  St     NR  St     NR  St     NR  St     NR  St     NR  St     NR  St

      1 en      19 en      37 en      55 en      73 en      91 en     109 en
      2 ad      20 en      38 en      56 en      74 en      92 en     110 en

ad = auto disabled
rd = registration disabled
--------------------------------------------------------------------------------
disable nr-registration (1 - 2000)
enable nr-registration all
enable mg-return all or network-region (1-250)
enable mg-return network-region 2 WARNING: There are no MGs in the region.
enable mg-return all (use this if no mg in nr 2)
--------------------------------------------------------------------------------
status nr-registration all-regions                              Page   1 of  16

    NR  St     NR  St     NR  St     NR  St     NR  St     NR  St     NR  St

      1 en      19 en      37 en      55 en      73 en      91 en     109 en
      2 en      20 en      38 en      56 en      74 en      92 en     110 en
--------------------------------------------------------------------------------
phones are now registered
--------------------------------------------------------------------------------
If this does not work, next step is to change network-region for ip-network-map
of failing network region, submit, change it back.

display ip-network-map                                          Page   1 of  63
                               IP ADDRESS MAPPING

                                               Subnet Network     Emergency
 IP Address                                    Bits   Region VLAN Location Ext
 --------------------------------------------- ------ ------ ---- -------------
 FROM: 10.22.0.0                               /16    2      n
   TO: 10.22.255.255
--------------------------------------------------------------------------------
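If you capture the `status nr-registration all-regions` screens to a text file, a short script can pick out any region that is not `en`. The following is a rough Python sketch that assumes the screen layout shown above (number and state pairs, with `ad` and `rd` as the only non-enabled states); the function names are illustrative and it only reads saved text, it does not talk to the SAT.

```python
import re

# Trimmed copy of the first screen above, used as sample input.
SAMPLE = """\
status nr-registration all-regions                              Page   1 of  16

    NR  St     NR  St     NR  St

      1 en      19 en      37 en
      2 ad      20 en      38 en
"""

# A region number followed by one of the states seen on the screen.
PAIR = re.compile(r"\b(\d+)\s+(en|ad|rd)\b")


def region_states(screen_text: str) -> dict:
    """Map each network region number to its state string."""
    states = {}
    for line in screen_text.splitlines():
        for number, state in PAIR.findall(line):
            states[int(number)] = state
    return states


def regions_needing_attention(screen_text: str) -> list:
    """Return the sorted region numbers whose state is not 'en'."""
    return sorted(nr for nr, st in region_states(screen_text).items() if st != "en")


if __name__ == "__main__":
    print(regions_needing_attention(SAMPLE))
```

Any region it reports is a candidate for the `enable nr-registration` step listed above.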

A great teacher, does not provide answers, but methods to teach others "How and where to find the answers"

bsh

40 years Bell, AT&T, Lucent, Avaya
Tier 3 for 30 years and counting
http://bshtele.com
 
Everything in my table shows 'en'. I'm familiar with some of these commands. To test failover and failback of remote sites I regularly use disable nr x and enable nr x. Whether it's a real WAN outage where the GW is forced into isolation or the disable/enable nr x commands, I have the same problem of the phones failing to re-register to the core. They register to the local S8300 with no problem, but on the failback they sit there not registered anywhere, showing discovering while scrolling through some of the IPs from DHCP. In packet captures the phones seem to be trying the same CLAN board over and over and getting an H.225 RAS message saying gatekeeperReject (2) with reject reason neededFeatureNotSupported.

This was never the case before; all our sites have worked pretty seamlessly. The only event that did happen was a few months back, when a network switch failed that half of our IPSI/Medpro/CLAN boards and one call server were physically connected to. The servers interchanged automatically and we were never out of service, because the other half of the core connections were on a redundant network switch. After the network switch was recovered, all boards checked out as in service. I see sockets connected to all my CLAN boards, but for some reason these remote site phones need to be fully rebooted to re-register and don't automatically fail back once the GW recovers.
 
These are commands to check for network regions whose phones are not registering back to the main.
This may work for you instead of the full reboot. These issues are due to an Avaya bug that Avaya has not fixed.

 
So I'd have to wait for another event where a system gets isolated to use these commands? Since it has happened at more than one remote site, I've focused my attention on the core location, starting with bouncing the CLANs.
 
There was a WAN event where the site went isolated. The WAN recovered and now the MG says PD and the S8300 is still active. Half the phones are back at the core and the other half are still at the 8300.

stat nr-reg all shows en for everything.
 
Code:
display system-parameters ip-options                            Page   1 of   4
                          IP-OPTIONS SYSTEM PARAMETERS

 IP MEDIA PACKET PERFORMANCE THRESHOLDS
    Roundtrip Propagation Delay (ms)    High: 800      Low: 400
                    Packet Loss (%)     High: 40       Low: 15
                    Ping Test Interval (sec): 20
    Number of Pings Per Measurement Interval: 10
                  Enable Voice/Network Stats? ?
 RTCP MONITOR SERVER
   Server IPV4 Address: 135.122.44.95   RTCP Report Period(secs): 5
               IPV4 Server Port: 5005
   Server IPV6 Address:
               IPV6 Server Port: 5005

AUTOMATIC TRACE ROUTE ON
           Link Failure? y
                                     H.323 IP ENDPOINT
 H.248 MEDIA GATEWAY                  Link Loss Delay Timer (min): 5
  Link Loss Delay Timer (min): 5        Primary Search Time (sec): 75
                                Periodic Registration Timer (min): 20
                              Short/Prefixed Registration Allowed? y
display system-parameters ip-options                            Page   2 of   4
                          IP-OPTIONS SYSTEM PARAMETERS

 Force Phones and Gateways to Active LSPs? n

 IP DTMF TRANSMISSION MODE
   Intra-System IP DTMF Transmission Mode: rtp-payload
                     Inter-System IP DTMF: See Signaling Group Forms

 HYPERACTIVE MEDIA GATEWAY REGISTRATIONS
   Enable Detection and Alarms? n

 
That is what my system is set for. I have had LSPs off this core for years and have never had any of these issues until the past month or so. The only event that has happened, as I mentioned earlier, is that the network switch that half of the core boards and the active server were plugged into crapped the bed. The only thing I can come back to is that the servers interchanged. They interchanged cleanly, and in 'status summary' on the web interface everything is as it should be. No alarms anywhere.

Today's issue was also the first time I've seen the network recover with the GW in a 'PD' state, which to me means it is in the process of recovering. That S8300 on the core showed y y as being active, and half the phones were registered to it and the other half were still on the 8300. The TG for the site was in service on both the core and the 8300. I was going to do a 'reset sys 4' from the local S8300 but decided to wait it out. Twenty minutes or so later it finally recovered, but the phones did not. They stayed in discovering and had to be fully rebooted.
 
Did you try enable mg-return all?

 
When the issue was in progress I didn't run that command... no. When I did the status nr-reg all they all were in 'en' status. I thought that enable mg-return all would only come into play if the status nr-reg all said 'ad'.
 
I bounced all the CLANs at the core and disabled two different NRs, and the failback worked as it should. Hopefully that was the trick with this saga.
 