Resilient problem

Status
Not open for further replies.

Ceaserx

Technical User
Apr 14, 2008
119
ZA
Hi

I have a client with 2 x MXe controllers running MCD 6 SP2. They are clustered and set up for resiliency. The main controller randomly fails over to the resilient controller during the day.

What could cause this? The switches don't show a connection drop, and we can still ping the controller while it is happening. Any ideas?
 
The sets will fail over when they lose "sight" of the primary controller. The problem can be routing on the network: the sets send a status message to the controller, and either it gets lost on the way or the return message gets lost on the way back to the set. I have seen incorrect routing on the network cause this, and it's not something the layer 2 switches would report on. It doesn't need to be a connection drop for the sets to fail over. I have also seen some posts in this forum about the embedded layer 2 port on an MXe controller going defective, which might also be the problem.
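
Since a single successful ping won't rule out the kind of intermittent loss described above, one option is to log probe results over a longer period and look for bursts of failures. The following is only a rough sketch: the `(timestamp, ok)` sample format and the function name `loss_windows` are made up for illustration, and it is not a Mitel tool. It just groups consecutive failed probes into windows that can be compared against the failover times.

```python
def loss_windows(samples):
    """Group consecutive failed probes into (start, end, count) windows.

    `samples` is a hypothetical list of (timestamp_seconds, ok) tuples,
    sorted by time, e.g. collected by pinging the controller from the
    phone VLAN once per second.
    """
    windows = []
    start = None   # timestamp of the first failure in the current run
    last = None    # timestamp of the most recent failure in the run
    count = 0      # number of failed probes in the current run
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
                count = 0
            last = ts
            count += 1
        elif start is not None:
            # A successful probe closes the current failure run.
            windows.append((start, last, count))
            start = None
    if start is not None:
        # The log ended while probes were still failing.
        windows.append((start, last, count))
    return windows
```

If the windows line up with the times the sets moved to the resilient controller, that points at the network path rather than the controller itself.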

You can't duct tape stupid.
 
What do the logs on the MCDs tell you? They should show messages if the primary MCD loses connectivity with a phone, or if the heartbeat messages between the controllers are failing. By default, the MCDs run a health check against each other every 60 seconds, and 5 failures trigger a failover. Both the number of failures and the time between checks are configurable in System Options.
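
To get a feel for what those defaults mean in practice, here is a small sketch of the counting logic. It assumes the failures have to be consecutive and that the first check runs one interval after the start; neither is confirmed in this thread, so treat it as an illustration and not as how the MCD actually implements it. The function name `failover_time` is made up.

```python
def failover_time(results, interval_s=60, max_failures=5):
    """Return the elapsed seconds at which failover would trigger, or None.

    `results` is a sequence of health-check outcomes (True = check passed).
    Assumption: only consecutive failures count, and a passing check
    resets the counter.
    """
    consecutive = 0
    for i, ok in enumerate(results, start=1):
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= max_failures:
            # The i-th check happens i intervals after the start.
            return i * interval_s
    return None
```

Under these assumptions, `failover_time([False] * 5)` gives 300, so the controllers would need to miss each other for about five minutes with the default settings before a failover is triggered.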
 
Thanks for the feedback.

I am going to try adjusting the timers and see if that helps in the meantime. The strange thing is that after rebooting the main controller, it seems to stabilize for a couple of days. I will log a call with Mitel to help investigate.
 
Definitely check the logs on the primary MCD. I don't think changing the timers is going to be a big help here. It sounds like there's something else going on, but without seeing the logs it's hard to say what.
 