CUCX 8x Split Brain & MTA & Notifier services stop if path lost to sub

Status
Not open for further replies.

MitelInMyBlood

Technical User
Apr 14, 2005
1,990
US
I'm trying to figure out if something is misconfigured between my Unity Connection (voice mail) pub & sub.

The pub is local (in house) and the sub is remote (offsite). The connection between sites is a leased gig circuit with a DS3 (45 Mbps) backup. There is normally a lot of traffic going across the path, but it has never been a problem before.

Yesterday we lost the gig pipe for several hours & the network traffic failed over to the backup DS3, swamping it. Latency, which is normally 4 or 5 ms, was suddenly 2300~2500 ms. I understand this.

During the time the gig path was down we began getting complaints of delayed MWI and delayed message delivery.

The gig path was restored sometime overnight, but when I came in this morning there was no MWI working & no messages being delivered, although CUCX was answering correctly.

Both PUB and SUB were able to ping each other (from the OS maint window) but the PUB was stuck in Split Brain Recovery state and couldn't see the sub. The SUB was answering calls but MTA and Notifier services were stopped on both pub & sub. Neither pub nor sub could "see" each other.

TAC had us restart the PUB from the CLI, which eventually resolved the problem.

My question is why did we experience problems locally? Why did losing the primary path to the CUCX sub (remote site) cause an initial slowdown of MTA and Notifier services on the local pub? Then why, after the network was restored, did MTA and Notifier services stop altogether, requiring manual intervention? With this experience behind us it would appear our voice mail is not as redundant as our VAR led us to believe.

Comments & thoughts on this appreciated.
Thanks!!


Original MUG/NAMU Charter Member
 
Anyone?

If Unity Cxn is somehow "dependent" on the integrity of the data connection between the local (in house) pub and the remote (external, 150 miles distant) sub (in a Data Foundry hot site) then what is the purpose of pub/sub redundancy? What we experienced was in essence a single point of failure in connectivity between sites (devices) that brought our VM to its knees.

I know why the link went down; what I'm struggling with is why Unity Cxn failed.



Original MUG/NAMU Charter Member
 
The supported round-trip delay for two Unity Connection servers across a WAN link is 20 ms max. When you don't meet that for as long as you did during your outage, all kinds of strange things can happen, including what you experienced. A 2500 ms delay is way too long for any kind of voice support and functionality, so there's your explanation.
Also, depending on your setup, the calls might still have been answered at the remote site, since it was still up even though slow.
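
If it helps to put numbers on it, here is a rough Python sketch that compares a batch of round-trip samples against that limit. The 20 ms figure is just the one quoted above (check Cisco's requirements for your release), and check_rtt is a made-up helper name, not anything that ships with Unity Connection. The sample values are the ones reported earlier in this thread.

```python
from statistics import median

# Limit quoted in the reply above; treat it as an assumption and
# confirm it against the vendor requirements for your release.
RTT_LIMIT_MS = 20.0


def check_rtt(samples_ms, limit_ms=RTT_LIMIT_MS):
    """Summarize round-trip samples (in ms) against a latency limit."""
    if not samples_ms:
        raise ValueError("need at least one RTT sample")
    over = [s for s in samples_ms if s > limit_ms]
    return {
        "median_ms": median(samples_ms),
        "worst_ms": max(samples_ms),
        "fraction_over": len(over) / len(samples_ms),
        "within_limit": not over,  # True only if no sample exceeds the limit
    }


# Figures from this thread: 4-5 ms normally, 2300-2500 ms on the DS3.
normal = check_rtt([4, 5, 4, 5])
outage = check_rtt([2300, 2400, 2500])
print(normal["within_limit"], outage["within_limit"])
```

You would feed it samples from whatever you already use to measure latency between the two sites; the point is only that the normal path has plenty of headroom and the swamped DS3 was over the limit by two orders of magnitude.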
 
Thanks! They were being answered, but MWI was painfully slow, measured in quarter-hours :) Obviously the redundant path needs to be the same speed as the primary, i.e., gig. Coincidentally, we just got the 2nd gig path added today, but now the IT folks look upon that 2nd gig path as more bandwidth and so want to flood both paths with cubic volumes of constant ongoing backup traffic to offsite storage. Seems bandwidth begets abuse thereof, which begets more bandwidth followed by even more abuse.
Vicious circle.

sigh..

Original MUG/NAMU Charter Member
 
As long as your round-trip delay doesn't suffer, you are OK using the 2nd link for backup traffic. Unless you are a 24-7 operation, they should be doing all that off hours when usage is low and users are home.
 
We are 24/7 (major utility co)

Original MUG/NAMU Charter Member
 