MLSM going down 1

Jaimegu · Jan 25, 2005

Hi there!
We have frequent problems with MLSM, which goes down and sometimes also affects TFE (in those cases we need to restart or reboot SCCS).
The message is:

41505 Major System Manager Service Service MLSM_Service terminated unexpectedly, rc=243

The problem is more frequent under high traffic.
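To check the link with traffic, the terminations can be counted per hour and compared with call volume. The following is only a sketch: the regular expression is inferred from the single event line quoted above, and the `(datetime, line)` input format is an assumption, since how timestamps appear depends on how the event log is exported.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern inferred from the one event line quoted in the post;
# the real log may have more columns (this is an assumption).
EVENT_RE = re.compile(
    r"^(?P<event_id>\d+)\s+(?P<severity>\w+)\s+.*?"
    r"Service\s+(?P<service>\w+)\s+terminated unexpectedly,\s*rc=(?P<rc>\d+)"
)


def parse_termination(line):
    """Return the fields of a 'terminated unexpectedly' event, or None."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    return {
        "event_id": int(m.group("event_id")),
        "severity": m.group("severity"),
        "service": m.group("service"),
        "rc": int(m.group("rc")),
    }


def crashes_per_hour(events, service="MLSM_Service"):
    """Count terminations of `service` per clock hour.

    `events` is an iterable of (datetime, line) pairs; obtaining the
    timestamp is left to the caller (hypothetical input format).
    """
    counts = Counter()
    for when, line in events:
        parsed = parse_termination(line)
        if parsed and parsed["service"] == service:
            hour = when.replace(minute=0, second=0, microsecond=0)
            counts[hour] += 1
    return counts
```

Hours with the most terminations can then be set against the busiest hours in the traffic reports.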
Our platform is:
Meridian1 81C FNF Succession 3.0 patched last week.
SCCS 4.2 SU13
Genesys Tserver 7.0 (same problem with 6.5)
(Not all the CDNs are in Genesys)

We have worked with Genesys and Nortel tier 3. They suspected problems with transfers on Genesys-acquired sets (IVR and softphone). We addressed those but we still have the problem.
Do you have any clues?

TheWorkingMan · Jan 25, 2005

Make sure that CSQI and CSQO in LD 17 are set to 255. You can print them in LD 22: REQ = PRT, TYPE = PARM. I would also take link traces to see what is going on. I assume Nortel has done this? Print out your PARM and see whether it is something simple.

TWM

Jaimegu · Jan 26, 2005

Four months ago we got a patch that lets those buffers on the Meridian go above 255.
Both parameters are set to 2000 (they were at 4000 until 2 weeks ago, when Nortel changed them).
Nortel also changed NCR to 15000 (from 32000) and MGCR to 25 (from 720).

TheWorkingMan · Jan 26, 2005

You could have a NIC going bad. Have you run a sniffer trace to look for problems like that?

TWM

lopezdba · Jan 27, 2005

Check whether there is anything on the ELAN other than the PBX and your SCCS server. All third-party applications (e.g. Genesys, Symon, Nice) should be on the CLAN. Any added traffic on the ELAN can cause this problem.

Jaimegu · Jan 27, 2005

Well:
1. The ELAN is on a Cisco switch with 3 patch cords (2 CPUs and 1 SCCS) and no uplinks. All ports are set to 10 Mbps full duplex, and so is the SCCS ELAN NIC.
2. I don't think it is a failing NIC, because the problem occurs on 2 different servers.
On the CLAN we run MRTG, and we will put a sniffer on it just for this problem. We also use performance monitoring (Windows Administrative Tools) on both NICs and we see very low traffic. I will change the port on the switch.
Any other ideas?

sandyml · Jan 27, 2005

The ELAN needs to be 10Mbps Half Duplex. You might try correcting those settings.

kirk10 · Jan 27, 2005

We have the same problem at our site.
Only with us it is the ELAN and not the MLSM service which
goes down.
We can't resolve the problem either.
Did you replace the cables, the network switch, the transceivers...?
We did all these things and it didn't help.
Did you also take a link trace and an AML trace?

Can you keep me up to speed if you find a solution for this problem?
And if we find a solution I will mail you....

Greetings Marco

lopezdba · Jan 31, 2005

Marco,
Check the settings on the ELAN NIC and make sure it is set to 10BASE-T at half duplex on the server. Also make sure there is nothing connected to the ELAN other than your PBX and the SCCS server.

kirk10 · Jan 31, 2005

We did that already; we think the problem is the CCA StatServer from Genesys.
If we stop it, the ELAN stays enabled.
If we start the StatServer, the ELAN clears its threshold.
We are now investigating where the problem is.
I will let you know.

TheWorkingMan · Jan 31, 2005

I had a patch in an M1 that caused this problem. The patch was on a 25.40b system. I think the patch was for agents getting calls while in Not Ready. I will try to find the MPLR number.

TWM

Jaimegu · Feb 1, 2005

Sandyml:
I was told yesterday that the CPP (Pentium II) has Fast Ethernet (100 Mbps full duplex).

Do you have a different idea?

TheWorkingMan · Feb 2, 2005

The CPP will run 100/full, this is true. Make sure that the ports are set this way. If you are going through a Cisco switch this is very important. Make sure the switch ports are set to 100/full, not auto-negotiate; auto-negotiation will cause issues.
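As a rough illustration of that check, the sketch below compares the two ends of each ELAN link and reports any end left on auto-negotiation as well as any speed or duplex disagreement. The `PortSetting` structure and the port names are hypothetical; the values would have to be read by hand from the switch configuration and from the NIC properties on the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortSetting:
    """One end of a link (hypothetical structure, filled in by hand)."""
    name: str
    speed_mbps: int   # e.g. 10 or 100
    duplex: str       # "half" or "full"
    autoneg: bool     # True if the port is left on auto-negotiate


def link_problems(a, b):
    """Return a list of human-readable problems for the link a <-> b."""
    problems = []
    for port in (a, b):
        if port.autoneg:
            problems.append(f"{port.name}: auto-negotiation is enabled")
    if a.speed_mbps != b.speed_mbps:
        problems.append(
            f"speed mismatch: {a.name}={a.speed_mbps} vs {b.name}={b.speed_mbps}"
        )
    if a.duplex != b.duplex:
        problems.append(
            f"duplex mismatch: {a.name}={a.duplex} vs {b.name}={b.duplex}"
        )
    return problems


if __name__ == "__main__":
    switch_port = PortSetting("switch port (example)", 100, "full", False)
    server_nic = PortSetting("SCCS ELAN NIC (example)", 100, "half", False)
    for problem in link_problems(switch_port, server_nic):
        print(problem)
```

The check only cares that both ends agree and are forced; which speed and duplex to force depends on the hardware, as discussed above.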

TWM

MarkDv · Feb 9, 2005

I solved a TFE problem 2 weeks ago by bumping up the Calls per hour in Stat Config > Hist Stats > Parameters. BEFORE you change this, calculate your required database size so as not to exceed your actual one. You may need to play with some other statistics. It solved our TFE high-traffic problem.

Mark Dvoracek
Lowes Companies Inc.
Formerly Sprint, Williams, Wiltel, Pac Tel, etc....

TheWorkingMan · Feb 10, 2005

Mark,

Did you solve the problem? How did you find it?

Jaimegu · Apr 18, 2005

Well, it's good practice to share solutions.
After this huge headache, here is what we went through:
1. They suggested changes in scripting to reduce SCCS operations (no effect at all).
2. Some changes in Genesys: an upgrade to 7.0 (no effect), and changes in the softphone and IVR programs to increase the time between transfer attempts (cleaner MLSM logs but the same issues).
3. Restarting the switches (this stopped the MLSM problems but exposed ELAN problems: switch link down).
4. Nortel applied some patches on the Meridian, and the last one, written for our PBX, worked. I don't have the description, but it is related to synchronizing communication: chgtim_1.cpp

sandyml · Apr 18, 2005

Many thanks for your follow-up on resolution, Jaimegu. You are very right, it is good practice to share solutions. I for one pay particular attention. Thanks again.
