Mitel Ops manager error message 2

Status
Not open for further replies.

sion06

Technical User
Jul 26, 2006
6
GB
When in Moves, Adds & Changes in Mitel Ops Manager I receive an error message saying "An internal system error has occurred. Please see your system administrator". This happens after I have added the Location and Directory number and pressed the Change button or any other available option. It doesn't allow me to search, and I cannot access any information regarding the customer. Can anyone help?
Many Thanks
 
Some questions for you first.

1) What type of system is OpsMan trying to talk to, SX2000 or 3300?

2) What version of OPSMAN?

3) When in the MACs forms specifically which tab are you attempting to use (Add/Change/Move/Swap)?

4) Can you open the Utilities screen (Config/Directory Mgmt/Telephone Directory Utilities)? If so, do you see any changes to be propagated?

5) Are you able to perform a network synch to the network elements?

6) Do you know how to look at the server application event logs?
 
1) What type of system is OpsMan trying to talk to, SX2000 or 3300?
SX2000

2) What version of OPSMAN?
Build 7.4.1.2

3) When in the MACs forms specifically which tab are you attempting to use (Add/Change/Move/Swap)?
All of them

4) Can you open the Utilities screen (Config/Directory Mgmt/Telephone Directory Utilities)? If so, do you see any changes to be propagated?
Yes, I have tried to clear some changes but end up with the same error message

5) Are you able to perform a network synch to the network elements?
It attempts, but fails.

6) Do you know how to look at the server application event logs?
Not 100% sure. Is it via System Management?

 
OK.
First of all I do not recommend people ever use the Utility menu to delete changes that for whatever reason fail. You need to fix them, which means learning why they failed and being able to interpret what OPS is telling you so you can fix them. If you've been in the habit of deleting changes then your SX2000 tel directory is likely considerably out of synch with the directory in Ops Man. Anyway, enough about that for right now. That's not your problem.

Let's try some troubleshooting:
From the main screen in OPSMan go to the Utilities tab (on top, next to Help) and then click on Manage, then click on the SX2000 system you're after and then click on the Launch button. What happens? Can you get logged in to a Maintenance session on the SX2000 via this method? Yes or no.

If you can, then we have connectivity between OPS and the PBX so go ahead and close this window.

But if not, then close this window, click on Configuration, scroll down to highlight Network Elements, then scroll down and click on the Editor. When this screen opens, click on either Next or Previous until you see the Element Name you want appear in the first window (it will be greyed-out, not to worry).

When you have found the right Element Name, click on the Settings button. That will give you at least one and likely two IP addresses. Write them down. Don't change them.

Now close out of everything we just opened, all the way back to the main OPSMAN window

Use your mouse to click on Start, then click on Run. Inside the "Run" box enter the letters CMD and then press the "Enter" key. This should open a black command window, with your cursor blinking at the end of a short line of text that tells you what subdirectory or folder you're in. This is just info, not important. Anyway, type in the command PING followed by a space and then one of the IP addresses you wrote down above (for example, PING 10.192.218.10), followed by the "Enter" key. You should see some replies. Do you?

Does the reply say "Destination Host Unreachable"?
If it does, then you have a problem with your network or with connectivity between your network and the PBX.

If you see replies with the word "bytes" and the line ends with something like TTL=xx, then we're able to PING the PBX and we're OK.
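
For anyone who prefers to script this check, here is a minimal Python sketch of the same decision. It only classifies text that a PING command has already printed. The function name and the three labels are made up for illustration, and the "Request timed out" case is an extra that the steps above do not mention.

```python
def classify_ping_output(output: str) -> str:
    """Classify the text printed by a PING command (hypothetical helper)."""
    text = output.lower()
    # Check for the failure message first, because Windows prints it
    # on a line that also begins with "Reply from".
    if "destination host unreachable" in text:
        return "unreachable"
    # A healthy reply line contains "bytes" and ends with TTL=xx.
    if "bytes" in text and "ttl=" in text:
        return "reachable"
    # Extra case, not covered in the post: no answer at all.
    if "request timed out" in text:
        return "no_reply"
    return "unknown"


if __name__ == "__main__":
    sample = "Reply from 10.192.218.10: bytes=32 time=3ms TTL=64"
    print(classify_ping_output(sample))
```

A result of "reachable" corresponds to the "we're OK" case above. Anything else points back at the network path between the server and the PBX.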

Let's stop here for now.

 
Assuming the IP addressing and connectivity checks above were all OK and OPSMAN is actually able to talk to the SX2000, we probably ought to take a look at the Application Event logs on the OpsMan/Enterprise Manager Server.

Either use remote desktop or Tight VNC or else physically go to the server PC and log in to it with your userID and password (the server password, not Ent. Mgr or Ops password)

Then right-click on MY COMPUTER and then click on MANAGE
Next click the little (+) sign next to EVENT VIEWER
Single-left-click on Application. This will open a pane on the right side of the screen. Here we're looking for any and all ERRORS. For every error place your mouse cursor on the word "error" and double click on it to see what it's telling you.
 
Thanks Mitelinmyblood

I have tried all these and I am able to ping the Ops Manager, with replies in 3 ms. I have also checked the Event Viewer and the errors are:

1. "Attempt to update Host Service Principal Names (SPNs) of the computer object in Active Directory failed. The updated values were '<UNAVAILABLE>' and '<UNAVAILABLE>'. The following error occurred: There are no more endpoints available from the endpoint mapper."

2. "Attempt to update the DNS host name of the computer object in Active Directory failed. The updated value was 'opsmmgr.cardiff.gov.uk'. The following error occurred: There are no more endpoints available from the endpoint mapper."

3. "The printer could not be installed."


These are the only errors that I can see in the Event Viewer on the OpsMan server.

Is there anything else I could try?
 
Okay...
Back in the OpsMan Configuration, Directory Management, Telephone Directory Utility window try highlighting one of the entries where it shows 1 or more outstanding changes to propagate (by the way, how many are there?)

With this line still highlighted scroll down to the "Additional Information" arrow towards the bottom of this window and click here to open this window. Read at the bottom under reason for pending status to see what OPS is complaining about. If you have several pending changes, you may have several of these to read through. Are the reasons all the same? (maybe we should have looked here first)

Something else to check is to see if all the necessary "services" are running; here's how...
Click on Start, then click on Windows Explorer (or, if your keyboard has the handy little "Windows" shortcut key, just hold it down and press the "E" key to start Windows Explorer).

In Win Explorer, navigate to C:\Program Files\Mitel\Shared\Bin\srmwin.exe and double-click on SRMWIN

This will open a window listing all the running services for OPS and Enterprise Mgr. Don't change anything.
Services that should be started are:

OPS Alarm notification for pager
OPS alarm paging message queue server
OPS alarm server
OPS dba server
OPS initialization
OPS java virtual machine
OPS udt handler
OracleServiceMNMS (this may be stopped or not there)
OracleStartMNMS (this also may be stopped or not there)
Service Recovery Manager
Enterprise Manager Apache
Enterprise Manager MySQL
Enterprise Manager Server

I doubt you'll find your problem here, but it's good to check anyway
 
In the OpsMan Configuration, Directory Management, Telephone Directory Utility window there are 44 outstanding changes, all but one showing OS in the Status column. The only different one shows New.

I did try clearing these yesterday and it did let me clear a few until the same error message appeared. Could this be the cause?

The only services that are running are these:

OPS Alarm notification for pager
OPS alarm paging message queue server
OPS alarm server
OPS dba server
OPS java virtual machine
OPS udt handler
Task Scheduler

Does this help?
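
The two service lists posted so far can be compared mechanically. The Python sketch below is only an illustration: the service names are copied from the posts above, while the helper name and the treatment of the two Oracle services as optional (they "may be stopped or not there") are assumptions.

```python
# Services that should be started, as listed earlier in the thread.
EXPECTED = [
    "OPS Alarm notification for pager",
    "OPS alarm paging message queue server",
    "OPS alarm server",
    "OPS dba server",
    "OPS initialization",
    "OPS java virtual machine",
    "OPS udt handler",
    "OracleServiceMNMS",
    "OracleStartMNMS",
    "Service Recovery Manager",
    "Enterprise Manager Apache",
    "Enterprise Manager MySQL",
    "Enterprise Manager Server",
]

# Described as "may be stopped or not there", so not reported as missing.
OPTIONAL = {"OracleServiceMNMS", "OracleStartMNMS"}

# Services reported as running in the post above.
RUNNING = [
    "OPS Alarm notification for pager",
    "OPS alarm paging message queue server",
    "OPS alarm server",
    "OPS dba server",
    "OPS java virtual machine",
    "OPS udt handler",
    "Task Scheduler",
]


def missing_services(expected, running, optional=frozenset()):
    """Return expected services that are neither running nor optional."""
    running_names = {name.lower() for name in running}
    return [
        name
        for name in expected
        if name.lower() not in running_names and name not in optional
    ]


if __name__ == "__main__":
    for name in missing_services(EXPECTED, RUNNING, OPTIONAL):
        print("not running:", name)
```

Run against the lists as posted, this flags OPS initialization, Service Recovery Manager and the three Enterprise Manager services.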
 
Just curious, what is the system name of the SX2000? I have had similar problems before and the problem was with the name of the system in the SX2000 System Option form (I think that is where the name is entered). Then you need to put that exact name in the OPSman.

The name needs to have a Capitalized first letter with the rest being lower case. Also, the first character cannot be a number. I don't think you can have spaces in the name either.
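
A quick way to test a candidate name against the rules as remembered above is a small Python check. This is a sketch, not Mitel's actual validation: the pattern assumes digits are allowed after the first character (the post only rules them out in first position), and the example names are invented.

```python
import re

# Assumed rule: one capital letter, then only lower-case letters or digits.
# No spaces, and the first character cannot be a number.
NAME_RULE = re.compile(r"[A-Z][a-z0-9]*")


def looks_like_valid_system_name(name: str) -> bool:
    """Return True if the name fits the naming rule described in the post."""
    return NAME_RULE.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for candidate in ["Headoffice", "headoffice", "HEADOFFICE", "1office", "Head office"]:
        print(candidate, looks_like_valid_system_name(candidate))
```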

If this doesn't fix you up, then please return to your regularly scheduled help from MitelInMyBlood, he knows his stuff!




**************************************
My Biggest problem is that I almost always believe what I tell myself.
 
In re to the running services list, how can you be running opsman 7.4 and not be running Enterprise Manager? Ops 7.4 is fully dependent upon E-manager 3 and (to my knowledge anyway) won't run without it.

More questions:
Are you only attempting to manage one (1) PBX with this system or more than 1? If more than 1, are they clustered?

When the utility screen reports "OS" in the status column then there's something wrong with the phone number entry. Check to be sure there's no conflict here with the ARS table. A phone number managed by OPS as being INTERNAL to the selected element cannot be an "explicit" fully qualified entry in the ARS table (or in the System Speedcall table). For example, if you're trying to put extn 4567 in via OPS as an internal number within this element (or node), then 4567 cannot coincidentally be a fully qualified entry in ARS.

Go to the SX2000 maintenance terminal and enter the command LOC N xxxx, where xxxx is one of the extensions OpsMan is complaining about. See what it tells you about the number.

Also at the maintenance terminal, enter the command DBTAB 29. The table cardinality should be 1 number greater than the number of entries in OpsMan (shown in the OPS utility screen). If you've been deleting things in the OPS utility screen, I'm going to take a wild guess that these two numbers are not even close. We may have some work to do.
 

By the way, may I ask if you're the customer/end-user of this system or do you represent the telephone service company who installed it?

Have you received any certification training on OpsMan?
 

Yes, we are running Enterprise Manager. There are 23 PBXs and yes, they are clustered.

When entering LOC N xxxx I get the message "The number refers to ARS Leading Digits".

When I enter the command DBTAB 29 the message I receive is:
29 ($1D) sws_corp_dir[1..20000] Record size: 39
Address: 01c06B64 Bit map pointer: $009EE7D0 Table cardinality: 4670

Yes, you are right, the two numbers are not even close.

Yes, I am the end user/customer and have had only some training with Mitel.

I have logged this call with Mitel. They have been out and I have been in contact with them, but they do not know how to solve this fault.

Thanks for helping me with this fault.


 
OK. (Argggghhhhhh!)

Any number that you check via the LOC N xxxx method which comes back from the PBX stating that the number "refers to ARS leading digits" will need to be removed from ARS on every one of the nodes in the cluster where it appears in ARS. The problem here is that OpsMan is trying to manage the number using the Remote Directory table and Cluster Element routing digits of the clustered elements, while at the same time one or more of the PBXs in the cluster is trying to manage the same number in ARS. Sorry, but you can't do that in a cluster.
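
If the ARS entries for each node can be written down or exported somehow, the overlap with the numbers OpsMan manages can be found with a simple set comparison. The Python sketch below assumes plain lists of digit strings. The node names and numbers are invented, and nothing here reads from the PBX or from OpsMan.

```python
def ars_conflicts(ops_internal_numbers, ars_entries_by_node):
    """Map each node to the OPS-managed numbers that also appear in its ARS list."""
    managed = set(ops_internal_numbers)
    conflicts = {}
    for node, entries in ars_entries_by_node.items():
        overlap = sorted(managed & set(entries))
        if overlap:
            conflicts[node] = overlap
    return conflicts


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    ops_numbers = ["4567", "4568", "4600"]
    ars_by_node = {
        "NodeA": ["4567", "9"],
        "NodeB": ["9"],
        "NodeC": ["4600", "4567"],
    }
    for node, numbers in ars_conflicts(ops_numbers, ars_by_node).items():
        print(node, "has ARS entries to remove:", ", ".join(numbers))
```

Each number reported for a node is one that would have to come out of ARS on that node before OpsMan can manage it.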

I would suggest trying to fix 1 or 2 of these by removing the number(s) from the ARS table (commit & confirm), then exiting (closing) the CDE session on the PBX and trying to perform a Network Synch. Then look at your OpsMan directory utility again (refresh it) to see if the number of changes to propagate is now reduced by the one or two that you fixed in ARS. If that reduced the number of outstanding changes to propagate, then you have your work cut out for you to go through 1 by 1 and fix the others.

I'd suggest focusing on 1 PBX at a time until you get him cleaned up. Then, after hours, perform a full propagation just to that one element or node. With 4700-some directory entries in OpsMan, this full propagation to just one node is going to take a while. I think the average is about 100 directory entries per minute. Don't panic, give it time to finish.
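
As a rough check on that timing, here is the arithmetic as a small Python sketch. It assumes the quoted average of about 100 entries per minute holds and uses 4700 as a round figure for the directory size. The 23-node total is illustrative only, since the advice above is to do one node at a time.

```python
import math


def propagation_minutes(entries: int, entries_per_minute: int = 100) -> int:
    """Estimate minutes for a full propagation to one node (rounded up)."""
    return math.ceil(entries / entries_per_minute)


per_node = propagation_minutes(4700)  # 4700 / 100 = 47 minutes
all_nodes = 23 * per_node             # 23 PBXs, one after another

if __name__ == "__main__":
    print("one node:", per_node, "minutes")
    print("all 23 nodes:", all_nodes, "minutes, about", round(all_nodes / 60), "hours")
```

So one node should take roughly three quarters of an hour, and working through the whole cluster is a job for several after-hours sessions.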

Following a full propagation you're very apt to discover that you now have a bunch more errors to fix, because the full propagation will bring back all those errors that you previously deleted. (Warning: a full propagate erases the existing telephone directory in the PBX you are propagating to and replaces it with the directory info in OpsMan. Here we're assuming that OpsMan has the correct directory data.)

The bad news is you're going to have to eat this elephant one bite at a time. Whoever taught you the trick about deleting errored pending changes out of the Tel Dir Utility screen needs to have their butt kicked. You never delete them; you find 'em and fix 'em before you have accumulated a bunch of them.

Once things are cleaned up (or you think they're cleaned up) on 1 PBX, run the DBTAB 29 command again and compare the table cardinality number to the number of entries reported in OpsMan. The goal is to get DBTAB 29 to within 1 count. (Somewhere I read this was "5" but on the switches I manage it's "1". On my switches DBTAB 29 reads 1 number greater than the directory count in OPS.)
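
The comparison can be sketched in Python by pulling the cardinality out of the DBTAB 29 output pasted earlier in this thread. The OpsMan entry count has to be read off the OPS utility screen by hand. The function names and the counts passed in below are made up, and the offset of 1 is simply the figure reported above for one set of switches.

```python
import re

# DBTAB 29 output as posted earlier in the thread.
SAMPLE_OUTPUT = (
    "29 ($1D) sws_corp_dir[1..20000] Record size: 39\n"
    "Address: 01c06B64 Bit map pointer: $009EE7D0 Table cardinality: 4670\n"
)


def table_cardinality(dbtab_output: str) -> int:
    """Extract the 'Table cardinality' value from DBTAB output text."""
    match = re.search(r"Table cardinality:\s*(\d+)", dbtab_output)
    if match is None:
        raise ValueError("no 'Table cardinality' field found")
    return int(match.group(1))


def directory_in_sync(dbtab_output: str, opsman_count: int, expected_offset: int = 1) -> bool:
    """True if the PBX cardinality exceeds the OpsMan count by the expected offset."""
    return table_cardinality(dbtab_output) - opsman_count == expected_offset


if __name__ == "__main__":
    print(table_cardinality(SAMPLE_OUTPUT))
    # 4700 is a stand-in for the count shown in OpsMan, not a real reading.
    print(directory_in_sync(SAMPLE_OUTPUT, opsman_count=4700))
```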

Something else: if someone has been inputting TelDir entries directly into the clustered PBXs via a CDE session on the PBX, that person also needs to feel the boot a few times. Once you begin managing your corporate directory via OPSMan, you at that point CEASE doing so from a direct CDE session on the PBX.

I think from what we've found out (and given that your support co has no idea about the error) then it may be time to bring in the hired guns from Mitel Professional Services and let them solve this.

Secondly, everyone in your group who shares responsibility for doing the Adds, Moves and Changes should be required to attend formal OpsMan System Administrator training.

 
UPDATE:

Two of my techs report seeing the exact same error message. It seems to have started in Emgr 3/Ops 7.4.

Their "fix" is to reboot the server when it happens. Presently we're doing about 200 MACs per month and we're seeing this happen about once a week.

Obviously a bug of some sort.

 
I have had Mitel come in to try to solve this fault. It has been solved now, but I still see errors in the OpsMan Configuration, Directory Management, Telephone Directory Utility: where there were 44 before, there are now 88, so the errors seem to have doubled. I shall notify the team and present your findings that there must be a bug.

Thanks for your help, it is much appreciated.
 
Don't confuse your issues.

Your telephone directory utility is complaining about errors in telephone directory programming. These are almost certainly (99.99%) human errors in programming the PBX(s), more likely PBX programming errors than OpsMan programming errors, and they will need to be corrected one by one. The person(s) doing the MACs need to understand what was done incorrectly and/or what has been mistakenly hard-coded into some of the system forms (i.e., ARS) of these PBXs that is causing these conflicts with OpsMan. My guess is that there are far more than just the 44 (or 88) conflicting entries that are currently showing up. Those displaying right now are just the ones OPSMAN currently knows about (that have not been manually deleted). Deleting these is not how you fix them and only digs the hole deeper.

The other problem, the screen popup message about the System Error, is a whole other matter and not at all related.

 
Something else about the "System Error"

(See message above first)

Before doing a wholesale reboot of the server, first try closing down ALL currently open OPSMAN client sessions, then reopen 1 and see if the system error is still occurring. If it is, then try a server reboot.
 