
IBM DIRECTOR TWGSERVER Error

Status
Not open for further replies.

curran

Technical User
Dec 4, 2003
16
0
0
Ever since an inventory of my servers failed to complete successfully, I receive the following error in the server's Application log when I try to start Director Server. The symptoms are as follows: the Director Server service starts and stays in a started state for about 10 minutes. Watching the Director processes in Task Manager, I see TWGSRVW.exe hold the CPU at a consistent 25% and chew up memory until Windows kills the process, at which point the following error appears in the Application log.
Event Type: Error
Event Source: TWGServer
Event Category: Java Exception
Event ID: 100
Date: 01/10/2006
Time: 2:25:12 PM
User: N/A
Computer: XXXXXXXXXXX
Description:
Unhandled JVM Exception
Application: 'BladeServerMgtTask.0 (TWGVersion 4.20 2004-07-13 build id 511)'
Thread Name: MPA-ThreadBank <5 of 12>
Thread Group: BladeServerMgtTask.0
Exception Type: java.lang.OutOfMemoryError
Stack Trace: java.lang.OutOfMemoryError
Logical Debug Stack:
Calling element.process(QueueElement.QTYPE_BATCH)
com.ibm.sysmgt.app.mpa.server.SPQueryCmd@5003f3a8
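
If you end up with a pile of these records and want to see which task and exception keep showing up, a small script can split the "Key: value" lines into fields. This is only a sketch that assumes the record has been pasted as plain text in the layout shown above; `parse_event_record` and `SAMPLE` are names made up for illustration and are not part of IBM Director or Windows.

```python
from typing import Dict


def parse_event_record(text: str) -> Dict[str, str]:
    """Split the 'Key: value' lines of a pasted event record into a dict.

    Lines without a colon (for example the stack trace body) are skipped.
    Only the first colon splits, so 'Time: 2:25:12 PM' keeps its full value.
    """
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Sample text copied from the event record in the post above.
SAMPLE = """Event Source: TWGServer
Event ID: 100
Application: 'BladeServerMgtTask.0 (TWGVersion 4.20 2004-07-13 build id 511)'
Thread Group: BladeServerMgtTask.0
Exception Type: java.lang.OutOfMemoryError"""

record = parse_event_record(SAMPLE)
print(record["Thread Group"], record["Exception Type"])
```

Grouping many parsed records by "Thread Group" would show quickly whether every failure really comes from the same task.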

The error is always related to the blade server management task (BladeServerMgtTask), even though we do not have any blade management servers in our environment. I have traced with Ethereal but see nothing other than the managed systems trying to contact the Director server on port 14427, the Director port. The only way I can keep the Director Server service started is to disable the network cards. Once the cards are enabled, the TWGSRVW.exe process begins to climb again and eventually stops.
I'm also having an issue connecting to my DB2 database: I keep getting an SQL109N error, "id does not have authority to complete requested command". This occurs when I try to reinstall Director Server. The DB2 folks assure me that the ID has rights, and I can connect to the database through the DB2 command prompt. The database is local to the Director server as well.

Director Server ver 4.20
xSeries 255
2x 2.0 GHz Xeon processors
1.5 GB RAM

Managed servers: 1100
xSeries 225 servers running 4.20 agents
1x 3.0 GHz Xeon processor
1 GB of memory
RSA II cards

Both the management and managed servers are running Windows 2003 Standard Edition Server, no service pack.

The Director management server is running DB2 Enterprise Edition 8.1.8.

Just as a side note, this is my secondary Director server; the primary has the exact same config and is working wonderfully.
I have tried a twgsave and twgrestore from the primary to the one with problems, but the same issues occur.

Any help would be greatly appreciated!!!!
 
Did you manage to resolve this problem? I'm having the same thing here and it's driving me crazy!!
 
I think I have gotten to the bottom of it. Do you have any event action plans that alert on Security or any other events from the Windows event logs? If so, get rid of them. In my case I was running a couple of different event action plans, for Security log size and for some specific events from the Windows logs that I wanted to be notified about. The silly thing is that instead of the agent parsing events and sending the server only the ones you want, it sends the whole log. If your server is down for any reason, the agents store this information until they see that the server is back, and then they try to send everything they have backlogged. In any case, I had 1200 servers doing this and my server could never keep up.
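
To get a feel for why the server never catches up, here is a back-of-the-envelope model of the backlog. It is a rough sketch only: the per-agent event rate, the outage length, and the server throughput are made-up numbers for illustration (only the 1200 agents comes from my environment), and the function names are hypothetical.

```python
from typing import Optional


def backlog_events(agents: int, events_per_agent_per_hour: float,
                   hours_down: float) -> float:
    """Events queued on the agents while the server is unreachable."""
    return agents * events_per_agent_per_hour * hours_down


def hours_to_drain(backlog: float, server_events_per_hour: float,
                   agents: int, events_per_agent_per_hour: float) -> Optional[float]:
    """Hours needed to clear the backlog, or None if the server never catches up.

    The agents keep producing new events while the backlog drains, so only
    the capacity left over after the live traffic goes toward the backlog.
    """
    incoming = agents * events_per_agent_per_hour
    spare = server_events_per_hour - incoming
    if spare <= 0:
        return None
    return backlog / spare


# Hypothetical figures: 1200 agents, 50 events per agent per hour, 8 hours down.
queued = backlog_events(1200, 50, 8)
print(queued)                                     # events waiting on the agents
print(hours_to_drain(queued, 100_000, 1200, 50))  # server with spare capacity
print(hours_to_drain(queued, 50_000, 1200, 50))   # server slower than live traffic
```

When the server cannot even keep up with the live traffic, the second function returns None, which matches what I saw: the backlog only grows.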
I deleted all of the event action plans but still had the problem of servers sending backlogged events. The only way I was ever able to get my server back up was to change its IP address and manually re-add all managed nodes. If you don't have a large environment, you could also do the following to try to clear the archived events on the managed nodes.

From a command prompt, type:
net stop twgipc (stops Director communications)
twgreset (clears management server references from the managed node. Your server may be greyed out for a bit in the management console, but it should be discovered again via presence check after doing net start twgipc. Test this first.)
net start twgipc (resumes Director communications)

Test this in a lab or on a test box first.
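
If you want to script the three steps above instead of typing them on every node, something like the following could serve as a starting point. It is a minimal sketch, assuming `twgreset` is on the PATH of the managed node and takes no arguments, exactly as typed above; `RESET_STEPS` and `run_reset` are hypothetical names. It defaults to a dry run that only lists the commands.

```python
import subprocess
from typing import List

# The same three commands as in the manual steps above, in order.
RESET_STEPS: List[List[str]] = [
    ["net", "stop", "twgipc"],   # stop Director communications
    ["twgreset"],                # clear management server references (assumed on PATH)
    ["net", "start", "twgipc"],  # resume Director communications
]


def run_reset(dry_run: bool = True) -> List[str]:
    """Return the command lines in order; execute them only when dry_run is False.

    check=True aborts on the first failing command, so a failed twgreset
    leaves twgipc stopped and the service must be started again by hand.
    """
    executed: List[str] = []
    for cmd in RESET_STEPS:
        executed.append(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return executed


if __name__ == "__main__":
    # Dry run: print what would be executed without touching the services.
    for line in run_reset(dry_run=True):
        print(line)
```

Calling `run_reset(dry_run=False)` on a managed node would actually run the commands, so try it on a test box before putting it into any software package.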

Hope this helps. In my case I had to manually re-add servers because I have 1200 across Canada and strict change management. So I need to build a software package to perform the steps I described above, then have it distributed to all 1200 servers and executed to clear all the archived events.


 