Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

DB Connection hangs up Mercator

Status
Not open for further replies.

mladd

Programmer
Apr 12, 2006
46
Ok, this is a problem that we have been having for some time now and have not been able to find the reason. Partly becauase the DBA is of little help since it's not a "database problem" and there are no errors reported.

Once in a while (almost daily now) we seem to get a DB connection that does not release or is closed by the database, and Mercator doesn't know it was or should be closed. The connection just sits there. The map just sits there trying to process stuck on the same card that is running a query or performing an update. Some of these maps typically take seconds to run, but 30 or 40 minutes later we realize it is still trying to process. The DBA still sees the connection to the DB, and he kills it. Sometimes this works. Other times...most of the time...the server needs to be restarted.

The map that this happens in is not always the same, nor does it happen at the same time, or even the same DB Name.

I have been playing with some of the INI settings. Could someone give me some good tips on the DB settings in the INI for Sybase? Specifically The idle/keep/slim/hlim etc...

I am running 6.7.1 on a WinNT server with 2 gigs of Ram and dual 2.8 ghz Xeons. The issues are when they are connecting to Sybase 12. The correct client version is also installed on the server that Mercator runs on.

We have about 250+ watches in 20 msl's running on this event server.

Primarily I just want to stop getting called at 2 in the morning because a client didn't get a file from us. :)

Thanks,
Mike
 
Hi Mike,

This happens to a lot of people - don't know why settings aren't defaulted into the dstx.ini / mercator.ini!

here are our settings (for Oracle, but should work for you too):

IdleDB=120
SLimDB=20
;HLimDB=0
;KeepDB=0
MinDB=10

Cheers,
Olly.
 
We were recommended the following settings by Support and have never had any problems.

IdleDB=10
;SLimDB=0
;HLimDB=0
KeepDB=5
;MinDB=0
 
This is what we ended up with in 6.0 and have carried it over thru 6.5.2, 6.7.1 and 8.N betas and testing 8.1

IdleDB=120
SLimDB=45
HLimDB=60
KeepDB=0
;MinDB=0

Like Ollyc we are Oracle but should work... seems like I worked with our DBA and Mercator(at the time). You really need to play with it because values are based on your environ.... we run anywhere from 300 - 600 real-time query/responses thru with an additional 30-40 in & outbounds transactions thru the event server per day. So our volume sounds like it could be lighter.

Ollyc and janhes (and Boca if he's still around) would probably know more about this. I will defer to any corrections they might state. My experience, however, indicates that bad connections aren't dropped/closed until they idle out. Therefore there’s a possibility that your system could try reusing a bad idle connection. So, you need to find a balance between the efficiency of keeping connections open and closing them frequently enough to kill the bad ones. Depending on how your transactions are setup you might also experiment with 'retry'.


 
Thanks guys.

As far as volume, we are running about 200,000 MQ maps per day. On top of that we have a few hundred reports/files that we are creating and recieving from clients. about 95% of all of our maps connect to the databse in some way shape or form.

I have tried the retry options, but that seemed to create a little more overhead. It's possible that I had too much time in between retries, too many retries, and possibly in the wrong places (Map instead of card) for some of them. I'll keep playing with the settings.

Thanks.
Mike
 
Try reducing IdleDB.

Is there a volume trend? ie... at peak times how may MQs is it possible to run? In addition to back connections you may have two few connections avail.... what is your HLim set to?
 
I'm sure we had an issue like this sometime ago, try setting KeepDB=-1 and a low IdleDB=1 to see if the problem still happens.
 
The manual says

Keep
Specify how many seconds an idle connection is expected to remain valid. Default is 0, which disables connection testing. A value of -1 will force a test every time, but this is not recommended.

This may have serious overheads if a new connection is forced for every database connection.

I would suggest a low value for Keep and idle. Also get your DBA to check what is causing the connection breaks. We have had situations where we have had too many connections open so attempted new ones fail or various other transient database problems. Most were solved at the database end rather than DSTX.
I have found with Oracle that the event server won't close cleanly is a connection is open (I don't have this problem now with our ini file settings and have around 300,000 connections per day with 90% reused). The only solution is to abandon the task running ntsvns.exe close the event server at the windows level and restart it.
 
Hi Janhes

I was just suggesting something to try to see if he still gets the error, sorry to step on your toes.
 
Jinkys,
I tend to agree with Janhes, but I do see your point for "testing". I may try that to see if forcing the connections to test every time changes what is going on. I may try that for a day and see what happens.

And it appears to happen early in the morning, with lower volumes of orders. At night there are hundreds of orders an hour, but during the day, there are thousands of orders an hour, or more, flowing through the Event Server. We are avg about 45,000 per day for this one client. But during the day all other maps for other clients see the same increase (not the actual numbers of orders, but the percentage increase).

I've tried working with the DBA to find out what else is happening at that time (1am - 3am, typically) and all I get back is "No errors found".

It's a little frustrating when it seems that if no one else sees an error then it is strictly a Datastage issue and we have to find the source of the problems and tell them what needs to be fixed.

Thanks Guys for all your help on this!

-Mike
 
Again, it sounds like your keeping bad connections open or you don't have enough avail.

whats your HlimDB set to?

can you determine (perhaps with the DBA) what the demand has been in relation to the avail connections?

Part of the failure could be that you have 2000 connection requests but only allow 1000. Is it possible that some of the pending connections would time out?

 
An interesting time of night to get the problem. It could be other activity on the server such as database backups. We have had problems with Oracle backups causing DSTX connections to hang. This was overcome by changing the way the backups were taken - i'm not sure of the detail though.
 
I don't have HLim set, or Min set (whatever the default, if anything).

Demand based on available connections are light, as I explained earlier, the volume is very low at this time. My Max Connections is set to 20, so I only have a max of 20 connections opened to the DB.

Pending Init and Pending Init Max are not set either. I have not been able to figure out exactly how that would affect the volumes that we have. If the server doesn't run for 2 hours in the morning, the Max pending will be in the thousands, but any backlog is picked up and processed within 30 minutes or so, so that 60 minute Pending limit (the default) is never close to being hit.

Thanks again!
 
So, when you say 'Max Connections' you're refering to concurrent threads or maps, right?

What's the management console tell you about 'DB Connection' history?
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top