Soft Errors - Ignore them?

clk430 · Nov 30, 2005

So, in our production environment on 6.7, we get many soft errors logged on the Management Console. By soft, I mean error code 12 (source not available); however, business processes just fine and, when investigated, those sources really are found. Furthermore, those errors are even more abundant in the History/Function Failures. However, some are -550 errors.

Can all these errors eventually build up and crash Mercator or cause GPFs with 6.7?

janhes · Dec 1, 2005

BocaBurger · Dec 1, 2005

If you are getting source not available and the data seems to be processed, it is likely you are getting a double trigger on some files. You may need to increase the triggertime in the mercator.ini to 2 or 3 and to add the same delay in the maps with this problem. Every error or warning takes about 52 bytes of RAM, and will eventually cause an out of memory error. It is better to fix the cause of the errors. BTW, is it 6.7.0 or 6.7.1?

BocaBurger
<===========================||////////////////|0
The pen is mightier than the sword, but the sword hurts more!

clk430 · Dec 1, 2005

We are using 6.7. We just increased our RAM to 2GB, but we still often crash when we get one or two GPFs.

The trigger time in the ini file was set to 3 and the map delay was set to 3000ms as well, but to no avail. Network issue, you think?

janhes · Dec 1, 2005

It's not the RAM on the server that's the issue but the memory available to the event server. If you remove the source of the errors you should remove the GPFs.

eyetry · Dec 1, 2005

Be interesting to see what's actually happening with the resources on the server.

Can you have someone turn on logs that monitor performance on the server? It should also track what processes are running on the box with their CPU and Memory use. Might also have the ability to monitor disk free space.

Depending on the results you may find that you have plenty of resources. If that's the case you either have map conflicts, tool configuration issues or....

We have a service running against all of our servers that monitors this type of thing. It turned out that the messages in the DSTX log files and GPF files were misleading. We didn't have resource issues, we had address issues. It's been a while, but I think we ran the Windows SDK rebind against the Mercator6.7 (6.71) path and our number of GPFs went down considerably.

Via the monitor, we also found that occasionally, on weekends, we were having a problem with scheduled server maintenance. One of the Windows services that was performing disk maintenance wasn't releasing resources when it was done.

BocaBurger · Dec 1, 2005

If you are on 6.7.0 (268), time to think about upgrading.

clk430 · Dec 1, 2005

Eyetry,

Our 1 CPU (2.8 GHz) with 2GB of RAM is MAXED out at an average of 95% all the time...so putting any performance monitoring on isn't an option, as we are maxed out. We process about 10,000 files a day; the majority are simple FTP gets/puts and archiving, no translation at all.

We've modified the heck out of the .ini file (maxthreads, watchmaxthreads, triggertime etc.) and some transactions still 'hang' without errors, forcing restarts of the prod server multiple times a day. Every two days, we get GPF errors and are forced to restart.

Could you please explain your address issues that you faced and fixed in more detail?
We're upgrading servers with more power and procs, but like BocaBurger said, we might just have to upgrade.

clk430 · Dec 1, 2005

OK, get this: I changed the TriggerTime to 30 seconds in the .ini file and guess what? 50% of the "source not available" soft errors went away and the CPU usage went down.

??????

What are some negative ramifications of setting the global TriggerTime so high? Do maps stay open for 30 seconds or do they stay in a pending state? Which uses more resources?

eyetry · Dec 1, 2005

Sounds like your issue is resolved, but I thought I'd respond to your question as best I can anyway. We're pretty small so we aren't doing as much as others do with TX..... Keep in mind that we limit TX to mapping data to/from corporate-friendly data formats. Simply middleware: passing information to stored procedures and the mainframe, and populating DB tables.....

We started having problems executing certain APIs. Things ran fine on our new server for 6 months. Then we started having problems with maps that executed FTPs, UnZips etc.... but only when the files exceeded a certain size, something over 300mb. The error messages indicated lack of resources. We started tracking them.... No unexpected spikes in CPU, memory or disk usage.

Also, we'd always handled files that size and larger on the old server which ran 6.0 thru 6.7.1. The old box was 1.2 ghz with 2gb memory and 200GB dedicated disk. The new box is 3.nGHZ with 4GB mem and 40gb of dedicated disk (moved archiving to our NAS).

Now, we don't process 10000 transactions/files per day like you do. But we do use one 1-CPU Windows box to process all inbound enrollment, electronic claims, 27X real + batch queries/responses; outbound 835, enrollment, payment files; Commerce Manager, etc..... DSTX is leveraging DB, http, ftp, zips, emails and so on....

Anyway, we probably only average 1.5gb of TX input per day. Inbound eligibility files average 15-20mb (they range from 300kb to 70mb with the occasional 200+mb file). 300-500 http transactions daily.....

Our box rarely exceeds 50% CPU/memory for more than a few minutes and was very low, below 15%, when our 'out of resource errors' occurred.

We ran the SDK's rebind option against the TX dlls in the TX root directory and the errors stopped. We can now handle files in excess of 1gb without issue. Haven't tested beyond that, as realistically 300mb is about as large as our input files get, and at that it's maybe once every 6 months.

Someone smarter than me can explain how it works. I think it assigns specific memory addresses to dlls in the TX directory. Our Network Services area said there's a potential downside to having run the rebind, and the MS articles I read on rebinding left me thinking we might be wasting our time but.... so far so good.

blah blah blah....

janhes · Dec 2, 2005

If changing trigger time has had such an effect you may find you have not been processing your ftp files correctly.
The problem with ftp files is that if you trigger on the file name sent, the map will trigger as soon as the transfer of the file starts, because the file name is created immediately. Upping the trigger time means that the map will wait 30 secs before it starts processing the file, which still may not be complete.
The best solution is for the sender to give the file a .tmp extension and then rename it at the end of the ftp. That way you know the file is complete when you start processing.
Alternatively they could send an empty file after the data file and you could trigger on that.
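
For anyone on the sending side, here is a minimal Python sketch of the temp-name-then-rename idea described above. It is not Mercator code; the function name `publish_atomically` and the file names are made up for illustration, and it assumes the watch on the receiving side does not match the `.tmp` name. Over FTP the same idea would be an upload under the temporary name followed by an FTP rename.

```python
import os
import tempfile


def publish_atomically(data: bytes, final_path: str) -> None:
    """Write data under a temporary name, then rename it to final_path.

    A watcher that triggers on final_path only ever sees a complete file,
    because the final name does not exist until the write has finished.
    """
    tmp_path = final_path + ".tmp"  # hypothetical naming convention
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    # os.replace renames in one step when both names are on the same filesystem.
    os.replace(tmp_path, final_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "claims.dat")  # illustrative file name
        publish_atomically(b"example payload", target)
        print(os.listdir(d))
```

The point of the design is that the rename is the "I'm done" signal, so no trigger delay has to be guessed.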

clk430 · Dec 2, 2005

Eyetry, thanks so much. I'll get our Windows guy to look into this and see what he can come up with. This instability has to be caused by our OS and our prehistoric server.

Janhes, I see exactly what you are saying, and our application areas are doing this, however, we are running into the problem in multimap systems.

Map1 - Merc picks up file - renames and archives
Map2 - Picks up newly named file - sends to batch script for encryption
Map3 - Picks up encrypted file and FTPs to TP's server.

Map3 gets the error 'source not available'. Shouldn't Map2 write the file and release it once it has a date/time stamp, or does Map3 pick up on both the beginning and the end of the write?

BocaBurger · Dec 2, 2005

janhes is 100% correct. He gets a star.

clk430 · Dec 2, 2005

And I will give a star to anyone who can answer why there is a correlation between trigger time and lower CPU usage!

Seriously though, does anybody know?

BocaBurger · Dec 2, 2005

If you are triggering a map twice, so that two maps are running, you double the CPU usage per map (if the map is multi-threaded). If not, you have an init pending, which takes just a bit of CPU. If you get a triple trigger, well, you can see what could happen.

janhes · Dec 5, 2005

Does the batch process that does the encryption create the output file immediately and before the encryption is complete? If so you may have the same problem as with ftp.
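
If the batch script can't be changed to write under a temporary name, one generic workaround is to have whatever hands the file on wait until its size stops changing. The sketch below is plain Python, not a Mercator feature; `wait_until_stable` and its parameters are hypothetical names, and the size check is only a heuristic (a stalled writer looks the same as a finished one), so the rename approach is still the safer fix.

```python
import os
import time


def wait_until_stable(path, interval=1.0, required_checks=3, timeout=60.0):
    """Poll the size of path until it stops changing.

    Returns True once the size has matched the previous poll on
    required_checks consecutive polls, or False if timeout seconds
    pass first (including the case where the file never appears).
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file not there (yet)
        if size >= 0 and size == last_size:
            stable += 1
            if stable >= required_checks:
                return True
        else:
            stable = 0  # size changed or file missing: start counting again
        last_size = size
        time.sleep(interval)
    return False
```

A caller would only pass the encrypted file to the FTP step after this returns True, and would treat False as "try again later" rather than as a hard failure.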

mlapse · Dec 7, 2005

If you are using Mercator to do a put of a file you shouldn't have a problem, because Mercator automatically creates the temporary files and then renames them after the put.

BocaBurger · Dec 8, 2005

UNLESS you are using PUT. Card targets are created this way.
