Tek-Tips is the largest IT community on the Internet today!

Multithreaded application help

Status
Not open for further replies.

ns1234

Programmer
Dec 3, 2002
6
Hi All

I need suggestions for improving the design of my multithreaded application. The current (as-is) scenario is as follows:

We have a Java console application that processes data from N channels. The data is downloaded from any number of channels (e.g. a folder on the local system or LAN, a COM port, email, FTP) and consists of text files of 2-3 KB or image files (up to 2 MB). The channels can come in any combination (2 hot folders, 3 COM ports, 3 FTP servers, etc.). The data for each channel is processed according to filters and rules applied to its contents and stored in the specified destination folders. Each channel can have N files. I'm using the CoR (Chain of Responsibility) pattern for this.

Currently, the application we have for this does the following:

1. A client GUI application is used to configure the channels: basically the info about the channel (FTP server, user-id, passwd, …), the filters on the channel (filter1, filter2) and the rule to apply, like ToUpper, etc. It creates an XML file for each active channel. Every channel is given a system-generated name (ch1, ch2, etc.) when it is configured by the client. This app is used once to configure the server.

The Server console consists of:

1. One ChannelManager thread runs and checks the number of channels present (basically the XML files: ch1.xml, ch2.xml, etc.).

2. The ChannelManager spawns one ChannelProc thread per channel in a loop. Each one downloads data to its own channel folder (ch1, ch2, etc.) in a RAW folder on a local drive, e.g. Raw/ch1, Raw/ch2.

3. Another thread called FilterManager polls the RAW folder to find the channel folders to process and spawns that many FilterProcessor threads, which parse the data based on the filter and move it to individual folders (ch1, ch2, etc.) in a FILTER folder on the local drive, i.e. Filter/ch1, Filter/ch2.

4. Another thread called Filter2Manager polls the FILTER folder to find the channel folders to process and spawns that many Filter2Processor threads, which parse the data based on filter2 and move it to individual folders (ch1, ch2, etc.) in a FILTER2 folder, i.e. Filter2/ch1, Filter2/ch2.

5. Another thread called RuleManager polls the FILTER2 folder to find the channel folders to process and spawns a RuleProcessor thread for each channel, which converts the data to the specific output (e.g. all uppercase, lowercase, etc.) and moves it to individual folders (ch1, ch2, etc.) in a RULE folder, i.e. Rule1/ch1, Rule2/ch1.

So basically the idea is that there are 4 polling manager threads, each of which spawns as many processor threads as it needs to process the data. Since downloads can be scheduled by the user, the ChannelProc thread for a particular channel polls every 1 min (the minimum interval) and checks against the configured interval to decide whether to download data. The other processor threads (FilterProcessor, Filter2Processor, ...) die once processing of a particular file is complete and the data has been moved to the next folder. One channel can have N files at any given point in time. All rules are read and applied using XML files. For now the maximum number of processor threads has been fixed at 3 each for the FilterProcessor, Filter2Processor and RuleProcessor threads, because with so many threads running, memory and CPU usage goes up.

THE PROBLEM:
The problem is that data comes in continuously from the COM ports, so those threads are never free. The more COM ports there are, the more data and the more threads, so much so that other threads don't get a chance to execute, since the maximum number of FilterProcessor, Filter2Processor and RuleProcessor threads is fixed. So if there are 2 COM channels receiving data continuously plus a LAN folder or FTP server to download from, the COM threads are always occupied and the third channel is never processed. How do I use the sleep time between polls by the ChannelProc threads to process the other, unprocessed channels?

I need to redesign this system so that there is no data loss and yet all channels get processed. Any suggestions, tips, schemes or patterns are welcome.
I was thinking of using TimerTask threads for filter processing, but if I use one Timer to process all channels then time slicing will go up as the number of channels increases. Also, if one filter processing task fails, the others will stop too. If I spawn a new Timer for each channel, then again the number of threads goes up.
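One way to picture the trade-off described above: a single java.util.Timer runs every task on one thread, and an uncaught exception in one TimerTask ends that thread. A rough sketch of an alternative, assuming a runtime that has java.util.concurrent (Java 5 or later; the lambdas need Java 8), is a small shared scheduled pool with one repeating task per channel. The class name, the guarded and demo helpers, the 20 ms delay and the simulated failure on ch1 are all made up for illustration and are not part of the application described here.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class ChannelScheduler {
    // Wraps a channel's work so a failure is logged and contained; an exception
    // escaping a periodic task would suppress that task's later runs.
    static Runnable guarded(String channel, Runnable work, Map<String, Integer> completed) {
        return () -> {
            try {
                work.run();
                completed.merge(channel, 1, Integer::sum); // count successful passes
            } catch (RuntimeException e) {
                System.err.println(channel + " failed: " + e.getMessage());
            }
        };
    }

    // Runs three hypothetical channels on two shared threads for runMillis,
    // then returns how many passes each channel completed.
    static Map<String, Integer> demo(long runMillis) {
        Map<String, Integer> completed = new ConcurrentHashMap<>();
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        for (String channel : Arrays.asList("ch1", "ch2", "ch3")) {
            Runnable work = () -> {
                if (channel.equals("ch1")) {
                    throw new IllegalStateException("simulated filter failure");
                }
                // real filter processing for the channel would go here
            };
            pool.scheduleWithFixedDelay(
                    guarded(channel, work, completed), 0, 20, TimeUnit.MILLISECONDS);
        }
        try {
            Thread.sleep(runMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(demo(200));
    }
}
```

In this sketch the thread count stays at two no matter how many channels are scheduled, and ch2 and ch3 keep running while ch1 keeps failing.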
Thanks a lot
 
First, are you using BufferedReaders? If so, the non-COM threads should be able to process without you losing any data from the COM channels.

Second, if you have a tight loop (while data { }) within your COM threads, make sure to put a Thread.yield() or Thread.sleep(x) statement somewhere in there. That'll ensure that the other threads get a chance to execute.
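As a rough illustration of that second point, here is a minimal sketch of a read loop that pauses after every chunk. The class and method names, the chunk size and the pause length are all hypothetical, and a ByteArrayInputStream stands in for the COM port stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class PoliteCopier {
    // Copies in to out in chunks, sleeping briefly after each chunk so that
    // other threads get a turn. Returns the number of bytes copied.
    static long copyWithPauses(InputStream in, OutputStream out, int chunkSize, long pauseMillis) {
        byte[] buf = new byte[chunkSize];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
                Thread.sleep(pauseMillis); // hand the CPU to other threads
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop copying if interrupted
        }
        return total;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copyWithPauses(
                new ByteArrayInputStream("sample COM data".getBytes()), out, 4, 5);
        System.out.println(copied + " bytes copied");
    }
}
```

A read that blocks while waiting for data already gives up the CPU, so the pause mostly matters when the loop polls without blocking.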

Third, set the priority of the COM threads to one or two notches higher than the others. That way, the scheduler will wake them up more often, which would seem to be what you're looking for.
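For what it's worth, setting the priority could look something like the sketch below. The factory method and its isComPort flag are made up for illustration. Keep in mind that thread priorities are only a hint to the scheduler and how they map to OS priorities depends on the platform.

```java
class ChannelThreads {
    // Creates (but does not start) a channel thread; COM port threads get
    // one notch above the normal priority, all others stay at normal.
    static Thread newChannelThread(String name, boolean isComPort, Runnable body) {
        Thread t = new Thread(body, name);
        int priority = isComPort ? Thread.NORM_PRIORITY + 1 : Thread.NORM_PRIORITY;
        t.setPriority(priority);
        return t;
    }

    public static void main(String[] args) {
        Thread com = newChannelThread("ch1-com", true, () -> { });
        Thread ftp = newChannelThread("ch2-ftp", false, () -> { });
        System.out.println(com.getName() + " priority " + com.getPriority());
        System.out.println(ftp.getName() + " priority " + ftp.getPriority());
    }
}
```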

HTH
 
Thanks for the response pullingteeth! I'm using BufferedReaders and the new Java I/O APIs (FileChannel) for moving data from one location to another. I'm also using Thread.sleep(x) so that the other threads get a chance to run, but performance still degrades over a period of time. Right now all the channel threads are set at equal priority and the COM threads still hog most of the CPU cycles. If I raise their priority, the others will have even less chance of running. This application has to be designed for continuous 24x7 running, but under the current scenario I get an OutOfMemoryError after 2 days of continuous running. Could you suggest how to control that?

Thanks
 
I'm just getting back into Java so my guess would probably be as good as yours, given that you seem to have covered the essentials. However, to catch the memory leak, you might want to attach a profiler to the application; if you understand where the memory leak is coming from, you may find the thread hogging issue.
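Before attaching a full profiler, a cheap first check might be to log heap usage and thread count at intervals and see which one keeps climbing over the two days. The sketch below only uses Runtime and Thread.activeCount() from the standard library; the class name, the log format and the one-second interval are assumptions for illustration.

```java
class HeapWatch {
    // Heap currently in use, as reported by the JVM.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // One log line with used heap and an estimate of live threads
    // in the current thread group and its subgroups.
    static String snapshot() {
        return "usedHeapKB=" + (usedHeapBytes() / 1024)
                + " threads=" + Thread.activeCount();
    }

    public static void main(String[] args) throws InterruptedException {
        // In the real server this would run on its own low-frequency thread.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

A thread count that grows steadily would point at processor threads that never finish, while a growing heap with a stable thread count would point at objects being retained somewhere.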

Also, you might want to consider a different architecture; what about an RMI-based application, where the ChannelManager is its own application, and each thread is also its own application? It could just be that your program is too resource-intensive for the server it's on; by splitting it using RMI (or similar) you can spread the load over N machines.
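To give a rough idea of what the RMI split might look like, here is a sketch of a remote interface for one processing stage. RuleWorker, UpperCaseRuleWorker and applyRule are hypothetical names, and the uppercase rule stands in for the ToUpper rule mentioned earlier in the thread.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Locale;

// Contract a manager process would call; each worker could live in its own JVM.
interface RuleWorker extends Remote {
    String applyRule(String channel, String text) throws RemoteException;
}

// One possible worker: converts the text of a channel's file to uppercase.
class UpperCaseRuleWorker implements RuleWorker {
    @Override
    public String applyRule(String channel, String text) throws RemoteException {
        return text.toUpperCase(Locale.ROOT);
    }
}
// To make this reachable from another machine, the worker would be exported
// with UnicastRemoteObject.exportObject(worker, 0) and the returned stub bound
// in an RMI registry; that wiring is left out of this sketch.
```

Passing file contents across the network adds its own cost, so for the 2 MB image files it may be worth passing a path on shared storage instead.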
 