Reading Files

Status
Not open for further replies.

logic2fun

IS-IT--Management
Mar 19, 2008
8
US
Hi All,
I have a situation where we are planning to migrate from standalone applications to multithreaded processing.

On a daily basis we read approximately a million files comprising about a billion records. Today this is done by standalone applications that run on different servers; we distribute the files across those servers. Each standalone application simply reads the files, parses and transforms them, and creates a load-ready file for the database.

Would there be any benefit in moving to multithreading, where we would process files simultaneously using multiple threads, or would we be limited by the I/O channel and buffers and see no improvement over a single thread?

I don't expect a fourfold speedup, but I was anticipating at least a twofold speedup from using 4 threads and processing in parallel.

BTW, we run on quad-core processors with Red Hat Enterprise Linux. Each file is approximately 1 MB and holds approximately 1,000 records.
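
For a rough sense of scale, the figures in this post can be turned into a back-of-the-envelope estimate. The sketch below is only illustrative: it assumes "One Meg" means 10^6 bytes and that the work is spread evenly over 24 hours, neither of which the post states. It also uses Amdahl's law to estimate what fraction of the per-file work would have to run in parallel to reach a twofold speedup on 4 threads. All function names are hypothetical.

```python
# Back-of-the-envelope figures taken from the post (assumptions flagged).
FILES_PER_DAY = 1_000_000
BYTES_PER_FILE = 1_000_000       # assumption: "One Meg" taken as 10**6 bytes
SECONDS_PER_DAY = 24 * 60 * 60   # assumption: load spread evenly over 24 hours


def average_throughput_mb_per_s(files=FILES_PER_DAY,
                                size=BYTES_PER_FILE,
                                seconds=SECONDS_PER_DAY):
    """Average input rate in MB/s, summed over all servers."""
    return files * size / seconds / 1e6


def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: speedup when only part of the work can run in parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def required_parallel_fraction(target_speedup, workers):
    """Solve 1 / ((1 - p) + p / n) = s for the parallel fraction p."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / workers)
```

Under these assumptions the average input rate works out to roughly 11.6 MB/s in total, and about two thirds of the per-file work would need to parallelize cleanly for 4 threads to give a twofold gain. Time spent waiting on a shared disk counts toward the serial part.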

Any input?

Thanks
 
If the files were spread across multiple hard drives then it would definitely help to have one thread per hard drive. If it's just one RAID volume, I'm not sure if it would help much to multithread.
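
The "one thread per hard drive" idea can be sketched as follows. This is a minimal illustration in Python, not the poster's actual application: `process_file` is a hypothetical placeholder that only counts lines, and files are grouped by the `st_dev` value from `os.stat`. Note that `st_dev` identifies the mounted device or filesystem, so a single RAID or LVM volume shows up as one device even if it spans several physical disks.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_device(paths):
    """Group file paths by the device ID of the filesystem they live on."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.stat(path).st_dev].append(path)
    return dict(groups)


def process_file(path):
    """Hypothetical placeholder for the real parser: counts lines."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def process_group(paths):
    """Process the files of one device sequentially, in order."""
    return [process_file(p) for p in paths]


def process_per_device(paths):
    """Run one worker thread per device; returns {device_id: [line counts]}."""
    groups = group_by_device(paths)
    if not groups:
        return {}
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        futures = {dev: pool.submit(process_group, group)
                   for dev, group in groups.items()}
        return {dev: fut.result() for dev, fut in futures.items()}
```

Each device is read by exactly one thread, so threads do not compete for the same disk. In CPython, threads overlap I/O waits but not CPU-heavy parsing, so a real parser might need worker processes instead.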
 
Try to simulate the proposed architecture in a multiprogramming environment (start 2, 3, or 4 instances of the present application on one server). If the total throughput grows, you have a chance of winning with multithreading. If not, the situation is unclear.
As a rule, simple data file conversion is not processor-bound (most probably it is I/O-bound, which is bad news for multithreading ;). If so, pay attention to asynchronous I/O, maximum file buffering, and disk channel optimization.
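
The experiment suggested here, running the same work with 1, 2, 3, and 4 concurrent workers and comparing total throughput, could look roughly like the sketch below. It is an assumption-laden illustration in Python: `parse_file` is a hypothetical stand-in for the real parser (it only counts lines), and the helper names are invented for this example.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def parse_file(path):
    """Hypothetical stand-in for the real parser: returns the line count."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def measure_throughput(paths, workers, executor_cls=ProcessPoolExecutor):
    """Process all paths with `workers` workers.

    Returns (records, seconds, records_per_second) for the run.
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        records = sum(pool.map(parse_file, paths))
    elapsed = time.perf_counter() - start
    rate = records / elapsed if elapsed > 0 else float("inf")
    return records, elapsed, rate


def compare_worker_counts(paths, counts=(1, 2, 3, 4),
                          executor_cls=ProcessPoolExecutor):
    """Return {worker_count: records_per_second} for each count tried."""
    return {n: measure_throughput(paths, n, executor_cls)[2] for n in counts}
```

If the rate stops growing as workers are added, the job is probably I/O-bound. Repeated runs over the same files are likely to be served from the operating system's page cache, so each run should use files that have not been read recently, or the numbers will look better than production.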
 
Thanks for the response. I will try to simulate with multiple processes and see how it goes.
 