parallelism and multicore processors

davhas35 (Programmer)
Jul 31, 2008
I have a sequence container in a control flow, running on a server with 2 quad-core processors and 32 GB of RAM. In the sequence container there are 2 data flow tasks, each with a workflow replicated 4 times; incidentally, each workflow has a script component that references a COM DLL through interop.

So in all I have 8 workflows contained in 2 data flows residing in 1 sequence container. I was expecting to see all 8 cores being used; instead, one core sits near 100% utilization while the other 7 are barely used. Is there a way to get SSIS to leverage all the cores on a multicore machine?
 
Increase the EngineThreads property of the data flow so it equals the total number of tasks in that data flow.
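
If you'd rather do it in code than in the designer's property grid, something along these lines should work. This is just a rough sketch against the SSIS object model; the package path and thread counts are placeholders, and data flows nested inside containers would need a recursive walk.

Code:
// Rough sketch: raise parallelism settings across a package.
// Requires references to Microsoft.SQLServer.ManagedDTS and Microsoft.SqlServer.DTSPipelineWrap.
using Microsoft.SqlServer.Dts.Runtime;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;

class PackageTuner
{
    static void Main()
    {
        Application app = new Application();
        Package pkg = app.LoadPackage(@"C:\Packages\MyPackage.dtsx", null);  // placeholder path

        // Let the control flow run up to 10 executables at once (default is roughly cores + 2).
        pkg.MaxConcurrentExecutables = 10;

        // Only walks the top level; tasks inside containers would need recursion.
        foreach (Executable exec in pkg.Executables)
        {
            TaskHost host = exec as TaskHost;
            MainPipe pipe = (host == null) ? null : host.InnerObject as MainPipe;
            if (pipe != null)
            {
                pipe.EngineThreads = 10;  // match to the number of tasks in the data flow
            }
        }

        app.SaveToXml(@"C:\Packages\MyPackage.dtsx", pkg, null);
    }
}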

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Thanks Paul,

But by default they're set to 5. I increased it to 10, which is the number of cores + 2, and it's still running predominantly on one core.
 
For testing purposes, remove the script that calls the COM component and see if anything changes.

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
All processors were used when I took the script task out, which I find kind of odd since I thought each workflow executed in its own thread.
 
The issue may be the call out to the COM object and whether it supports multithreading. I have numerous packages with script tasks that run in parallel.

Try the Union All trick and see what happens: add a Union All transformation just before your script task. It forces a new execution tree, so the downstream work can get its own thread.

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Paul,

Thanks again for the reply. I tried the Union All trick and am still seeing the same thing.

The package runs and I get the expected results, but I have 68 million records to run through it, and at 850 to 900k rows an hour I feel like I'm back to the future and it's 1996 again.

But then again it only has to run once every 3 months or so...

Still, I read that SSIS should be able to load 1 TB of data an hour, and quite frankly I have never seen anything approaching that speed, even when going straight from a flat file to a DB. I would be satisfied with 10 GB an hour at this point.

But I digress...

Thanks again for your help.
 
It can load massive amounts of data fast, but it all depends on how you design your packages.

Why do you have multiple sources and destinations in a single data flow? If these processes can run in parallel, put them into separate data flows (you can copy/cut/paste items between data flows). That way you can give each data flow its own execution threads, allowing them to run in parallel. You may also want to try some buffer tuning to make sure you are optimizing the number of records you push through the data flow at once. If, after you break the processes out into separate data flows, you still see the same behavior, then I will lay money on the bottleneck being the COM object.
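
As a rough illustration of the buffer math (the values below are the usual SSIS defaults, and the row width is just a placeholder; check your own data flow's metadata):

Code:
// Back-of-the-envelope buffer sizing for a data flow.
// Defaults assumed: DefaultBufferSize = 10 MB, DefaultBufferMaxRows = 10,000.
using System;

class BufferMath
{
    static void Main()
    {
        int defaultBufferSize    = 10485760; // bytes (10 MB)
        int defaultBufferMaxRows = 10000;
        int estimatedRowWidth    = 200;      // bytes per row -- placeholder

        int rowsThatFit   = defaultBufferSize / estimatedRowWidth;        // ~52,000 rows
        int rowsPerBuffer = Math.Min(rowsThatFit, defaultBufferMaxRows);  // capped at 10,000

        // Here the 10,000-row cap is the binding limit, so raising DefaultBufferMaxRows
        // (and/or DefaultBufferSize) pushes more rows through per buffer.
        Console.WriteLine("Rows per buffer: {0}", rowsPerBuffer);
    }
}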

What exactly does this object do, and can you embed its logic in a script task without having to make the external call to the object?

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 
Paul,

I tried splitting them out at one point but may give it another go. I doubt it is the COM object itself, as I have a .NET app that makes essentially the same call across multiple threads and it seems to perform well, although in truth I haven't benchmarked it. I have tuned the buffers previously.

The COM object is used to standardize data, specifically addresses, parsing them into house number, street number, etc. I suppose I could use a script task, or write a custom task using .NET and worker threads.
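
Just to sketch what I mean by worker threads (the standardizer class and method below are made-up stand-ins for the real COM interop type, and this assumes the component is registered apartment-threaded):

Code:
// Rough sketch only: one COM instance per worker thread, so an apartment-threaded
// component doesn't funnel every call through a single thread (and a single core).
using System;
using System.Collections.Generic;
using System.Threading;

// Stand-in for the real COM interop class; replace with the generated interop type.
class Standardizer
{
    public string Standardize(string address) { return address; }
}

class ParallelStandardize
{
    static void Main()
    {
        string[] addresses = LoadAddresses();        // stub -- load your records however you like
        int workers = Environment.ProcessorCount;    // 8 on this box
        int chunk = addresses.Length / workers;
        List<Thread> threads = new List<Thread>();

        for (int w = 0; w < workers; w++)
        {
            int start = w * chunk;
            int end = (w == workers - 1) ? addresses.Length : start + chunk;

            Thread t = new Thread(delegate()
            {
                // Create the COM object inside the thread so each worker gets its own
                // apartment; sharing one instance is what serializes calls onto one core.
                Standardizer std = new Standardizer();
                for (int i = start; i < end; i++)
                {
                    addresses[i] = std.Standardize(addresses[i]);
                }
            });
            t.SetApartmentState(ApartmentState.STA); // matches an "Apartment" threading model
            t.Start();
            threads.Add(t);
        }

        foreach (Thread t in threads)
        {
            t.Join();
        }
    }

    static string[] LoadAddresses()
    {
        return new string[0]; // placeholder
    }
}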

David

 
The fact that you previously pulled the script task out, reran the process, and saw it utilize multiple cores is a further indicator that this is where your issue is.

Paul
---------------------------------------
Shoot Me! Shoot Me NOW!!!
- Daffy Duck
 