Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Simple question regarding parallel flows

Status
Not open for further replies.

Goodie666

Programmer
Nov 24, 2008
44
CA
Hi,

I have a question that is going to seem really stupid but I feel I have to ask anyways... Say I have a master dimension package that calls a number of children dimension packages. These packages pool data from my staging tables and do the ETL process to their dedicated final dimension tables. Each of these packages have their own .NET and OLEDB connections.

I've tried to set up my master package to call more than one children package in parallel. They start in parallel but the pooling from teh staging tables, despite the separate connections and tables seems to cause some waiting. It almost seems that my packages wait for one another instead of executing in parallel. I'm just wondering if it's just an impression or if there's something more to it, or rather how I can test this supposition?

Thanks,

Greg
 
This has more to due with SSIS and resource utilization than it does your database connections. SSIS works in memory and if the resources don't exist for it to create the objects in memory it needs then it will lag until the resources are available.

If you look at perfmon and the
Counters
Memory: AvailableMB
SSISDTSPipeline: Total Buffers
SSISDTSPipeline: Buffers Spooled
Physical Disk: Avg. Disk QUE

you will get an idea of what is going on system resource wise while executing your packages.

Also make sure that your data sources and destinations are optimized so your source queries run fast and your inserts happen quickly.
 
Thanks a lot for the hint, I'll look into perfmon counters in more detail. I would agree with you though, after a closed a few apps I had running and restarted my package, the process took a bit less time and the flows really seemed to execute in parallel.

Regards,

Greg
 
A bit more on Parallel execution in SSIS.

In the Control Flow you have a property of MaxConcurrentExecutables this value determines how many control flow items can execute in parallel. This value defaults to -1 which allows the server to determine based upon the systems configuration.

In the data flow you have a property Engine Threads which defaults to 5. This is how many items can run in parallel in your data flow. If you have 10 tasks and yuo have it set to 5 then there is a possibility that the data flows performance could be impacted as the threads have to switch between what tasks currently run. Setting the value to 10 would mean that all tasks would have a thread available to them.

With just this 2 values you can see where firing off a bunch of stuff in parallel can slow the performance of anyone package.

The types of tasks you have in a package and the order in which they are implemented can have a significant impact in addition to your buffer settings.

All of this goes back to my post regarding building packages that run and designing optimized packages that run effeciently.
 
Definitely. I have been playing a bit with these kinds of parameters recently and I've noticed that simply telling SSIS something can make a big difference compared to thinking it can magically discover or figure it out on its own. I actually appreciate this level of control it gives.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top