Most efficient way of handling tempdb

collierd2 · Feb 15, 2011

Hello

I am having some issues with tempdb.
Basically, it's growing beyond our disk space.

I know there's one piece of code causing it

I appreciate that records can be inserted a certain number at a time, which would keep this to a minimum, but I'm a little unsure of what to do with the following:

I have a piece of code
The code runs as a transform task within a DTS package
It's simply code and not a stored procedure

The code comprises 8 unions
Each of these is a reasonable size (around 1m rows)
The 8 pieces of sub code are then grouped and aggregated by a piece of code that surrounds them

e.g.

Code:

select x, y, z, sum(sales), sum(cost) from
(select x, y, z, sales, cost from a
union all
select x, y, z, sales, cost from b
-- etc. (the remaining unioned selects)
) a
group by x, y, z

Note, this is a very simplified version; in reality the joins and tables differ, but the union is possible and works

Initially, I also had grouping and summing on the 8 steps
I removed this
Are there any steps I could take to improve it?

Write each to a file and then group/aggregate to another file?
Insert x rows at a time - how if I am aggregating?
Just accept it - it's as good as it gets?

Anything else?

Any advice would be very much appreciated

Damian.

kjv1611 · Feb 15, 2011

Depending upon your needs with your data, what about using UNION rather than UNION ALL, since UNION ALL will likely end up returning dups... and since you're dealing with millions of records, then you could conceivably have millions of duplicates.

bborissov · Feb 15, 2011

If you use UNION instead of UNION ALL you will get wrong results of the aggregate function.
UNION ALL is faster, because there is no need to filter records.
How about:

Code:

select x, y, z, sum(sales), sum(cost) from
(select x, y, z, SUM(sales) AS sales,
                 SUM(cost)  AS cost
        from a
        group by x, y, z
  union all
  select x, y, z, SUM(sales) AS sales,
                  SUM(cost)  AS cost
         from b
  group by x, y, z
) a
group by x, y, z
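
To illustrate the point about UNION giving wrong aggregate results, here is a minimal sketch using made-up literal rows (the values 'A' and 100 are purely hypothetical and stand in for identical rows coming from two different source tables):

```sql
-- UNION ALL keeps both identical rows, so the sum counts each source.
SELECT x, SUM(sales) AS total_sales
FROM (
    SELECT 'A' AS x, 100 AS sales   -- hypothetical row from the first source
    UNION ALL
    SELECT 'A' AS x, 100 AS sales   -- identical hypothetical row from the second source
) AS src
GROUP BY x;
-- total_sales = 200

-- UNION removes the duplicate row before the SUM is applied.
SELECT x, SUM(sales) AS total_sales
FROM (
    SELECT 'A' AS x, 100 AS sales
    UNION
    SELECT 'A' AS x, 100 AS sales
) AS src
GROUP BY x;
-- total_sales = 100
```

The duplicate removal in UNION also generally requires a sort or hash step, which is extra work that UNION ALL avoids.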

Borislav Borissov
VFP9 SP2, SQL Server 2000,2005 & 2008.

collierd2 · Feb 16, 2011

Won't grouping then grouping again have a negative impact on both performance and tempdb?

markros · Feb 16, 2011

I believe the way suggested by Boris should be better for performance, but you can only tell for sure if you perform a comparison using SET STATISTICS IO ON before both queries.
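
A rough sketch of that comparison, run from a query window rather than from the DTS task. It assumes the simplified tables a and b from the examples above and shows only two of the eight branches:

```sql
SET STATISTICS IO ON;

-- Version 1: union the raw rows, then aggregate once.
SELECT x, y, z, SUM(sales) AS sales, SUM(cost) AS cost
FROM (
    SELECT x, y, z, sales, cost FROM a
    UNION ALL
    SELECT x, y, z, sales, cost FROM b
) AS src
GROUP BY x, y, z;

-- Version 2: aggregate each branch first, then aggregate the smaller result.
SELECT x, y, z, SUM(sales) AS sales, SUM(cost) AS cost
FROM (
    SELECT x, y, z, SUM(sales) AS sales, SUM(cost) AS cost
    FROM a
    GROUP BY x, y, z
    UNION ALL
    SELECT x, y, z, SUM(sales) AS sales, SUM(cost) AS cost
    FROM b
    GROUP BY x, y, z
) AS src
GROUP BY x, y, z;

SET STATISTICS IO OFF;
```

The messages output lists logical and physical reads per table for each query. Entries named Worktable, where they appear, are usually a sign of work being spilled to tempdb, so those are the lines worth comparing between the two versions.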

collierd2 · Feb 16, 2011

ok, thanks

I'll give that a go then
