Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Most efficient way of handling tempdb

Status
Not open for further replies.

collierd2

MIS
Feb 26, 2007
61
0
0
DE
Hello

I am having some issues with tempdb.
Basically, it's growing beyond our disc drive space.

I know, there's one piece of code causing it

I appreciate records can be inserted a certain number at a time and that would keep this to a minumum although, I'm a little unsure of what to do with the following:

I have a piece of code
The code runs as a transform task within a DTS package
It's simply code and not a stored procedure

The code comprises of 8 unions
Each of these are a reasonable size (around 1m rows)
The 8 pieces of sub code are then grouped and aggregated by a piece of code that surrounds them

e.g.

Code:
select x, y, z, sum(sales), sum(cost) from
(select x, y, z, sales, cost from a
union all
select x, y, z, sales, cost from b
etc.
) a
group by x, y, z

Note, this is a very simple interpretation, in reality joins and tables differ but the union is possible and works


Initially, I also had grouping and summing on the 8 steps
I removed this
Are there any steps I could take to improve it

Write each to a file and then group/aggregate to another file?
Insert x rows at a time - how if I am aggregating?
Just accept it - it's as good as it gets?

Anything else?


Any advice would be very much appreciated

Damian.
 
Depending upon your needs with your data, what about using UNION rather than UNION ALL, since UNION ALL will likely end up returning dups... and since you're dealing with millions of records, then you could conceivably have millions of duplicates.
 
If you use UNION instead of UNION ALL you will get wrong results of the aggregate function.
UNION ALL is faster, because there is no need to filter records.
How about:
Code:
select x, y, z, sum(sales), sum(cost) from
(select x, y, z, SUM(sales) AS Sales,
                 SUM(cost)  AS Costs
        from a
        group by x, y, z
  union all
  select x, y, z, SUM(sales) AS Sales,
                  SUM(cost)  AS Costs
         from b
  group by x, y, z
) a
group by x, y, z

Borislav Borissov
VFP9 SP2, SQL Server 2000,2005 & 2008.
 
Won't grouping then grouping again have a negative impact on both performance and tempdb?
 
I believe the way suggested by Boris should be better for performance, but you can only tell for sure if you perform a comparison using SET STATISTICS IO ON before both queries.

PluralSight Learning Library
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top