temp tables

nc297 · Dec 22, 2010

I have two questions...

1. I have this long query where I want to get certain information out of and placed into a temp table. Is this the most efficient way or should I use a CTE?

SELECT n.DOC, ISNULL(pendinfo1, 0) AS PendInFO1, ISNULL(pendinfo2,0) AS pendinfo2, ISNULL(pendinfo3,0) AS pendinfo3,
ISNULL(pendFOover1,0) AS pendFOover1,ISNULL(pendFOover2,0) AS pendFOover2,ISNULL(pendFOover3,0) AS pendFOover3

into #temppd
FROM Natdocfile AS n

Left Join
---Pending in FO1
(Select t.doc, ISNULL(COUNT(t.clmn ), 0)as pendinfo1
from t2dibpend t
where DIB_MVT_SEQ_NUM IS NULL and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
group by t.doc)a
on n.doc = a.doc
Left Join

--Pending in FO2
(select t.doc, ISNULL(COUNT(t.clmn ), 0) as PendInFO2, max(dib_mvt_seq_num) as MaxFO2
From t2DibPend AS t
where(DIB_MVT_TYP='r') and not (org_id like 'R%' or org_id like 'S%' or org_id like 'V%' or
org_id like 'P%') and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
group by t.doc)b
on n.doc=b.doc
LEFT JOIN

--Pending in FO3
(select t.doc, ISNULL(COUNT(t.clmn ), 0) as PendInFO3, max(dib_mvt_seq_num)as MaxFO3
From t2DibPend t
where (DIB_MVT_TYP='t')
group by t.doc)c
on n.doc = c.doc
LEFT JOIN

--Pending in FO1 > 300
(Select t.doc, ISNULL(COUNT(t.clmn ), 0) as PendFOover1
from t2dibpend t
where DIB_MVT_SEQ_NUM IS NULL and app_rcpdt< dateadd(dd,-300,getdate()) and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
group by t.doc)d
on n.doc = d.doc
LEFT JOIN

--Pending in FO2 > 300
(select t.doc, ISNULL(COUNT(t.clmn ), 0) as PendFOover2, max(dib_mvt_seq_num) as MaxoverFO2
From t2DibPend AS t
where(DIB_MVT_TYP='r') and not (org_id like 'R%' or org_id like 'S%' or org_id like 'V%' or
org_id like 'P%') and app_rcpdt< dateadd(dd,-300,getdate()) and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
group by t.doc)f
on n.doc=f.doc
LEFT JOIN
--Pending in FO3 > 300
(select t.doc, ISNULL(COUNT(t.clmn ), 0) as PendFOover3, max(dib_mvt_seq_num)as MaxoverFO3
From t2DibPend t
where (DIB_MVT_TYP='t')and app_rcpdt< dateadd(dd,-300,getdate()) and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
group by t.doc)g
on n.doc=g.doc

2. I didn't know how to combine all three of the queries together so I did them separately and in the other table I'm adding them up to get what I want. So in the next query I'm doing this to get what I want:

(SELECT t.doc, SUM(t.pendinfo1 + t.pendinfo2 + t.pendinfo3) AS pendinfo
FROM (SELECT *
FROM #temppd) t
GROUP BY t.doc)d
ON n.doc = d.doc

LEFT JOIN
(SELECT t.doc, sum(t. PendFOover1 + t.PendFOover2 + t.PendFOover3) as pendfo300
FROM (SELECT *
FROM #temppd) t
GROUP BY t.doc)f
ON n.doc = f.doc

Is there an easier way to combine all? It does give me what I want.

nc297 · Dec 22, 2010

Sorry for all of the questions but creating subqueries are new to me. If it's too much don't worry about it as it's big but I get the results that I want. I didn't send the create table info as it's so much. Thanks!

markros · Dec 22, 2010

If I need to create a complex query consisting of multiple subqueries, I usually approach it this way:

Code:

;with cte1 as (my FIRST SUBQUERY),
cte2 as (my Second Subquery)

select c1.Fields, c2.Fields from cte1 c1
inner join cte2 c2 on c1.PK = c2.FK

In other words, to improve maintainability and readability I try to break the query into smaller pieces and then combine together.

PluralSight Learning Library

gmmastros · Dec 22, 2010

There's a lot of simplifications that can be done here. Is your primary concern here related to performance?

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

nc297 · Dec 22, 2010

Yes it is. I would like it to run faster.

nc297 · Dec 22, 2010

So you are saying to put them both in two different CTE's to make it more readable? Thanks.

gmmastros · Dec 22, 2010

Looking at your code, I do see plenty of opportunity for performance improvements (as well as code simplification).

For example, I see:
[tt]
and not (org_id like 'R%'
or org_id like 'S%'
or org_id like 'V%'
or org_id like 'P%')
[/tt]

This can be simplified to:
[tt]And Not (org_id like '[RSVP]%')[/tt]

This basically says, "Where the first letter of org_id is R or S Or V Or P".

But... this will probably not help your performance at all. This is merely code simplification, but it does (in my opinion) make the code easier to read.

From a performance perspective, the biggest improvement I see is to combine the 6 derived tables in to 1. When I looked at your query, I noticed that there are 6 left join, and that each left join was to a derived table. Each of the derived table was using a single table and each derived table joined to the base table (Natdocfile) on the same column. It is all of these similarities between the 6 queries that got me to thinking.... how can we reduce this to one query? Well... each query is has a different where clause, as we seem to be returning counts and max's. So... we can move the where clause in to the count and/or max allowing us to perform all of the calculations in one query. Something like this...

Code:

SELECT  n.DOC, 
        ISNULL(pendinfo1, 0) AS PendInFO1, 
        ISNULL(pendinfo2,0) AS pendinfo2, 
        ISNULL(pendinfo3,0) AS pendinfo3,
        ISNULL(pendFOover1,0) AS pendFOover1,
        ISNULL(pendFOover2,0) AS pendFOover2,
        ISNULL(pendFOover3,0) AS pendFOover3
into    #temppd
FROM    Natdocfile AS n 
        Left Join
          (
          Select t.doc, 
                 ---Pending in FO1
                 ISNULL(COUNT(Case When DIB_MVT_SEQ_NUM IS NULL 
                                        and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                                   Then t.clmn End), 0)as pendinfo1,
                 --Pending in FO2						                  
                 ISNULL(COUNT(Case When DIB_MVT_TYP='r'
                                        and not (org_id like '[RSVP]%') 
                                        and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                                   Then t.clmn End ), 0) as PendInFO2, 
                 max(Case When DIB_MVT_TYP='r'
                               and not (org_id like '[RSVP]%') 
                               and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                          Then dib_mvt_seq_num End) as MaxFO2,
                 
                 --Pending in FO3
                 ISNULL(COUNT(Case When DIB_MVT_TYP='t' Then t.clmn End), 0) as PendInFO3, 
                 max(Case When DIB_MVT_TYP='t' Then dib_mvt_seq_num End)as MaxFO3,

                 --Pending in FO1 > 300
                 ISNULL(COUNT(Case When DIB_MVT_SEQ_NUM IS NULL 
                                        and app_rcpdt < dateadd(dd,-300,getdate()) 
                                        and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                                   Then t.clmn End ), 0) as PendFOover1,
                 --Pending in FO2 > 300
                 ISNULL(COUNT(Case When DIB_MVT_TYP='r'
                                        and not (org_id like '[RSVP]%') 
                                        and app_rcpdt < dateadd(dd,-300,getdate()) 
                                        and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                                   Then t.clmn End), 0) as PendFOover2, 
                 max(Case When DIB_MVT_TYP='r'
                               and not (org_id like '[RSVP]%') 
                               and app_rcpdt < dateadd(dd,-300,getdate()) 
                               and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                          Then dib_mvt_seq_num) as MaxoverFO2,

                 --Pending in FO3 > 300
                 ISNULL(COUNT(Case When DIB_MVT_TYP='t' 
                                        and app_rcpdt < dateadd(dd,-300,getdate()) 
                                        and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                                   Then t.clmn End), 0) as PendFOover3, 
                 max(Case When DIB_MVT_TYP='t' 
                               and app_rcpdt < dateadd(dd,-300,getdate()) 
                               and t.clmn in (select clmn from specnew.dbo.people where completedt is null)
                          Then dib_mvt_seq_num End)as MaxoverFO3 

                 from t2dibpend t
                 group by t.doc) a
              on n.doc = a.doc

Notice how the where clause criteria was moved in to the aggregate functions. This should allow the aggregates to return the same output. More importantly, notice that the t2dibpend table is only referenced once. If I had to guess, I would say that this will reduce the query time by a factor of 5 (5 times faster).

Can you please post the execution time for your original query, and the execution time for this one. Also, make sure you verify that both queries return the same data.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

nc297 · Dec 22, 2010

WOW thanks gmmastros! I see what you have done and thanks for helping me simplify things too.

I tried it and getting:

Msg 130, Level 15, State 1, Line 1
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.

Line one is
SELECT n.DOC,

My query takes 45 seconds to run!

gmmastros · Dec 22, 2010

I almost never use subqueries, so I didn't realize this restriction existed. This can be fixed.

try this:

Code:

SELECT  n.DOC, 
        ISNULL(pendinfo1, 0) AS PendInFO1, 
        ISNULL(pendinfo2,0) AS pendinfo2, 
        ISNULL(pendinfo3,0) AS pendinfo3,
        ISNULL(pendFOover1,0) AS pendFOover1,
        ISNULL(pendFOover2,0) AS pendFOover2,
        ISNULL(pendFOover3,0) AS pendFOover3
into    #temppd
FROM    Natdocfile AS n 
        Left Join
          (
          Select t.doc, 
                 ---Pending in FO1
                 ISNULL(COUNT(Case When DIB_MVT_SEQ_NUM IS NULL 
                                        and People.clmn is not null
                                   Then t.clmn End), 0)as pendinfo1,
                 --Pending in FO2						                  
                 ISNULL(COUNT(Case When DIB_MVT_TYP='r'
                                        and not (org_id like '[RSVP]%') 
                                        and People.clmn is not null
                                   Then t.clmn End ), 0) as PendInFO2, 
                 max(Case When DIB_MVT_TYP='r'
                               and not (org_id like '[RSVP]%') 
                               and People.clmn is not null
                          Then dib_mvt_seq_num End) as MaxFO2,
                 
                 --Pending in FO3
                 ISNULL(COUNT(Case When DIB_MVT_TYP='t' Then t.clmn End), 0) as PendInFO3, 
                 max(Case When DIB_MVT_TYP='t' Then dib_mvt_seq_num End)as MaxFO3,

                 --Pending in FO1 > 300
                 ISNULL(COUNT(Case When DIB_MVT_SEQ_NUM IS NULL 
                                        and app_rcpdt < dateadd(dd,-300,getdate()) 
                                        and People.clmn is not null
                                   Then t.clmn End ), 0) as PendFOover1,
                 --Pending in FO2 > 300
                 ISNULL(COUNT(Case When DIB_MVT_TYP='r'
                                        and not (org_id like '[RSVP]%') 
                                        and app_rcpdt < dateadd(dd,-300,getdate()) 
                                        and People.clmn is not null
                                   Then t.clmn End), 0) as PendFOover2, 
                 max(Case When DIB_MVT_TYP='r'
                               and not (org_id like '[RSVP]%') 
                               and app_rcpdt < dateadd(dd,-300,getdate()) 
                               and People.clmn is not null
                          Then dib_mvt_seq_num End) as MaxoverFO2,

                 --Pending in FO3 > 300
                 ISNULL(COUNT(Case When DIB_MVT_TYP='t' 
                                        and app_rcpdt < dateadd(dd,-300,getdate()) 
                                        and People.clmn is not null
                                   Then t.clmn End), 0) as PendFOover3, 
                 max(Case When DIB_MVT_TYP='t' 
                               and app_rcpdt < dateadd(dd,-300,getdate()) 
                               and People.clmn is not null
                          Then dib_mvt_seq_num End)as MaxoverFO3 

                 from t2dibpend t
                      Left Join specnew.dbo.people as People
                        On t.clmn = People.clmn
                        And People.completedt is null
                 group by t.doc) a
              on n.doc = a.doc

If this works (and returns the correct data), I will explain the change I made.

Also... if there can be duplicate clmn numbers in the people table, this query is likely to return incorrect numbers. If that's the case, let me know and I will modify this again.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

nc297 · Dec 22, 2010

GREAT it works! It only took 19 seconds to run too!

What did you change?

There are duplicates in the table because of the DIB_MVT_SEQ_NUM. that's okay because I'm getting the max DIB_MVT_SEQ_NUM which only returns one record.

Thanks!

gmmastros · Dec 22, 2010

I removed the subqueries and did a left join instead. Here:

Code:

                 from t2dibpend t
                      Left Join specnew.dbo.people as People
                        On t.clmn = People.clmn
                        And People.completedt is null

Then, where the sub queries existed...

Code:

ISNULL(COUNT(Case When DIB_MVT_SEQ_NUM IS NULL 
                       and [!]People.clmn is not null[/!]
                  Then t.clmn End), 0)as pendinfo

19 seconds is certainly an improvement, but it still seems awfully slow. Can you answer a few questions for me?

1. How many rows are in the t2dibpend?
2. How many rows are in specnew.dbo.people?

3. How many rows do you get from this query?

[tt]
Select Count(*)
From specnew.dbo.people
Where completedt is null
[/tt]

4. What indexes do you have on specnew.dbo.people?
[tt]sp_helpindex 'specnew.dbo.people'[/tt]

I'm thinking that the next stage of query optimization is likely to involve adding indexes to the table(s). Is this something you are allowed to do within your database?

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

nc297 · Dec 22, 2010

Answers to your questions:

1. How many rows are in the t2dibpend? -- 1480058
2. How many rows are in specnew.dbo.people?-- 424383

3. How many rows do you get from this query?-- 89677

Select Count(*)
From specnew.dbo.people
Where completedt is null

4. What indexes do you have on specnew.dbo.people?
sp_helpindex 'specnew.dbo.people'

I don't know how can I tell what the index is on? I can't index the people table but I can on the t2dibpend table.

nc297 · Dec 22, 2010

Here's the People table script info.

CREATE TABLE [dbo].[people](
[clmn] [char](6) NOT NULL,
[lname] [char](6) NULL,
[fi] [char](1) NULL,
[dob] [datetime] NULL,
[sex] [char](1) NULL,
[completedt] [datetime] NULL,
[pendingdt] [datetime] NULL,
CONSTRAINT [PK_PEOPLE_CLMN] PRIMARY KEY CLUSTERED
(
[clmn] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF

gmmastros · Dec 22, 2010

I was thinking that an index on the people table that had 2 columns (clmn,completedt) would probably help this query. But, it looks like 20+% of the rows have null for completedt, so that would probably do an index scan anyway.

Can you show me what indexes exist on t2dibpend?

IN a query window, run:

sp_helpindex 't2dibpend'

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

nc297 · Dec 22, 2010

This is what I have:

PK_T2DibPend1 clustered, unique, primary key located on PRIMARY T12id

gmmastros · Dec 22, 2010

try this:

Code:

Create Index idx_t2DibPend_doc On T2DibPend(doc)

Then, run the query again. Did it get any faster?

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

nc297 · Dec 22, 2010

Bummer I just left my office. I will have to try it again in the morning. Thanks again for all of your help.

nc297 · Dec 23, 2010

Good Morning, it took 15 seconds to run. That's fine for me.

I have a couple of questions to help me understand all a bit better.

So I'm using a Case statement which takes the place of the where clause?

Could you explain this case statement:
1. ---Pending in FO1
ISNULL(COUNT(Case When DIB_MVT_SEQ_NUM IS NULL
and People.clmn is not null
Then t.clmn End), 0)as pendinfo1

I see you moved the two tables down to the bottom why?
2.from t2dibpend t
Left Join specnew.dbo.people as People
On t.clmn = People.clmn
And People.completedt is null

Why did you split this one up with the count then the max?
3. --Pending in FO3 > 300
ISNULL(COUNT(Case When DIB_MVT_TYP='t'
and app_rcpdt < dateadd(dd,-300,getdate())
and People.clmn is not null
Then t.clmn End), 0) as PendFOover3,

max(Case When DIB_MVT_TYP='t'
and app_rcpdt < dateadd(dd,-300,getdate())
and People.clmn is not null
Then dib_mvt_seq_num End)as MaxoverFO3

Again thanks for all of your help!

gmmastros · Dec 23, 2010

1. Could you explain this case statement:

I think it is probably easier to explain this with an example. Before I do, it's important you realize that aggregate functions ignore NULL values.

To see this in action...

Code:

Declare @Temp Table(Data Int)

Insert Into @Temp Values(1)
Insert Into @Temp Values(2)
Insert Into @Temp Values(3)
Insert Into @Temp Values(NULL)
Insert Into @Temp Values(NULL)

Select Count(*) As AllRowCount,
       Count(Data) As NonNullRowCount
From   @Temp

Copy/paste that code in to a query window and run it. You'll see that AllRowCount = 5 and NonNullRowCount = 3. This NonNullRowCount = 3 because there are 3 rows that are not null (the rows with 1, 2, and 3).

Next thing to realize is that Case/When will return NULL if none of the conditions are satisfied. Notice that in the code I wrote earlier (yesterday), the case statements don't have an ELSE clause. If the data in a case/when doesn't satisfy any condition, NULL is returned.

Ex:

Code:

Declare @Temp Table(Data Int)

Insert Into @Temp Values(1)
Insert Into @Temp Values(2)
Insert Into @Temp Values(3)
Insert Into @Temp Values(NULL)
Insert Into @Temp Values(NULL)

Select Data,
       Case When Data <= 2 
            Then Data 
            Else NULL 
            End As CaseColumn
From   @Temp

[tt][blue]

Data CaseColumn
----------- -----------
1 1
2 2
3 NULL
NULL NULL
NULL NULL
[/blue][/tt]

Notice that the case statement effectively returns the data column if the data is less than or equal to 2.

Now (and here's the key), if we COUNT the 'case column', we will effectively get the number of rows where data is less than or equal to 2.

Code:

Declare @Temp Table(Data Int)

Insert Into @Temp Values(1)
Insert Into @Temp Values(2)
Insert Into @Temp Values(3)
Insert Into @Temp Values(NULL)
Insert Into @Temp Values(NULL)

Select Count(Case When Data <= 2 
                  Then Data 
                  Else NULL 
                  End) As CaseColumnCount
From   @Temp

We could have accomplished the same thing with a where clause.

Code:

Declare @Temp Table(Data Int)

Insert Into @Temp Values(1)
Insert Into @Temp Values(2)
Insert Into @Temp Values(3)
Insert Into @Temp Values(NULL)
Insert Into @Temp Values(NULL)

Select Count(Data) As SomeCount
From   @Temp
Where  Data <= 2

Both of the previous 2 queries return the same result. It's important to understand the differences in the queries though. I'm specifically talking about HOW sql server calculates the results for you. The 2 queries may return the same data, but sql server uses different methods to return the data. If there is an index on the data column, then it is faster to use a where clause because SQL Server will use the index to locate the matching rows. When you use the case/when method, SQL Server will examine every row and process the case/when on the data. Since SQL Server cannot use an index to quickly exclude rows, the case/when method will be slower.

Ok, so now you're thinking... the case/when method is slower, but your query runs faster. How can this be? Well, your query was essentially using 6 queries to accumulate the data. With my modified query, there's just one. There are more calculations performed in that one query, and every row MUST be examined, but this is still faster than looking at the table 6 times.

2. I see you moved the two tables down to the bottom why?

The 2nd table (specnew.dbo.people) needed to be moved because you originally had it as a subquery. SQL Server does not allow subqueries in aggregate functions (like count, sum, min, max, avg, etc...).

This table was used to determine the existence of clmn in the table. By left joining to this table and checking for nulls, we are essentially doing the same thing, but without needing to use a subquery.

3. Why did you split this one up with the count then the max?

The explanation for this is the same as question number 1. Your original query had a COUNT and a MAX with the same where clause. Since the where clause was different between the 6 queries, I needed to move the where clause in to the aggregate function and implement it with a case/when so that we can check the nulls (and count the non-nulls).

By examining the data once and performing all of the calculations (instead of 6 times) we reduce the query time.

Does this make sense?

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

nc297 · Dec 23, 2010

All I can say is WOW and thank you so much gmmastros! I do understand this and it makes all sense. I have another query that I set "my way" which I will go in and change. This was is much easier to read.

Funny you answered my question regarding why not use the where clause.

One thing about Indexing...Do I always place it in a table and on the PK? Can I have two indexes in the same table? Is there max number of indexs for a table?

You have helped me learn so much I really appreciate your help (the other guys too) ;-) Thanks so much for answering my posts!

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

temp tables

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Programmer

Similar threads

Log in

Part and Inventory Search

Sponsor