Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Query Performance Improvement Tips Welcomed! 2

Status
Not open for further replies.

barnettjacob

IS-IT--Management
Aug 12, 2011
31
GB
Hi, I've got the following query that I use in PowerPivot to bring through a chunk of the product table from our SQL Server Database. It worked brilliantly until I decided to throw a load of other stuff in there in terms of pulling in sales and goods receipt data from other tables - from a PowerPivot point of view this is hugely beneficial however its now taking an eternity for the thing to run. If anybody has any ideas as to what I could do the improve performance that would be appreciated!

Regards
Jacob

SQL:
select

Distinct(p.ProductCode) as 'Productcode',
Description,
websiteshortdescription as 'Description 2',

(case 
when p.analysis1code = 'baby' or p.analysis1code = 'home' or p.analysis1code = 'MC H&T' then 'H&T' 
when p.analysis1code = 'mcclothing' or p.analysis1code = 'fashion' then 'Apparel' 
when p.analysis1code = 'elc' or p.analysis1code = 'giftware' then 'Toy and Gift' else 'other' end) as 'Main Division',

(select d.description from eee..analysis2 d where d.analysis2code=p.analysis2code) as Department,
(select e.description from eee..analysis3 e where e.analysis3code=p.analysis3code) as Class,
(select f.description from eee..analysis4 f where f.analysis4code=p.analysis4code) as 'Sub Class',
p.Analysis6Code,
p.Analysis7Code as Concession,
p.Analysis5Code as Story,
p.analysis8code as Origin,
(select g.name from eee..supplier g where g.suppliercode=p.supplier1) as Supplier,
c.BuyPrice1 as Cost,
p.StyleCode as Style,
p.ColourCode as Colour,
p.SizeCode as Size,
Purchasepacksize as 'Pack Size',
p.SeasonCode as 'Season',
pr.RetailPrice as 'Retail Price',
pr.fullprice as 'Full Price',

p.extrainfo6 as 'Compliance Status',
p.extrainfo7 as 'Price Status',
p.extrainfo15 as 'Order Type',

(select g.description from eee..brand g where g.brandcode=p.brandcode) as Brand,

(SELECT CONVERT(VARCHAR(10), (select min(sm.date) from eee..stockmovementtbl sm where sm.description = 'goods receipt' and sm.productcode = p.productcode), 103)) as 'First Receipt',
(SELECT CONVERT(VARCHAR(10), (select max(sm.date) from eee..stockmovementtbl sm where sm.description = 'goods receipt' and sm.productcode = p.productcode), 103)) as 'Last Receipt',
(SELECT CONVERT(VARCHAR(10), (select min(s.saledate) from eee..vwsaleline_gp s where s.productcode = p.productcode), 103)) as 'First Sale',
(SELECT CONVERT(VARCHAR(10), (select max(s.saledate) from eee..vwsaleline_gp s where s.productcode = p.productcode), 103)) as 'Last Sale',

(select sum(coalesce(s.fcextendednetamount-s.fcextendedtaxamount,0)) from eee..vwsaleline_gp s where s.productcode = p.productcode and s.saledate >= '07/04/12') as 'YTD Sales',
(select sum(coalesce(s.grossprofitalternateGP,0)) from eee..vwsaleline_gp s where s.productcode = p.productcode and s.saledate >= '07/04/12') as 'YTD GP$',
(select sum(coalesce(s.quantity,0)) from eee..vwsaleline_gp s where s.productcode = p.productcode and s.saledate >= '07/04/12') as 'YTD Units'

from eee..product p
left join eee..Cost c on c.ProductCode = p.ProductCode and c.BranchCode = '99'
left join eee..price pr on pr.ProductCode = p.ProductCode and pr.BranchCode = '99'


where (Purge = 0 or (select sum(s.fcextendednetamount-s.fcextendedtaxamount) from eee..vwsaleline_gp s where s.productcode = p.productcode and s.saledate >= '06/27/10') <> 0)

 
The performance problems you are experiencing can be resolved by rewriting the query to not use sub-queries. Just to clarify, a sub-query is when you write a query as a column in a larger query, like this:

Code:
select  ([!] select d.description 
          from   eee..analysis2 d 
          where  d.analysis2code=p.analysis2code[/!]) as Department
from    eee..product p

Sub-queries are evaluated once per row. So... if there's 1000 rows in the product table, the sub-query will be evaluated 1000 times. By rewriting this to use a derived table (or a common table expression) your performance will improve.

Ex:


Code:
select  d.description As Department
from    eee..product p
        Left Join eee..analysis2 d
          on d.analysis2code=p.analysis2code

In your original query, you have a lot of sub-queries. Many of them are using the same table and have similar join conditions. Some of the where clauses are different, but those can be handled by using a case statement so that the correct results are returned.

Like this:

Code:
select  Distinct(p.ProductCode) as 'Productcode',
        Description,
        websiteshortdescription as 'Description 2',
        (case when p.analysis1code In ('baby','home','MC H&T')
		      then 'H&T' 
              when p.analysis1code In ('mcclothing','fashion')
			  then 'Apparel' 
              when p.analysis1code In ('elc','giftware')
			  then 'Toy and Gift' 
			  else 'other' end) as 'Main Division',
        d.description as Department,
        e.description as Class,
        f.description as 'Sub Class',
        p.Analysis6Code,
        p.Analysis7Code as Concession,
        p.Analysis5Code as Story,
        p.analysis8code as Origin,
        g.name as Supplier,
        c.BuyPrice1 as Cost,
        p.StyleCode as Style,
        p.ColourCode as Colour,
        p.SizeCode as Size,
        Purchasepacksize as 'Pack Size',
        p.SeasonCode as 'Season',
        pr.RetailPrice as 'Retail Price',
        pr.fullprice as 'Full Price',
        
        p.extrainfo6 as 'Compliance Status',
        p.extrainfo7 as 'Price Status',
        p.extrainfo15 as 'Order Type',

        g.description as Brand,
        
        CONVERT(VARCHAR(10), sm.FirstReciipt, 103) as 'First Receipt',
        CONVERT(VARCHAR(10), sm.LastReceipt, 103) as 'Last Receipt',
        CONVERT(VARCHAR(10), s.MinSaleDate, 103) as 'First Sale',
        CONVERT(VARCHAR(10), s.MaxSaleDate, 103) as 'Last Sale',
        
        Coalesce(s.YTDSales, 0) as 'YTD Sales',
        coalesce(s.YTDGP$, 0) as 'YTD GP$',
        coalesce(s.YTDUnits,0) as 'YTD Units'
from    eee..product p
        left join eee..Cost c 
		   on c.ProductCode = p.ProductCode and c.BranchCode = '99'
        left join eee..price pr 
		   on pr.ProductCode = p.ProductCode and pr.BranchCode = '99'
		Left Join (
		    Select  Max(SaleDate) As MaxSaleDate,
			        Min(SaleDate) As MinSaleDate,
					Sum(Case When SaleDate >= '07/04/12' 
					         Then s.fcextendednetamount-s.fcextendedtaxamount 
							 Else 0 End) As YTDSales,
				    Sum(Case When SaleDate >= '07/04/12'
					         Then grossprofitalternateGP
							 Else 0 End) as YTDGP$,
					Sum(Case When SaleDate >= '07/04/12'
					         Then s.quantity
							 Else 0 End) As YTDUnits,
					Sum(Case When SaleDate >= '06/27/10' 
					         Then s.fcextendednetamount-s.fcextendedtaxamount 
							 Else 0 End) As OtherSalesAmount,
					ProductCode
			From    eee.vwsaleline_gp
			Group By ProductCode
			) As s
			On p.ProductCode = s.ProductCode
		Left Join (
            select min(sm.date) As LastReceipt,
			       max(sm.date) As FirstReceipt,
				   ProductCode
		    from   eee..stockmovementtbl sm 
			where  sm.description = 'goods receipt'
			Group By ProductCode
            ) As sm
			On sm.productcode = p.productcode
		Left Join eee.analysis2 d
            on d.analysis2code=p.analysis2code
        Left Join eee..analysis3 e 
		    on e.analysis3code=p.analysis3code
        Left Join eee..analysis4 f 
		    on f.analysis4code=p.analysis4code
		Left Join eee..supplier g 
		    on g.suppliercode=p.supplier1
		Left Join eee..brand g 
		    on g.brandcode=p.brandcode
where   (Purge = 0 or S.OtherSalesAmount <> 0)

I wasn't able to test the query, so there may be minor syntax errors, but the approach to fixing the performance issues is sound.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
George, I can't say thankyou enough! This is amazing, it turns an 11 minute query into one that takes 48 seconds.
Regards
Jacob
 
Honestly, I expected better performance than that. 48 seconds still seems like an eternity while waiting for data.

Have you looked at the execution plan to make sure you are doing index seeks instead of scans? Do you know how to look at an execution plan? I wouldn't be surprised if adding indexes to your tables would further improve performance.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
George, you've lost me slightly now!

The tables definitely have indexes (at least according to the schema). With the assistance of Google I tried to look at the execution plan and apparently I don't have permission (I'm using windows authentication).

The question is, if I can get a look at the plan what can I do to improve the performance?

Thanks again
Jacob
 
The query plan will give you insight in to how SQL Server executes your query. Even though there may be indexes on the table, that does not mean that those indexes will be useful to the query.

For example, suppose I handed you a telephone book and told you to give me the phone number for Barack Obama. You would quickly skip through the book and find the entry. It would only take you a couple seconds. Now, suppose I told you to find all the phone numbers for anyone with a first name of George. Now, you would need to look at each entry to determine if it matched your query criteria. Instead of a couple seconds, it would probably take a couple days.

Indexes in SQL are similar to that phone book. Depending on the query, and index will be extremely helpful (because it allows for index seeks) or not helpful at all (because it uses an index scan).

Indexes can have multiple keys (columns). Thinking again about the phone book, names are sorted by last name, and then by first name.

A table can have multiple indexes. Suppose there were 2 phone books. One sorted by Last Name then first name, and another phone book sorted by first name and then last name. This would support many different queries.

One problem with multiple indexes is that it costs performance whenever you want to modify the data. Specifically, inserts, updates, and deletes will take slightly longer because the indexes also need to be changed. The more indexes you have, the longer this process will take. Having several indexes on the same table is not a problem, having too many indexes on a table is. Creating effective indexes for a table is much like a balancing act. You need to find the right combination of indexes to speed up your queries without adversely affecting other queries.

You don't need any special permissions to view the execution plan for a query. There are several different ways to get the execution plan. You can see a graphical representation of it, or a text representation.

To view the graphical representation of a query plan, open SQL Server Management Studio, open a query window. Load your query. Press CTRL-M. You won't notice anything happening when you press CTRL-M, but now, when you run the query, there will be a new "Execution Plan" tab. This tab will show you each step that SQL Server goes through to execute the query. In particular, look for table scans, and index scans. If you see these, then there is a good opportunity for performance improvement by adding additional indexes.

You can also show the same information in text (suitable for pasting in to a forum post). To do this...

Code:
Set SHOWPLAN_TEXT ON
GO
-- Put your query here
go
Set SHOWPLAN_TEXT OFF



-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top