Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Need help optimizing large stored procedure 1

Status
Not open for further replies.

BobRodes

Instructor
May 28, 2003
4,215
US
We have a stored proc that runs VERY slowly, and upon examination it was clear even to me that many sins have been committed in the design here. I've been asked to suggest what I would do, and wanted to avail myself of everyone's experience. So, I have a few questions:

1. Does Count(*) slow everything down? Would it be better to count a single field?
2. Is it better to use CASE statements than going through the same table once for each field?
3. None of these tables are keyed. How much performance can I expect to get by keying the tables and indexing them appropriately?
4. Any other suggestions?

Here's the proc. I've abbreviated it and changed all the names of tables and fields for regulatory reasons.

TIA for any help.

Code:
USE [MYDB]
GO
/****** Object:  StoredProcedure [dbo].[MYDB_Summary]   ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[MYDB_Summary] 
AS
BEGIN
SET NOCOUNT ON;

-- ===================================================================================
-- Set TANKERS sort fields
-- ===================================================================================

UPDATE	   [MYDB].[dbo].[TANKERS]
SET			SortTankerTaxID = TaxID,
   			SortTankerName  = Name,
			SortTankerState = State;

UPDATE	   [MYDB].[dbo].[TANKERS]
SET			SortTankerTaxID = CorrectedTaxID
WHERE       CorrectedTaxID <> ' ';

UPDATE	   [MYDB].[dbo].[TANKERS]
SET			SortTankerName  = CorrectedName
WHERE       CorrectedName <> ' ';

UPDATE	   [MYDB].[dbo].[TANKERS]
SET			SortTankerState = CorrectedState
WHERE       CorrectedState <> ' ';



-- ===================================================================================
-- Set MANIFESTS tanker information
-- ===================================================================================

UPDATE	   [MYDB].[dbo].[MANIFESTS]
SET			TankerName  = (SELECT SortTankerName
						 	 FROM  [MYDB].[dbo].[TANKERS]
							 WHERE  MANIFESTS.TankerID = TANKERS.TankerID),
   			TankerState = (SELECT SortTankerState
							 FROM  [MYDB].[dbo].[TANKERS]
							 WHERE  MANIFESTS.TankerID = TANKERS.TankerID);

*********** Same sort of thing in 3 other tables.



-- ===================================================================================
-- Set DRIVERS totals
-- ===================================================================================

UPDATE	   [MYDB].[dbo].[DRIVERS]
SET			UCFCodesReceived = (SELECT COUNT(*) FROM [MYDB].[dbo].[LOADS]
						         WHERE  LOADS.PackListID = DRIVERS.PackListID
						         AND    LOADS.FuelTypeID    = DRIVERS.FuelTypeID
						         AND    LOADS.RequestID    = DRIVERS.RequestID
						         AND    LOADS.TankerID   = DRIVERS.TankerID
						         AND    LOADS.DriverID     = DRIVERS.DriverID);

UPDATE	   [MYDB].[dbo].[DRIVERS]
SET			UCFCodesReceived = 0
WHERE       UCFCodesReceived IS NULL;

UPDATE	   [MYDB].[dbo].[DRIVERS]
SET			RAUCFCodesReceived = (SELECT COUNT(*) FROM [MYDB].[dbo].[LOADS]
						           WHERE  LOADS.PackListID = DRIVERS.PackListID
						           AND    LOADS.FuelTypeID    = DRIVERS.FuelTypeID
						           AND    LOADS.RequestID    = DRIVERS.RequestID
						           AND    LOADS.TankerID   = DRIVERS.TankerID
						           AND    LOADS.DriverID     = DRIVERS.DriverID
 				                   AND    LOADS.ICCCode <> ' ');


UPDATE       [MYDB].[dbo].[DRIVERS]
SET            RAUCFCodesReceived = 0
WHERE       RAUCFCodesReceived IS NULL;

********** 4 more of these for other summary fields


-- ===================================================================================
-- Set MANIFESTS totals
-- ===================================================================================

UPDATE	   [MYDB].[dbo].[MANIFESTS]
SET			Requests = (SELECT COUNT(*) FROM [MYDB].[dbo].[DRIVERS]
						WHERE  DRIVERS.PackListID = MANIFESTS.PackListID
						AND    DRIVERS.FuelTypeID    = MANIFESTS.FuelTypeID
						AND    DRIVERS.RequestID    = MANIFESTS.RequestID
						AND    DRIVERS.TankerID   = MANIFESTS.TankerID);

UPDATE	   [MYDB].[dbo].[MANIFESTS]
SET			ManCt = 0
WHERE       ManCt IS NULL;

UPDATE	   [MYDB].[dbo].[MANIFESTS]
SET			Responses = (SELECT COUNT(*) FROM [MYDB].[dbo].[DRIVERS]
						 WHERE  DRIVERS.PackListID = MANIFESTS.PackListID
						 AND    DRIVERS.FuelTypeID    = MANIFESTS.FuelTypeID
						 AND    DRIVERS.RequestID    = MANIFESTS.RequestID
						 AND    DRIVERS.TankerID   = MANIFESTS.TankerID
                         AND   (DRIVERS.Status = 'EX'
   				         OR     DRIVERS.Status = 'FH'
   				         OR     DRIVERS.Status = 'DP'             
						 OR     DRIVERS.Status = 'NP'));

UPDATE	   [MYDB].[dbo].[MANIFESTS]
SET			Responses = 0
WHERE       Responses IS NULL;

************* 8 more like this  ****************************

-- ===================================================================================
-- Set BILLSOFLADING totals
-- ===================================================================================

UPDATE	   [MYDB].[dbo].[BILLSOFLADING]
SET			Requests = (SELECT SUM(Requests) FROM [MYDB].[dbo].[MANIFESTS]
			            WHERE  MANIFESTS.PackListID = BILLSOFLADING.PackListID
			            AND    MANIFESTS.FuelTypeID    = BILLSOFLADING.FuelTypeID
			            AND    MANIFESTS.RequestID    = BILLSOFLADING.RequestID);

UPDATE	   [MYDB].[dbo].[BILLSOFLADING]
SET			Responses = (SELECT SUM(Responses) FROM [MYDB].[dbo].[MANIFESTS]
			             WHERE  MANIFESTS.PackListID = BILLSOFLADING.PackListID
			             AND    MANIFESTS.FuelTypeID    = BILLSOFLADING.FuelTypeID
			             AND    MANIFESTS.RequestID    = BILLSOFLADING.RequestID);

**************** 9 more of these


-- ===================================================================================
-- Set TANKERS totals
-- ===================================================================================

UPDATE	   [MYDB].[dbo].[TANKERS]
SET			Requests = (SELECT SUM(Requests) FROM [MYDB].[dbo].[MANIFESTS]
			            WHERE  MANIFESTS.PackListID = TANKERS.PackListID
			            AND    MANIFESTS.TankerID    = TANKERS.TankerID);

UPDATE	   [MYDB].[dbo].[TANKERS]
SET			Responses = (SELECT SUM(Responses) FROM [MYDB].[dbo].[MANIFESTS]
			             WHERE  MANIFESTS.PackListID = TANKERS.PackListID
			             AND    MANIFESTS.TankerID    = TANKERS.TankerID);

************ 7 more of these


END
 
1. Does Count(*) slow everything down? Would it be better to count a single field?

No. But, you should be aware that there is a difference between Count(*) and Count(ColumnName). The Count aggregate function ignores null.

Ex:
Code:
Declare @Temp Table(ID Int)

Insert Into @Temp Values(1)
Insert Into @Temp Values(2)
Insert Into @Temp Values(NULL)

Select Count(*) As [Count_*],
       Count(1) As Count_1,
       Count(Id) As Count_ID
From @Temp

Because there is one NULL in the ID column, Count(Id) = 2 whereas the other columns return the value 3. Some people will argue that Count(*) is slower than Count(1). So, if you want a total count, go for the 1. You could also use a column that cannot contain a NULL, like a Primary Key, but the 1 works, so go with that. I wouldn't expect much difference from it though.

I mean... you posted here because the performance is bad. The difference between Count(*) and Count(1) will probably be negligible here.

2. Is it better to use CASE statements than going through the same table once for each field?

Heck yeah. That's where most of your performance gains are going to come from.

3. None of these tables are keyed. How much performance can I expect to get by keying the tables and indexing them appropriately?

A table without a primary key is called a "heap table". Actually, a table without a clustered index is considered a heap table, but SQL server (by default) creates a clustered index on the primary key, so it's essentially the same thing.

Regardless, I suggest you read this:



4. Any other suggestions?

I suggest you stop using sub-queries. Sub-queries are a performance killer.

Try this:

Code:
-- ===================================================================================
-- Set TANKERS sort fields
-- ===================================================================================

UPDATE [MYDB].[dbo].[TANKERS]
SET    SortTankerTaxID = Case When CorrectedTaxId <> ' ' 
                              Then CorrectedTaxId
                              Else TaxID End,
       SortTankerName  = Case When CorrectedName <> ' ' 
                              Then CorrectedName
                              Else Name End,
       SortTankerState = Case When CorrectedState <> ' '
                              Then CorrectedState
                              Else State End;

And this:

Code:
-- ===================================================================================
-- Set MANIFESTS tanker information
-- ===================================================================================

UPDATE [MYDB].[dbo].[MANIFESTS]
SET    [MANIFESTS].TankerName  = [TANKERS].SortTankerName,
       [MANIFESTS].TankerState = [TANKERS].SortTankerState
From   [MYDB].[dbo].[MANIFESTS]
       INNER JOIN [MYDB].[dbo].[TANKERS]
         ON MANIFESTS.TankerID = TANKERS.TankerID


*********** Same sort of thing in 3 other tables.


And also this:

Code:
-- ===================================================================================
-- Set DRIVERS totals
-- ===================================================================================

UPDATE [MYDB].[dbo].[DRIVERS]
SET    UCFCodesReceived = Coalesce(CountAlias.TotalCount, 0),
       RAUCFCodesReceived = Coalesce(CountAlias.CountRAUCFCodesReceived)
From   [MYDB].[dbo].[DRIVERS]
       LEFT JOIN (
         SELECT LOADS.PackListId,
                LOADS.FuelTypeId,
                LOADS.RequestId,
                LOADS.TankerID,
                LOADS.DriverId,
                COUNT(*) As TotalCount,
                Count(Case When LOADS.ICCCode <> ' ' Then 1 END) As CountRAUCFCodesReceived
         FROM   [MYDB].[dbo].[LOADS]
         Group By LOADS.PackListID,
                LOADS.FuelTypeID,
                LOADS.RequestID,
                LOADS.TankerID,
                LOADS.DriverID
         ) As CountAlias
         On  LOADS.PackListID = CountAlias.PackListID
         AND LOADS.FuelTypeID = CountAlias.FuelTypeID
         AND LOADS.RequestID  = CountAlias.RequestID
         AND LOADS.TankerID   = CountAlias.TankerID
         AND LOADS.DriverID   = CountAlias.DriverID

********** 4 more of these for other summary fields

By using derived tables (and avoiding sub queries) and combining multiple similar queries in to one, I would expect a drastic improvement.

You should create primary keys in each of the tables and you should also create composite indexes.

For example, looking at the previous query, we are joining tables on:
[tt]On LOADS.PackListID = CountAlias.PackListID
AND LOADS.FuelTypeID = CountAlias.FuelTypeID
AND LOADS.RequestID = CountAlias.RequestID
AND LOADS.TankerID = CountAlias.TankerID
AND LOADS.DriverID = CountAlias.DriverID[/tt]

So there should be an index that includes all 5 columns.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
George, this is exactly the sort of input I was hoping for. Thanks buddy, I knew you'd come through with the goods. I'll let you know how your suggestions work.
 
I'll let you know how your suggestions work.

I would appreciate that. It's always interesting to me when someone takes my advice and realizes a performance improvement. If you don't mind, I'm curious about the execution time before and after. I bet it will be rather dramatic.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Hi George,

I just wanted to update you on this project. Right after I got your very helpful information, I got put on a customer-facing (read billable) project that should take 2 or 3 weeks. If I can fit in a benchmark in the middle of that I will, but it might take that long to put your information to use. I will definitely follow through and publish the results here.
 
I understand, and I appreciate the heads-up. I hope things work out well for you.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Hi George,

I'm in the middle of setting this up, and have a quick question. Looking at your third example as a model: if there are two tables that are totals calculated, with their results placed into the same table, would it be better to do two update statements or have two left joins in the same update statement? Or does it matter? So should I do this:
Code:
UPDATE MUTUALTARGET 
SET BLAH = Alias1.STUFFFROMTABLE1,
    BLAH = Alias2.STUFFFROMTABLE2
LEFT JOIN (SELECT STUFF, MORESTUFF, ETC FROM TABLE1 GROUP BY ETC) AS Alias1
    ON BLAH = BLAH AND ETC
LEFT JOIN (SELECT STUFF, MORESTUFF, ETC FROM TABLE2 GROUP BY ETC) AS Alias2
    ON BLAH2 = BLAH2 AND ETC
or should I break up the two joins into two update statements?
 
It's hard to answer that. There are a lot of factors that would affect this. In a case like this, I would suggest that you try it both ways and see which is better.


-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Hi George,

I put in all the suggestions that you gave me. We're working with 1.8 million records in 5 tables, none of which even have a primary key, let alone an index. The original stored proc ran in 3:19, the new one in 1:00! I'd call that dramatic. Thanks again for your help.
 
You're welcome.

I suspect that indexes would take your 1 minute execution time down to sub-second execution time.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top