Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Why the "WITH" Format 1

Status
Not open for further replies.

grnzbra

Programmer
Mar 12, 2002
1,273
US
I have a SQL script which contains an insert statement in the following format:
Code:
	;WITH T1 AS
	(
		SELECT DISTINCT
			a bunch of fields
		FROM dbo.tblTable1 t1
		INNER JOIN dbo.tblTable t2
		ON t1.fldT1Key = t2.fldT1Key		
	)
	INSERT INTO dbo.tblTable3
	SELECT 
		a bunch of fields
	FROM T1
What is the reason/advantage of formatting an INSERT statement in this manner?
 
Just guessing here, I am sure the experts may have a better explanation.
It may be simply a personal preference.

Some info from here:
WITH said:
The syntax after the keyword [tt]WITH[/tt] is the same as it is for [tt]create[/tt] [a temporary] [tt]view[/tt]

You may get the same result with this approach:

[pre]
INSERT INTO dbo.tblTable3
SELECT DISTINCT
a bunch of fields
FROM dbo.tblTable1 t1
INNER JOIN dbo.tblTable t2
ON t1.fldT1Key = t2.fldT1Key
[/pre]

---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
So it seems like it is the equivalent of creating and loading a temporary table variable then using it as a source to load a real table. Does that seem about right?
 
That's how it looks to me.

In the essence, I see:

[pre]
WITH [red]T1[/red] AS
(
SELECT DISTINCT[blue]
a bunch of fields
including a kitchen sink[/blue]
FROM dbo.tblTable1 t1
INNER JOIN dbo.tblTable t2
ON t1.fldT1Key = t2.fldT1Key
)
INSERT INTO dbo.tblTable3
SELECT [blue]
just fields matching tblTable3 fields[/blue]
FROM [red]T1 [/red]
[/pre]

---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
In my view, there are several advantages of using a CTE (common table expression--that's what the WITH creates) rather than doing the data collection in a subquery of the insert:

1) Readability--you separate collecting the data you need from inserting it.
2) Allows more complex things--if you need a multi-step process that might involve a bunch of queries to get this data, CTEs are more flexible (and more readable--see #1).
3) Consistency--if you're using CTEs widely, then using them here makes sense.

In this case, if it's really just a simple join, there's not much to be gained. But as soon as things get complicated, the CTE shines.

Tamar
 
I guess, whoever wrote this statement above preferred that method.
Maybe that’s all what they knew (if all what you have is a hammer, everything looks like a nail [lol] ), or wanted to ‘show off’ their knowledge... [ponder]

---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
Andy and Tamar are correct, the CTE is unnecessary in this case.

The only thing I want to add is that the SQL Query optimizer will create exactly the same query plan either way, so there's nothing to be gained by changing it, either.

If I was writing this query, I would not have used a CTE because I think it is more difficult to read. If I saw this query written by someone else, and I knew it was fully tested and worked as desired, I would leave it alone.

Just my 2 cents.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
Thanks for the clarifications, everyone. Actually, in the script I'm dealing with there are seven tables involved with two being the same table joined on nine fields. So I guess whomever wrote this knew this would be more efficient.
At what number of tables does this begin to have a significant advantage?
 
It's not about the number of tables. It's about the complexity of what you're doing. With CTEs, you can easily nest stuff and, most importantly, if there's an intermediate table you need in multiple subsequent processes, you can have the query there only once.

Here's an example of a query (using AdventureWorks) where one CTE is used in multiple subsequent CTEs:

[pre]WITH SalesByYear (SalesPersonID, SalesYear, SubTotal)
AS
(SELECT SalesPersonID, YEAR(OrderDate), SubTotal
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL),

SalesTotal
AS
(SELECT SalesPersonID, [2011] AS Total2011, [2012] AS Total2012, [2013] AS Total2013, [2014] AS Total2014
FROM SalesByYear
PIVOT(SUM(SubTotal)
FOR SalesYear IN ([2011], [2012], [2013], [2014])) AS TotalSales),

SalesCount
AS
(SELECT SalesPersonID, [2011] AS Count2011, [2012] AS Count2012, [2013] AS Count2013, [2014] AS Count2014
FROM SalesByYear
PIVOT(COUNT(SubTotal)
FOR SalesYear IN ([2011], [2012], [2013], [2014])) AS Sales)

SELECT ST.SalesPersonID,
SC.Count2011, ST.Total2011,
SC.Count2012, ST.Total2012,
SC.Count2013, ST.Total2013,
SC.Count2014, ST.Total2014
FROM SalesTotal ST
JOIN SalesCount SC
ON ST.SalesPersonID = SC.SalesPersonID
ORDER BY ST.SalesPersonID;[/pre]


Tamar
 
two being the same table joined on nine fields" - sounds like my world for last 20+ years [banghead]

---- Andy

"Hmm...they have the internet on computers now"--Homer Simpson
 
As Tamar said, CTE allow you to build a complex queries step by step from simple queries. You can test the individual queries separately to see whether they deliver the desired results.
It's like a bottom-up approach in the programming: building a program from individual, separately tested functions.
IMHO, the CTE make the creation of complicated queries much easier and make the SQL code more readable and testable.
 
Thanks for the clarification. I didn't realize that the object created by the WITH statement persisted; I thought it would be a one-time deal.

The reason for the same table joined to itself on nine fields is an attempt to determine if a particular item that is listed in one office also exists in a second office by matching on nine characteristics of that item.
 
It is a one-time thing, except that it exists for the rest of that whole query constellation. So in the example I posted, we create SalesByYear first and then both SalesTotal and SalesCount can use it.

Without the CTE, you'd have to put the SalesByYear subquery in both the SalesTotal and SalesCount subqueries. That leads to the possibility of changing one, but not the other, and not getting the results you expect.

Tamar
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top