Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Weird Execution Plan in SQL 2000

Status
Not open for further replies.

ESquared

Programmer
Dec 23, 2003
6,129
US
I have a table with 62,037,506 million rows in it, call it BigTable. I am investigating a column, call it SequenceNumber. I just want to see some typical sequences. So I run a query:

Code:
select top 100 * from BigTable order by ParentID, SequenceNumber
This takes forever and I finally kill it. Weird, I think. Look at the indexes on this table:

[tt]BigTableAI01 - nonclustered - CustomerID
BigTableAI02 - nonclustered - ParentID, Active
BigTableAI03 - nonclustered - ValidValueID, Active, ParentID
BigTableCI - clustered - DataType, ParentID, ValidValueID, SequenceNumber
BigTablePK - nonclustered, unique, primary key - BigTableID[/tt]

What's the problem? Read the 02 index to get the top rows ordered by ParentID (how the sequencenumbers correlate), do a bookmark lookup to get the rows, and you're done.

But the estimated execution plan for the query looks like this:

[tt] |--Top(100)
|--Parallelism(Gather Streams, ORDER BY:([BigTable].[ParentID] ASC, [BigTable].[SequenceNumber] ASC))
|--Sort(TOP 100, ORDER BY:([BigTable].[ParentID] ASC, [BigTable].[SequenceNumber] ASC))
|--Clustered Index Scan(OBJECT:([REPO].[dbo].[BigTable].[BigTableCI]))[/tt]

Scanning all 62 million rows of the clustered index. Yuck.

Okay, so I try this:

Code:
select SequenceNumber, * from (
   select top 100 * from BigTable order by parentID
) x order by parentID, SequenceNumber
[tt] |--Sort(ORDER BY:([BigTable].[ParentID] ASC, [BigTable].[SequenceNumber] ASC))
|--Top(100)
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([REPO].[dbo].[BigTable]) WITH PREFETCH)
|--Index Scan(OBJECT:([REPO].[dbo].[BigTable].[BigTableAI02]), ORDERED FORWARD)[/tt]

Ah... 100 rows on that index scan. That's what I was expecting in the first place. I know the last ParentID may not have all its associated seqnums in order, but who cares. That seems a very minor detail that the engine could correct. Something like, get 100 rows plus any more that share the same ParentID of the last row, then sort with SequenceNumber, and throw away the extras.

So I thought I'd try to force the engine to use that index I want:

Code:
select top 100 * from BigTable WITH (INDEX (BigTableAI02))
order by parentID, SequenceNumber
[tt] |--Top(100)
|--Parallelism(Gather Streams, ORDER BY:([BigTable].[ParentID] ASC, [BigTable].[SequenceNumber] ASC))
|--Sort(TOP 100, ORDER BY:([BigTable].[ParentID] ASC, [BigTable].[SequenceNumber] ASC))
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([REPO].[dbo].[BigTable]) WITH PREFETCH)
|--Index Scan(OBJECT:([REPO].[dbo].[BigTable].[BigTableAI02]))
[/tt]
But that index scan is reading all 62 million rows, not just the 100 of the previous query!

Can anyone explain this? And can you give me a query that performs like the derived table one, but doesn't require a derived table?
 
I wish I could help. Maybe a bump is all you need.

-George

"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I don't think without having an index with ParentID ASC and SequenceNumber ASC you will get any performance out of it. The fact that SequenceNumber is in there to me says
BigTableAI02 - nonclustered - ParentID, Active
will kill you and just never get what you could out of an index specifically for your task.

That is just what I would do without really spending a lot of time banging my head on it. If I had those requirements I'd create the index before really ever running the statement

[sub]____________ signature below ______________
Backups are for sissies!!!!
coming to your keyboards soon[/sub]
 
Why do you think it is weird?
Your order by clause is not covered by an index

SequenceNumber is the 4th column in the BigTableCI index so the optimizer has to scan the whole index to do the ordering

it also doesn't help that ParentID is the 2nd column in that index

for example if your index is ix_1(ParentID ,SequenceNumber )

and your where clause is
where SequenceNumber =1
you will get a scan

but if your index was
ix_1(SequenceNumber,ParentID )

then you would get a seek





Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
I'll have to dito what Denis and onpnt have posted.

It's just shorter that way then retyping it all out.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
Just to be clear, I know that the "answer" is that SQL Server's query engine requires an index to cover the order by clause in order to use it properly.

What I'm saying is, it seems like it doesn't have to be that way, since I can modify the query to get the execution plan I want (one that gives the exact same results but WAY faster). If I can do that, why can't Microsoft?

So why does it pick a full index scan for no reason? Can you think of another way to get it to use the more efficient execution plan?

Look:

Code:
if object_id('child') is not null drop table child
if object_id('parent') is not null drop table parent
go
create table parent(i int not null primary key clustered)
create table child (
   childid int identity(1,1) primary key nonclustered,
   parentid int not null foreign key references parent(i),
   seqnum int not null default(1),
   active bit not null default(1),
   datatype int not null,
   otherintcol int not null,
   padding varchar(7000) null,
)

insert parent select num from numbers where num <= 500

insert child (parentid, seqnum, active, datatype, otherintcol, padding)
select p.i, n.num, 1, 501 - n.num, 100 - p.i % 100, left(replicate(p.i, 7000), 7000)
from
   parent p
   inner join numbers n on n.num <= p.i

create unique clustered index ix_child_clustered on child(datatype, parentid, otherintcol, seqnum)
create nonclustered index ix_child_noncover_orderby on child(parentid, active)

-- turn on show execution plan for these now
select top 100 * from child order by parentid

select top 100 * from child order by parentid, seqnum

select top 100 * from (
   select top 100 with ties *
   from child
   order by parentid
) x
order by parentid, seqnum

See that the third query's execution plan is very similar to the first query's execution plan, but gives the same results as the second. The "with ties" is the essential piece that makes it all work.
 
We all know that SQL doesn't always pick the right execution plan.

Stats could be out of wack, stats sampling ratio may not be optomial, etc.

I have a query which SQL loves to generate a bogus plan for. I ended up having to put the plan manually into the database query to ensure that the correct plan was used. Because of it SQL shows my page life expectancy is crap when it's actually not, which is why SQL was selecting the wrong plan. Fornitually that system is SQL 2005 so I have that as an option.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
Hmm I didn't know the page life expectancy performance counter. Would you explain more about how your ple is showing as poor when it shouldn't be, and why?

And also back to my question, do you see what I'm talking about? An execution plan that scans 16 million rows instead of a couple of hundred is just plain lunacy.
 
According to perfmon my page life expectancy is 0 seconds. But when I query the buffer cache DMVs the bulk of the data is in there statically. From what I've been able to figure out the page life is 0 because the data pages for this table (~300 M records) can't fit into the buffer (only have 8 Gigs at the moment, tried upgrading to 16, but the server crashed after the upgrade). All the other objects are staying in the cache for a very long time. Our data reads are fairly low, but I'll live with the 0 second life considering that the query takes less than 2 seconds now, but was taking hours with the SQL generated plan.

What do your stats look like? What percentage of data is coming into the table compared to the current total number of records in the table?

When you look at the stats what does the spread of records per interval look like? You may need to adjust some of the statistics settings to get SQL to behave properly.

Don't forget, it's a computer. It's basically stupid. All it knows is what the statistics tell it. If they are giving it sequed (sp) information then it's query plan will be wildly incorrect.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
My little test code I created above gives the exact same results. You can see them yourself, just copy and paste and run it. Do you think it's truly statistics problems that are going on? I have the feeling that it's not statistics.
 
It's probably not stats, since the tables are small in the example.

I'd submit it to connect and see what they say.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
run this for a second

Code:
select top 100 parentid, seqnum from child 
order by parentid, seqnum

see how fast that is? do you need all the other columns?

run these two queries, one with, one without the padding column

Code:
select top 100 parentid, seqnum, active, datatype, otherintcol from child 
order by parentid, seqnum
see how fast that is

Code:
select top 100 parentid, seqnum, active, datatype, otherintcol,padding from child 
order by parentid, seqnum
this one will take forever


both queries produce a scan but the index that the optimizer picked is different

Code:
  |--Top(100)
       |--Parallelism(Gather Streams, ORDER BY:([child].[parentid] ASC, [child].[seqnum] ASC))
            |--Sort(TOP 100, ORDER BY:([child].[parentid] ASC, [child].[seqnum] ASC))
                 |--Index Scan(OBJECT:([Eric].[dbo].[child].[ix_child_noncover_orderby]))


  |--Top(100)
       |--Parallelism(Gather Streams, ORDER BY:([child].[parentid] ASC, [child].[seqnum] ASC))
            |--Sort(TOP 100, ORDER BY:([child].[parentid] ASC, [child].[seqnum] ASC))
                 |--Clustered Index Scan(OBJECT:([Eric].[dbo].[child].[ix_child_clustered]))
as you can see adding the padding column makes all the difference


>>since I can modify the query to get the execution plan I want (one that gives the exact same results but WAY faster). If I can do that, why can't Microsoft?

because what is faster for one query might be slower for other, remeber plans are cost based not time based

Denis The SQL Menace
--------------------
SQL Server Code,Tips and Tricks, Performance Tuning
SQLBlog.com, Google Interview Questions
 
Denis,

Thanks for posting.

I did start thinking yesterday about the * vs returning only a few columns. But yes, I needed the *, this is not a production query but for me to understand the structure and data of a table. So in that sense the execution plan doesn't matter, but simply opened a can of worms I wanted to understand better.

Okay I see what you're talking about, how the optimizer, when not confronted with the padding column, chooses a better execution plan. But the difference is not just the index chosen, it's also the use of that index. Compare your fast query with a new one:

Code:
--your no padding query
select top 100 parentid, seqnum, active, datatype, otherintcol from child 
order by parentid, seqnum

  |--Sort(TOP 100, ORDER BY:([child].[parentid] ASC, [child].[seqnum] ASC))
       |--Index Scan(OBJECT:([Erik].[dbo].[child].[ix_child_noncover_orderby]))

--new query with padding, forced to use same index
select top 100 parentid, seqnum, active, datatype, otherintcol,padding from child with(index(ix_child_noncover_orderby))
order by parentid, seqnum

  |--Sort(TOP 100, ORDER BY:([child].[parentid] ASC, [child].[seqnum] ASC))
       |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([Erik].[dbo].[child]) WITH PREFETCH)
            |--Index Scan(OBJECT:([Erik].[dbo].[child].[ix_child_noncover_orderby]))
The second query goes away and "never" comes back (I stop the query after sql server has consumed 700 MB of memory). I presume because it is doing a bookmark lookup for every row, THEN performing top on it. I can see why that would be expensive. So don't do that. Get enough rows that you'll satisfy the TOP predicate for sure (as my WITH TIES query does) THEN do a bookmark lookup for just those rows. Don't do a bookmark lookup on the entire table.

As for your last comment, doesn't scanning 62 million rows cost more than scanning 200 rows, not even considering execution time?

Also, Alex pointed out something interesting. Since a table scan can be more efficient than a seek when pulling back a certain number of rows, perhaps the optimizer is choosing that because of my TOP 100 and a smaller amount would change it. I'll try to look into that later.
 
ESquared,
I wanted to let you know that I'm swipping your sample code above for use in a talk on query tunning that I'm giving at the Microsoft Code Camp next weekend.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
Denny, Do you know if the lecture will be recorded?

[sub]____________ signature below ______________
Backups are for sissies!!!!
coming to your keyboards soon[/sub]
 
That's very interesting! Please let us know if anything interesting comes from it.
 
I am not aware if it is or not. It's the free So Cal code camp that is being put on here, so I'm thinking probably not.
<Shameless plug>
There's a link and details on my blog.
</Shameless plug>

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
Just for you guys I'm working with Quest Software to see if they can get a camera crew out there to tape the sessions (I've done some writing for them in the past so I've got a good rep with them). If they can find an available camera guy and sound guy the'll send them and record my sessions and post them up on thier site.

Stupid me didn't think about having them record it sooner, but hey it's been a busy week.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
Quest couldn't find a camera crew on such short notice. They've asked me to come in sometime in the next month or two in order to record them in there office instead. I've published my slide deck and sample code for all four sessions on my blog. Feel free to grab them.

Denny
MCSA (2003) / MCDBA (SQL 2000)
MCTS (SQL 2005 / Microsoft Windows SharePoint Services 3.0: Configuration / Microsoft Office SharePoint Server 2007: Configuration)
MCITP Database Administrator (SQL 2005) / Database Developer (SQL 2005)

My Blog
 
Denny,

I'm flattered that you used this discussion in yours.

Erik
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top