
runstats slows down query

Status
Not open for further replies.

nukular

Technical User
Sep 26, 2004
7
CH
Hi everyone,

I'm working on optimizing database queries.

My test DB has the following tables:
create table r (a int, b int);
create table s (a int, d int, e int);
create table t (d int, f int);

Each table is filled with 1 million rows of pseudorandom numbers.

my query is as follows:
select *
from r,s,t
where r.a=s.a and s.d=t.d;

With no optimization the query takes about 30 seconds.
After I created an index on r(a) and ran RUNSTATS on r, the explain plan reports a cost about a factor of 200,000 lower in timerons. The query, however, now takes around 8 minutes to return a result.

Does anybody know what I'm doing wrong, or what the problem is?
Thanks
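The timeron estimates mentioned here come from DB2's explain facility. A minimal sketch of capturing them on DB2 LUW, assuming the explain tables have already been created (e.g. from the EXPLAIN.DDL script shipped with the product):

```sql
-- Populate the explain tables with the plan for the test query
-- (assumes the explain tables already exist in the current schema):
EXPLAIN PLAN FOR
SELECT *
FROM r, s, t
WHERE r.a = s.a AND s.d = t.d;

-- The estimated cost in timerons of the most recently chosen plan:
SELECT total_cost
FROM explain_statement
WHERE explain_level = 'P'
ORDER BY explain_time DESC
FETCH FIRST 1 ROW ONLY;
```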
 
Just running statistics on one of the participating tables may be the cause. I'd suggest creating statistics for all tables and then re-testing.

(I once ran statistics on a database that was supposed to support an application WITHOUT keeping statistics up to date. Lots of processes nearly ground to a halt, and I got a big reprimand on that occasion.)

Ties Blom
Information analyst
tbl@shimano-eu.com
 
Hi All,
Ties is partially correct. You do need to do a RUNSTATS for each table, but with DB2 you also need to do a RUNSTATS for each index. Try that, run your test again, and please report back the results.
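On DB2 LUW, statistics for a table and all of its indexes can be gathered in one pass; a minimal sketch (the schema name is a placeholder):

```sql
-- Gather table, distribution, and index statistics in one pass
-- ("myschema" is a placeholder schema name):
RUNSTATS ON TABLE myschema.r WITH DISTRIBUTION AND DETAILED INDEXES ALL;
RUNSTATS ON TABLE myschema.s WITH DISTRIBUTION AND DETAILED INDEXES ALL;
RUNSTATS ON TABLE myschema.t WITH DISTRIBUTION AND DETAILED INDEXES ALL;
```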

Then I'd like you to try another test or two.

Drop the index on table R [and any other indexes].

Add these three indexes:

Create index rx1 on r(a,b);
Create index sx1 on s(a,d,e);
Create index tx1 on t(d,f);

Run the test and please report back.

Then do RUNSTATS on all three indexes, run the test and again report back.

Finally, create the following index:
Create index sx2 on s(d,a,e);

Run RUNSTATS on this index and run the test again [this may or may not run faster].

The DB2 Optimizer is very powerful and complex. It is best if you read the ADMINISTRATION GUIDE for the platform and version you are using. Look in the Performance section.

If your firm works with million-row [plus] tables and runs thousands of transactions, queries and reports per hour, then you may want to attend a class that I teach, "Cost-Saving Database Index Design" [in the USA see DBIndexDesign.com; in Europe see Tapio1 dot com]. At my last class a DBA was trying to design indexes for a star schema with a 3 BILLION row fact table.

Also, "nukular", could you post the code that you used to generate the rows with pseudo-random numbers? It would be helpful to others.

Larry Kintisch 9/27/04

IBM Contractor Instructor for DB2, SQL, QMF, Data modeling; [DB2 for z/OS]
 
Yes Larry, you are right.

There is an excellent introduction to generating random numbers in Graeme Birchall's DB2 cookbooks, involving the rand() and generate_unique() functions.

Be aware, however, that TRULY random output is one of the hardest problems mathematics has faced...



Ties Blom
Information analyst
tbl@shimano-eu.com
 
thanks for your answers.

Larry: I tried what you suggested. I'm afraid the results weren't very promising; on the contrary, the queries sometimes took even longer to produce results.
But thank you anyway.

I then looked at the sizes of the tables and indexes and realized that the indexes need more space than the actual tables. Is it possible that the optimizer doesn't realize that, uses the index anyway, and as a result takes longer to deliver the answer? I can't think of any other explanation for this phenomenon.

Your question regarding the random numbers: we wrote a Java program which generates random numbers and inserts them into the database through JDBC.
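For readers without that Java program at hand, here is a rough, self-contained sketch of the same idea in Python, using the stdlib sqlite3 module as a stand-in for DB2 and JDBC (the row count, value range, and seed are scaled-down assumptions, not the original test setup):

```python
import random
import sqlite3

def fill_tables(conn, rows=1000, max_val=1_000_000, seed=42):
    """Fill r, s, t with pseudorandom integers, mimicking the test setup."""
    rng = random.Random(seed)
    cur = conn.cursor()
    cur.execute("CREATE TABLE r (a INT, b INT)")
    cur.execute("CREATE TABLE s (a INT, d INT, e INT)")
    cur.execute("CREATE TABLE t (d INT, f INT)")
    cur.executemany("INSERT INTO r VALUES (?, ?)",
                    ((rng.randrange(max_val), rng.randrange(max_val))
                     for _ in range(rows)))
    cur.executemany("INSERT INTO s VALUES (?, ?, ?)",
                    ((rng.randrange(max_val), rng.randrange(max_val),
                      rng.randrange(max_val)) for _ in range(rows)))
    cur.executemany("INSERT INTO t VALUES (?, ?)",
                    ((rng.randrange(max_val), rng.randrange(max_val))
                     for _ in range(rows)))
    conn.commit()

conn = sqlite3.connect(":memory:")
fill_tables(conn)
print(conn.execute("SELECT COUNT(*) FROM r").fetchone()[0])
```

With JDBC the structure is the same: batched INSERTs from a seeded java.util.Random, committed once at the end.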

Lukas Felder
 
Indexes are useful if you need only part of the table. However, if the optimizer is steered toward using indexes and then still has to access the whole table, it may take more time overall than plain full table scans.

A real-life query will rarely perform a "SELECT * FROM ..." without conditions; it will instead use a WHERE clause to select certain ranges or single values.

Did you test queries as such?

Ties Blom
Information analyst
tbl@shimano-eu.com
 
No

The query I mentioned above is the only one I've run so far.

I reconsidered what I wrote earlier, and I'm no longer convinced by it. I don't believe anymore that the query takes longer because the indexes are larger; it only started taking longer once I used RUNSTATS. If I hadn't used it, I wouldn't be having this problem.

So I haven't had any luck so far and would still be very grateful for help.

Lukas Felder
 
Hmmm,

Did you alter the dft_queryopt setting?

This defaults to 5 for mixed database use. Setting it to 1 allows the optimizer only a minimum of time.
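On DB2 LUW this is a database configuration parameter; a sketch of lowering it (the database name is a placeholder):

```sql
-- Lower the default optimization class for the whole database
-- ("testdb" is a placeholder; takes effect for new connections):
UPDATE DATABASE CONFIGURATION FOR testdb USING DFT_QUERYOPT 1;
```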

Larry may have better suggestions, I guess, as an index expert (like getting rid of the statistics again).

Ties Blom
Information analyst
tbl@shimano-eu.com
 
No, I haven't tried altering that table.

I was told that the optimizer miscalculates and that I should look at the system tables. Which system tables do I have to look at to find a solution?

Lukas Felder
 
Well,

dft_queryopt is a database setting, not a table one...

I don't know off the top of my head which system tables are involved, but RUNSTATS issues SQL against the tables. Capture that SQL and you will know where to look.

I usually use the free DB2 monitor tool from Dmitri Liakh to capture running SQL statements:


It may no longer be free; check it out...

Ties Blom
Information analyst
tbl@shimano-eu.com
 
Lukas,
Here's my guess. Before the RUNSTATS, DB2 probably saw "-1" values in the catalog statistics columns, assumed smallish tables, and would do a few table scans. You were using short rows, so at roughly 250 rows per page you had about 4000 pages each. A modern server could probably read those pages sequentially in about 4 seconds or less! So the three table scans would take about 12 seconds [of the 30 sec. you started with] to read the pages into the bufferpool [main memory, "B-P"], plus at least one and maybe two workfiles would be created that "might" have required writing or reading. A very smart DB2 would create those small workfiles in the bufferpool. Once in the B-P, all join operations would be performed in memory at microsecond speed, perhaps sorting, perhaps looping. Your explain tells you how the join worked.
Timerons are just guesses for each possible access path, based on the optimizer's assumptions about number of rows, number of values, and clustering. The path with the fewest timerons ["cost"] is chosen.
The cost of using your R index after RUNSTATS was that the optimizer used it! Row one in the R index had, say, the random key "0002", and DB2 did a random read [about .010 seconds, 10 ms] into the page of table R to find the value of column B in that row, say "2034".
One path choice, "I", follows. With no index on table S, it started the sequential reading of all pages into the bufferpool [4 seconds, as above] looking for the value "2034". The second row of index R produced, say, "0004", and it repeated the scan of table S [now in memory]: 1,000,000 rows times .00000001 sec./row [.010 microseconds] is about .01 seconds per scan. Eventually all the random reads of table R [4000 of them times .010 sec] were finished: 40 seconds. But the millions of sequential scans of table S [and later table T] would be very costly [perhaps over two hours].
A different path choice, "II", is this: DB2 probably sorted table R into column A sequence and table S into A sequence [maybe 8 seconds each], did the join into a temporary workfile, sorted that into D sequence, read and sorted T into D sequence [say another 8 seconds], and joined into another workfile, producing the unsorted result. That was your original 30 seconds.
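As a sanity check, the back-of-envelope figures above can be reproduced in a few lines; every constant is an assumption quoted from the text, not a measurement:

```python
# Back-of-envelope check of the figures above; all constants are the
# assumptions stated in the text, not measurements.
ROWS = 1_000_000
ROWS_PER_PAGE = 250          # short rows
RANDOM_READ_S = 0.010        # ~10 ms per random page read
SEQ_READ_PAGES_PER_S = 1000  # ~4000 pages sequentially in ~4 s
INMEM_SCAN_S_PER_ROW = 1e-8  # ~.010 us per row once pages are in the bufferpool

pages = ROWS // ROWS_PER_PAGE                 # ~4000 pages per table
seq_scan_s = pages / SEQ_READ_PAGES_PER_S     # ~4 s per sequential table scan
random_reads_s = pages * RANDOM_READ_S        # ~40 s of random probes into R
inmem_scan_s = ROWS * INMEM_SCAN_S_PER_ROW    # ~.01 s per in-memory scan of S
repeated_scans_h = ROWS * inmem_scan_s / 3600 # hours if repeated per outer row

print(pages, seq_scan_s, random_reads_s, round(repeated_scans_h, 1))
```

The last figure comes out near 2.8 hours, matching the "perhaps over two hours" estimate for path "I".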

The actual path choice is some combination of the two, hence 480 seconds [8 minutes]. Other things may be happening: your bufferpools may not be big enough, so the virtual-memory B-P pages are being swapped in and out, or the sort heaps may not be big enough.

The indexes I suggested for you should not have needed to access the tables ["index only" access], but randomly probing the index was probably still too costly. Try all of the alternatives with a WHERE clause, e.g. WHERE R.A < 1000 AND T.D = 12345, and compare the timings.

Tuning a production system is quite an art!

Last comment, per Ties' remark about what is "usually done" in a join. In a record store, if the tables were Records, Songs and Tracks [R, S, T], the relationships might be: 15,000 R, each with about 20 T, and each T is one of about 25,000 S. The join might be:
SELECT R.RKEY, R.RTITLE, T.TTRK, T.TARTIST, T.TLENGTH, S.STITLE
FROM S INNER JOIN T ON S.SKEY = T.TSKEY
INNER JOIN R ON T.RKEY = R.RKEY
WHERE S.STITLE = 'STRAWBERRY FIELDS FOREVER'
ORDER BY T.TARTIST, R.RTITLE

With proper indexes and a few hundred table and index reads the query could return the sorted results in a few seconds.

If there were only one row matching, and if the indexes were partly in the B-P, it could take about 10 ms times 3 random reads [one for each index; no table reads], or about .030 seconds! This is what is taught in the "Cost-Saving Database Index Design" class I teach in the USA and Tapio teaches in Europe.

Larry Kintisch

IBM Contractor Instructor for DB2, SQL, QMF, Data modeling; [DB2 for z/OS]
 
I checked everything I could, and I now think I know why the query suddenly took so long.

Before I created any indexes, the optimizer used a NLJOIN for R and S, and a MSJOIN for the result of that NLJOIN and T. For these joins it sorted S on D and didn't have to sort the result again to merge-join it with T.
After I created the statistics on the tables and indexes, that changed. R and S were still joined with a nested loop join, with the difference that S was now sorted on A, and therefore the result of that join had to be sorted on D later for the MSJOIN with T. And this is where the optimizer made a false assumption: it estimated the NLJOIN without statistics at 5,000,000,000 timerons and the one with statistics at 170,000, so of course it chose the second version. But when you actually execute the query, that estimate doesn't hold: the two sorts must take much longer than the first version did.

Having found this, my question is: what are the optimizer's criteria for choosing which attribute to sort on?

Hope you can help me.
Thanks anyway.

Lukas Felder
 
Lukas,
The development of the DB2 optimizers [z/OS, Win-Unix, AS/400] has totaled thousands of person-years. They are cost-based, so part of the "smarts" of the optimizer is estimating the percentage of table rows qualifying, so that it can estimate how much "work" it will have to do for each practical path with each method.
What I tried to show in my last comment is that this is VERY complex. Hats off to the IBM development teams! Using the estimating techniques that Tapio and I teach, you can see [especially with real WHERE clauses] how long access via column A or column D would take; in fact, CPUs with large enough bufferpools and "sort pools" do their sorting so fast that these techniques ignore the cost of sorting.
For more information about DB2, and to search for other similar questions and answers actively watched by a very large worldwide group [IDUG], see DB2-L. You can sign up to receive 50 e-mails a day! Or just read the threads or search the archives. I know that IBM lab developers actively participate in "DB2-L".
Also see IBM's developerWorks:
Larry Kintisch
IBM Contractor Instructor for DB2, SQL, QMF, Data modeling; [DB2 for z/OS]
 
I've finally solved my problem.

If I change the optimization level to 0, the query finishes in 8 seconds.
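For the record, the optimization class can also be lowered for a single session rather than the whole database; a sketch:

```sql
-- Class 0 restricts the optimizer to minimal optimization for this session:
SET CURRENT QUERY OPTIMIZATION = 0;
```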

Thanks everyone for the help.

Lukas Felder
 