Queries with SELF join is slow,though the PLAN is very good 1

Db2curiosity · Apr 21, 2005

Hi,
I wrote last week about some costly queries and kind of tuned them by adding a NEW index with all your valuable help.

Looks like the story didn't end here. There are another set of queries [which is having SELF JOIN on a 25 MM table]Its picking up the NEW index,showing improvement in Explain Plan and Cost but the actual time taken is almost the same.

I'm giving the tables,query,plan cost and index structure that its picking up.

Please let me know if you have any new idea of tuning this specific set of queries. Any change that can speed up.

Table X has 25 MM rows (b and b1 are alias of X)
Table Y has 75k

select a.SOURCE_SYSTEM,a.COMPANY_CODE,a.SEPARATE_ACCT,a.MARGIN_CODE,a.FUND_NUMBER,a.VALUATION_DT,
a.PROCESS_DT,a.AGREEMENT_ID,a.FINCL_ASSET_ID,b.ACCOUNT_ID,b1.ACCOUNT_ID,'5',a.PLAN_NUM,
a.PART_NUM,a.FUND_SRC,a.FUND_IV,a.char_AGMT_ID,a.char_FINCL_ID,a.HIDDEN_FL,
char(a.VALUATION_DT)
from Y a,X b,X b1
Where b.SOURCE_SYSTEM_nm='DOUG'
and a.CHAR_AGMT_ID=b.LOAD_PROCESS_KEY_PART_01
and a.CHAR_FINCL_ID<>b.LOAD_PROCESS_KEY_PART_02
and a.FUND_IV=b.BUSINESS_KEY_PART_04
and char(a.VALUATION_DT)>=b.BUSINESS_KEY_PART_08
and a.VALUATION_DT<=coalesce(b.DELETE_DT,'9999-12-31')
and b1.SOURCE_SYSTEM_nm='DOUG'
and a.CHAR_AGMT_ID=b1.LOAD_PROCESS_KEY_PART_01
and a.CHAR_FINCL_ID=b1.LOAD_PROCESS_KEY_PART_02
and a.FUND_IV=b1.BUSINESS_KEY_PART_04
and (b.DELETE_DT=date(b1.BUSINESS_KEY_PART_08) or
(b.DELETE_DT + 1 day)=date(b1.BUSINESS_KEY_PART_08))

CREATE INDEX DB2DBA.NEW_IDX_ON_X
ON DB2DBA.X
(SOURCE_SYSTEM_NM ASC,
LOAD_PROCESS_KEY_PART_01 ASC,
LOAD_PROCESS_KEY_PART_02 ASC,
BUSINESS_KEY_PART_04 ASC,
BUSINESS_KEY_PART_08 ASC,
ACCOUNT_ID ASC,
DELETE_DT ASC
)

The plan & cost :-

Step Operation Object name Object Cost
---- ---------------------------- ------------------- ------------
1 Index Scan (No Table Access) DB2DBA.NEW_IDX_ON_X Index 220347.13
2 Table Queue 223945.03
3 Index Scan (No Table Access) DB2DBA.NEW_IDX_ON_X Index 220347.13
4 Table Queue 224524.34
5 SHIP GLOB.Y NK 4008.26
6 Sort 4156.59
7 Sorted Table Data Scan 4165.79
8 Filter Rows 4195.11
9 Merge Scan Join 245020.47
10 Temporary Table Construction 245020.48
11 Temp Table Data Scan 245020.53
12 Nested Loop Join 524686.13
13 Data Operation Complete 533043.44

By doing some minor tweaks the last 3 steps can be altered to the following,but is that of any help ? I mean i'm not very sure if HASH JOIN is better in DB2 or NESTED LOOP JOIN

10 Sort 245020.41
11 Sorted Table Data Scan 245020.41
12 Hash Join 1095881.00

INSTEAD of

11 Temp Table Data Scan 245020.53
12 Nested Loop Join 524686.13
13 Data Operation Complete 533043.44

MarcLodge · Apr 21, 2005

Hi Curious,
a couple of questions:

Can you see from the explain how many of the index fields in b and b1 are actually being used?

You are coalesce-ing on DELETE_DT in B to give the date or '9999-12-31' if DELETE_DT is null. Fine, but then you use B.DELETE_DT to join to B1 on BUSINESS_KEY_PART_08. Doesn't that mean that you are only interested in a non null DELETE_DT? If so, try removing the coalesce and replacing with DELETE_DT not null.

Is the BUSINESS_KEY_PART_08 always going to contain a date? If so can it be defined as a date field? This would speed things up substantially as every time DB2 obtains this column it is having to apply the DATE function to it. What happens by the way if the column does not contain a date? Do you get a -180?

Sorry for the questions, but hopefully they might move us along the path a little bit.

Marc

Mercury2 · Apr 22, 2005

Could JOIN be better than WHERE predicate to speed up things ?

Db2curiosity · Apr 22, 2005

Marc,

yeah the first one is a great suggestion,though the COST
in the db2expln plan doesn't change that much , i can see
improvements while actually running the altered query.

BUSS_KEY_PART_08 may not contain date always..no we dont get -180, what actually u are meaning by that ?

Thanks,

MarcLodge · Apr 22, 2005

Hi Curiosity,
Normally if you run a DATE function against a non-date, you get an error. In the mainframe environment where I use DB2 this error is reflected in an SQLCODE of -180, hence my odd sounding question.

In looking at the query from a performance angle, I noticed that you were asking DB2 to perform a DATE function against every occurrence of B1.BUSINESS_KEY_PART_08 that it interrogated, which led me to wonder if this always contained a date (if it did, I was going to suggest changing the format to DATE). This line of thought led me to question what DB2 would do if it encountered a non date in the column it is attempting to run the DATE function against. I would have thought that it might complain and possibly error, but if you're having no problems in this respect, then please ignore my hypothesising.

Can't think of anything else that you could do to make this run any quicker.

Marc

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Queries with SELF join is slow,though the PLAN is very good 1

Db2curiosity

Programmer

MarcLodge

Programmer

Mercury2

Technical User

Db2curiosity

Programmer

MarcLodge

Programmer

Similar threads

Part and Inventory Search

Sponsor