Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

how to optimize query 1

Status
Not open for further replies.

josephu12

Programmer
Nov 27, 2007
4
I have a query that works just fine in our test environment. I pointed to our production side though and it takes 517 seconds to execute. This will not work well for a Web Application don't you agree?

It seems that the main problem is in one of the tables, on test it only has about 50,000 records, in production it has well over a million. All joins are on indexed fields. We are using version 8.

I have used two different tools and one times out, the other is "patient" and lets it run to completion. I am not a DB2 person. But they say SQL is SQL don't they? I am in learning mode, I have checked various sites for two days now without much luck in how to speed this up.

Would an inner select help? How would I include it in this if it would?

I would apprectiate any help I can get, if more information is needed I will provide it.
Thanks!

SELECT DISTINCT
TC.CONT_NUM AS ContractNumber,
TC.CONT_STAT_CD AS ContractStatus,
TC.CONT_ISS_DEL_DT AS DeliveryDate,
PROD.TAPLAN.PLAN_TYP AS PlanType, PROD.TCAGT.AGENT_NAME AS AgentName, PROD.TAYBBASE.BILL_MODE_CD AS BillMode,
PROD.TAYBBASE.MODE_PREM_AMT AS PremAmtMode, PROD.TZAN.CLIENT_REFERENCE AS ClientReference,
PROD.TW1A0.FIRST_NAME AS ClientFName, PROD.TW1A0.LAST_NAME_COY_NAME AS ClientLName
FROM PROD.TZAN INNER JOIN PROD.TAYBBASE INNER JOIN
PROD.TCAGT INNER JOIN
PROD.TAPLAN INNER JOIN
PROD.TYCNT TC INNER JOIN
PROD.TAAGT TCA ON TC.CONT_NUM = TCA.CONT_NUM ON PROD.TAPLAN.PLAN_CD = TC.PLAN_CD
AND PROD.TAPLAN.TAX_QUAL_CD = TC.TAX_QUAL_CD ON PROD.TCAGT.AGENT_NO = TCA.AGNT_NUM ON
PROD.TAYBBASE.CONT_NUM = TC.CONT_NUM ON PROD.TZAN.CONNECT_ANNU_KEY = TC.CONT_NUM INNER JOIN
PROD.TW1A0 ON PROD.TZAN.CLIENT_REFERENCE = PROD.TW1A0.CLIENT_REFERENCE
WHERE (TC.CONT_STAT_CD = '09') OR
(TC.CONT_STAT_CD = '14') AND (CURRENT_DATE - TC.CONT_ISS_DEL_DT >= 0) AND ("DAYS"(CURRENT_DATE) - "DAYS"(TC.CONT_ISS_DEL_DT)<= 60)
AND (TC.TAX_QUAL_CD IN ('D', 'K', 'A'))
AND (PROD.TZAN.RELATIONSHIP = 'AN')
OR (TC.CONT_STAT_CD = '14')
AND (CURRENT_DATE - TC.CONT_ISS_DEL_DT >= 0)
AND ("DAYS"(CURRENT_DATE) - "DAYS"(TC.CONT_ISS_DEL_DT)
<= 60)
AND (TC.TAX_QUAL_CD NOT IN ('D', 'K', 'A'))
AND (PROD.TZAN.RELATIONSHIP = 'OW')
ORDER BY ContractNumber
 
can you show us the explain plan please.

And also the indexes of each table, and also a indication of what is the cluster ratio of each index.

Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
Well, the tools that I have available (Visual Studio and DbVisualizer) do not show that info so I am somewhat limited in that regard. All I know is that the joins are on indexed fields.

There has to be a way that I can not include the "huge" table and later add its information to the results from the smaller query. I just don't know how.

Thanks
Joe
 
If you are working on Unix/Windows, then you will have a command db2expln. Look at its usage, and get used to use it at all times.

With Visual Studio (Assuming this is not MS one), you also should have a "visual Explain" option.


If working on a mainframe, you most likely have DB-Insight or Platinium instaled. If so they also have a explain table on their options. IF on a mainframe, but not licensed to use Insight/Platinium, then plain EXPLAIN on the SQL will give you the same. Look in the manuals if this is the case.


The fact you say the joins are on indexed fields does not mean they are being used. Please do supply the index information as I asked.

Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
Joseph,
Have you mis-typed your where clauses as they appear to be incorrect as posted?

I've cut and pasted them to make them a little more readable below, and it seems that you have not got the brackets matched correctly. Also you appear to be querying fields multiple time for the same value e.g (TC.CONT_ISS_DEL_DT) <= 60 and TC.CONT_STAT_CD = '14'

Code:
WHERE (TC.CONT_STAT_CD = '09') 
   OR (TC.CONT_STAT_CD = '14') 
  AND (CURRENT_DATE - TC.CONT_ISS_DEL_DT >= 0) 
  AND ("DAYS"(CURRENT_DATE) - "DAYS"(TC.CONT_ISS_DEL_DT) <= 60) 
  AND (TC.TAX_QUAL_CD IN ('D', 'K', 'A')) 
  AND (PROD.TZAN.RELATIONSHIP = 'AN') 
   OR (TC.CONT_STAT_CD = '14') 
  AND (CURRENT_DATE - TC.CONT_ISS_DEL_DT >= 0) 
  AND ("DAYS"(CURRENT_DATE) - "DAYS"(TC.CONT_ISS_DEL_DT) <= 60) 
  AND (TC.TAX_QUAL_CD NOT IN ('D', 'K', 'A')) 
  AND (PROD.TZAN.RELATIONSHIP = 'OW')
 
SQL would be better written as.
Code:
SELECT DISTINCT
   TC.CONT_NUM AS ContractNumber,
   TC.CONT_STAT_CD AS ContractStatus,
   TC.CONT_ISS_DEL_DT AS DeliveryDate,
   PROD.TAPLAN.PLAN_TYP AS PlanType,
   PROD.TCAGT.AGENT_NAME AS AgentName,
   PROD.TAYBBASE.BILL_MODE_CD AS BillMode,
   PROD.TAYBBASE.MODE_PREM_AMT AS PremAmtMode,
   PROD.TZAN.CLIENT_REFERENCE AS ClientReference,
   PROD.TW1A0.FIRST_NAME AS ClientFName,
   PROD.TW1A0.LAST_NAME_COY_NAME AS ClientLName
FROM PROD.TZAN 
INNER JOIN PROD.TYCNT TC 
   ON PROD.TZAN.CONNECT_ANNU_KEY = TC.CONT_NUM 
INNER JOIN PROD.TAAGT TCA 
   ON TC.CONT_NUM = TCA.CONT_NUM 
INNER JOIN PROD.TCAGT 
   ON PROD.TCAGT.AGENT_NO = TCA.AGNT_NUM 
INNER JOIN PROD.TAYBBASE 
   ON PROD.TAYBBASE.CONT_NUM = TC.CONT_NUM 
INNER JOIN PROD.TW1A0 
   ON PROD.TZAN.CLIENT_REFERENCE = PROD.TW1A0.CLIENT_REFERENCE
INNER JOIN PROD.TAPLAN 
   ON PROD.TAPLAN.PLAN_CD = TC.PLAN_CD
  AND PROD.TAPLAN.TAX_QUAL_CD = TC.TAX_QUAL_CD 

WHERE     (TC.CONT_STAT_CD = '09') 

OR (TC.CONT_STAT_CD = '14') 
AND (TC.CONT_ISS_DEL_DT >= CURRENT_DATE) 
AND ("DAYS"(TC.CONT_ISS_DEL_DT) <= "DAYS"(CURRENT_DATE) - 60)
AND (TC.TAX_QUAL_CD IN ('D', 'K', 'A'))
AND (PROD.TZAN.RELATIONSHIP IN( 'AN', 'OW')

ORDER BY ContractNumber

The above is still without knowing what are the indexes. more could eventually be changed.

Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
Thanks for all your suggestions, I will give them a try. Sad to say but the DBA that is on duty today can not give me the execution plan. (doesn't know how)

I have simplified this down to just two tables, one has 186,000 records, the other has 460,000 records. I do the select asking for records from 0 to 60 days old and a few other parameters and it is taking 32 seconds to run and returns 2700 records. If I add the client file in (well over a million records) the run time increases to over 500 seconds. I just am trying to pull the customers name from client.

With so little to go on I know it is a difficult task to assist me but would you say that this indicates that there are things on the database side that needs tuned?

I will try the other suggestions and post what I find out back here.
Thanks
 
it might just be your query. please supply the indexes as I asked. That will help.

The execution plan you should be able to do it yourself also. no DBA required for it!!! Look in the manuals for your specific operating system

Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
Well I guess that I should mention that I use MS Visual Studio and have another tool that I use for queries, DbVisualizer.

I cut this query down to two tables, the join is on indexed fields. Between the two tables there are a total of about 700,000 records.

I am starting to feel the the database itself needs work to optimize it, perhaps the plan needs updated, stats or something. If a query that involves two queries take as long as 112 seconds it is leading me to beiieve that there is something else wrong.

Thanks for your suggestions though
 
You may be right about the state of the database. However, if the attached DBA has no clue on how to generate an explain plan, chances are that statistics may not be maintained as well..

You NEED this type of information to start optimizing the query!

Here is a good starting point:


I suspect the predicate:
Code:
("DAYS"(TC.CONT_ISS_DEL_DT) <= "DAYS"(CURRENT_DATE) - 60)

could be quite a performance killer (gut feeling)..

Ties Blom
 
Joseph,
If it is the adding of the client table that is causing it to go very slow, then I would target that table as the first place to look for an error. Is the client reference that you are using to access the table unique and does it have an index on that column?

If you are not sure of how to view the indexes try the following piece of SQL, tailored to your tablename.
Code:
SELECT T.NAME                      
     , K.IXNAME, K.COLNO, K.COLNAME
  FROM SYSIBM.SYSTABLES T          
      ,SYSIBM.SYSINDEXES I         
      ,SYSIBM.SYSKEYS K            
WHERE T.NAME = [COLOR=red]your table name in here[/color]    
  AND T.CREATOR = [COLOR=red]the creator of the table if more than one version of the table exists[/color]        
  AND I.TBNAME = T.NAME            
  AND I.TBCREATOR = T.CREATOR      
  AND K.IXNAME = I.NAME            
  AND K.IXCREATOR = I.CREATOR      
ORDER BY 2,3
 
WOW fredericofonseca, that's more readable. No-one supplied you with a purple start for all that work? I will!
The multiple "joins" followed by 1 "on" in the original question scared me away. Never knew that was allowed in the SQL syntax (and I will try to forget it a.s.a.p)
 
I believe frederico mishandled the transform. The statement
Code:
AND (TC.CONT_ISS_DEL_DT >= CURRENT_DATE)
is reversed. I think we're probably looking for TC.CONT_ISS_DEL_DT BETWEEN CURRENT_DATE and CURRENT_DATE - 60. Perhaps it should be something like:
Code:
AND DAYS(CURRENT_DATE) - DAYS(TC.CONT_ISS_DEL_DT) BETWEEN 0 AND 60

Glenn
 
there are two groups with current date on original query.
1- AND (CURRENT_DATE - TC.CONT_ISS_DEL_DT >= 0)

2- AND ("DAYS"(CURRENT_DATE) - "DAYS"(TC.CONT_ISS_DEL_DT)
<= 60)

first one can be converted to
(TC.CONT_ISS_DEL_DT >= CURRENT_DATE)
And second to
("DAYS"(TC.CONT_ISS_DEL_DT) <= "DAYS"(CURRENT_DATE) - 60)

Note that doing it this way means that 1 operation is made by the DB engine BEFORE executing the query. e.g. the right side of the compare, and a static value is then used on the compare. e.g only 1 operation per record.

It it was left on the left side, then an extra operation is required for each record being retrieved. e.g. 2 operations per record, unless the Engine itself determines that it can pass the operator to the right side, and does that before retrieving the rows.


Regards

Frederico Fonseca
SysSoft Integrated Ltd
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top