Performance problem on Teradata SQL query


zanzemaj (Programmer)
Nov 2, 2006
If anyone could assist me in making this query run faster, I would really appreciate it. Any comments or recommendations? I'm new to Teradata and have been using Oracle for 8 years.
Here is the EXPLAIN output:
1) First, we lock a distinct stage_dev."pseudo table" for read on a RowHash to prevent global deadlock for stage_dev.upc.
2) Next, we lock time_dev.lu_day for access, we lock
time_DEV.lu_week for access, we lock store_dev.lu_store for access, we lock other_dev.deli_capital_prfrm_tgt for access, we
lock stage_dev.div_item_day for access, and we lock
stage_dev.upc for read.
3) We execute the following steps in parallel.
1) We do a single-AMP RETRIEVE step from time_dev.lu_day by way of the unique primary index "time_dev.lu_day.d_date =
(DATE )- 6" with no residual conditions into Spool 3 (group_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. The size of Spool 3 is estimated with high confidence to be 1 row. The estimated time for this step is 0.01 seconds.
2) We do an all-AMPs RETRIEVE step from
other_dev.deli_capital_prfrm_tgt by way of an all-rows scan with a condition of (
"other_dev.deli_capital_prfrm_tgt.launch_dt <= DATE") into Spool 4 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 4 by row hash. The size of Spool 4 is estimated with no confidence to be 1,204 rows. The estimated time for this step is 0.01 seconds.
4) We do an all-AMPs JOIN step from time_DEV.lu_week by way of a RowHash match scan with a condition of (
"time_DEV.lu_week.week_end_dt <= DATE"), which is joined to Spool 4 (Last Use) by way of a RowHash match scan. time_DEV.lu_week and Spool 4 are joined using a merge join, with a join condition of "(time_DEV.lu_week.week_end_dt >= launch_dt) AND (time_DEV.lu_week.week_id = week_id)"). The result goes into Spool 5 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 5 by row hash. The size of Spool 5 is estimated with no confidence to be 402 rows. The estimated time for this step is 0.01 seconds.
5) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of a RowHash match scan, which is joined to
store_dev.lu_store by way of a RowHash match scan with no residual conditions. Spool 5 and store_dev.lu_store are joined using a merge join, with a join condition of (
"store_dev.lu_store.store_id = store_id"). The result goes into Spool 6 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 6 by row hash. The size of Spool 6 is estimated with no confidence to be 402 rows. The estimated time for this step is 0.01 seconds.
2) We do an all-AMPs RETRIEVE step from time_dev.lu_day by way of an all-rows scan with a condition of ("NOT (time_dev.lu_day.week_id IS NULL)") into Spool 7 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 7 by row hash. The size of Spool 7 is estimated with high confidence to be 5,117 rows. The estimated time for this step is 0.02 seconds.
6) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of a RowHash match scan, which is joined to Spool 7 (Last Use) by way of a RowHash match scan. Spool 6 and Spool 7 are joined using a merge join, with a join condition of ("(week_id = week_id) AND (week_id = week_id)"). The result goes into Spool 2 (all_amps), which is redistributed by hash code to all AMPs. The size of Spool 2 is estimated with no confidence to be 2,814 rows. The estimated time for this step is 0.09 seconds.
2) We do an all-AMPs RETRIEVE step from time_DEV.lu_week by way of an all-rows scan with a condition of ("time_DEV.lu_week.week_end_dt <= DATE") into Spool 10 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 10 by row hash. The size of Spool 10 is estimated with no confidence to be 244 rows. The estimated time for this step is 0.01 seconds.
3) We do an all-AMPs RETRIEVE step from time_dev.lu_day by way of an all-rows scan with a condition of ("(time_dev.lu_day.d_date <= ((DATE )- 6 )) AND (NOT (time_dev.lu_day.week_id IS NULL ))") into Spool 11 (all_amps) (compressed columns allowed), which is
redistributed by hash code to all AMPs. Then we do a SORT to order Spool 11 by row hash. The size of Spool 11 is estimated with low confidence to be 3,593 rows. The
estimated time for this step is 0.02 seconds.
4) We do an all-AMPs RETRIEVE step from
other_dev.deli_capital_prfrm_tgt by way of an all-rows scan with a condition of (
"other_dev.deli_capital_prfrm_tgt.launch_dt <= ((DATE )- 6)") into Spool 12 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 12 by row hash. The size of Spool 12 is estimated with no confidence to be 1,204 rows. The estimated time for this step is 0.01 seconds.
7) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of a RowHash match scan, which is joined to Spool 11 (Last Use) by way of a RowHash match scan. Spool 10 and Spool 11 are joined using a merge join, with a join condition of ("week_id = (week_id + 100)"). The result goes into Spool 13 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 13 is estimated with no confidence to be 1,200 rows. The estimated time for this step is 0.02 seconds.
2) We do an all-AMPs JOIN step from store_dev.lu_store by way of a RowHash match scan with no residual conditions, which is joined to Spool 12 (Last Use) by way of a RowHash match scan. store_dev.lu_store and Spool 12 are joined using a merge join, with a join condition of ("store_dev.lu_store.store_id = store_id"). The result goes into Spool 14 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 14 by row hash. The size of Spool 14 is estimated with no confidence to be 1,204 rows. The estimated time for this step is 0.02 seconds.
8) We do an all-AMPs JOIN step from Spool 13 (Last Use) by way of a RowHash match scan, which is joined to Spool 14 (Last Use) by way of a RowHash match scan. Spool 13 and Spool 14 are joined using a merge join, with a join condition of ("(d_date >= launch_dt) AND
(week_id = week_id)"). The result goes into Spool 9 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 9 is estimated with no confidence to be 1,977 rows. The estimated time for this step is 0.02 seconds.
9) We do an all-AMPs SUM step to aggregate from Spool 9 (Last Use) by way of an all-rows scan, and the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 15. The size of Spool 15 is estimated with no confidence to be 1,483 rows. The estimated time for this step is 0.03 seconds.
10) We do an all-AMPs RETRIEVE step from Spool 15 (Last Use) by way of an all-rows scan into Spool 1 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 1,483 rows. The estimated time for this step is 0.02 seconds.
11) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of an all-rows scan into Spool 17 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 17 by row hash. The size of Spool 17 is estimated with no confidence to be 1,483 rows. The estimated time for this step is 0.02 seconds.
2) We do an all-AMPs RETRIEVE step from time_dev.lu_day by way of an all-rows scan with a condition of ("NOT (time_dev.lu_day.week_id IS NULL)") into Spool 18 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 18 by row hash. The size of Spool 18 is estimated with high confidence to be 5,117 rows. The estimated time for this step is 0.02 seconds.
12) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 17 (Last Use) by way of a RowHash match scan, which is joined to Spool 18 (Last Use) by way of a RowHash match scan. Spool 17 and Spool 18 are joined using a merge join, with a join condition of ("WEEK_ID = week_id"). The result goes into Spool 2 (all_amps), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by the sort key in spool field1 eliminating duplicate rows. The size of Spool 2 is estimated with no confidence to be 6,598 rows. The estimated time for this step is 0.33 seconds.
2) We do a group-AMP RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 22 (all_amps) (compressed columns allowed), which is duplicated on all AMPs. The size of Spool 22 is estimated with high confidence to be 28 rows. The estimated time for this step is 0.00 seconds.
13) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 22 (Last Use) by way of an all-rows scan, which is joined to all partitions of stage_dev.div_item_day with a condition of ("(NOT(stage_dev.div_item_day.division_id IS NULL )) AND ((NOT(stage_dev.div_item_day.upc_id IS NULL )) AND (NOT
(stage_dev.div_item_day.txn_dt IS NULL )))"). Spool 22 and stage_dev.div_item_day are joined using a product join, with a join condition of ("(1=1)"). The result goes into Spool 23 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 23 by row hash. The size of Spool 23 is estimated with no confidence to be 2,981 rows. The estimated time for this step is 0.02 seconds.
2) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of an all-rows scan with a condition of ("(NOT (CORPORATION_ID IS NULL )) AND (NOT (DIVISION_ID IS NULL ))") into Spool 24 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 24 by row hash. The size of Spool 24 is estimated with no confidence to be 13,195 rows. The estimated time for this step is 0.03 seconds.
14) We do an all-AMPs JOIN step from Spool 23 (Last Use) by way of a RowHash match scan, which is joined to Spool 24 (Last Use) by way of a RowHash match scan. Spool 23 and Spool 24 are joined using a merge join, with a join condition of ("(division_id = DIVISION_ID)
AND (txn_dt = D_DATE)"). The result goes into Spool 25 (all_amps) (compressed columns allowed), which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 25 by row hash.
The size of Spool 25 is estimated with no confidence to be 342,426 rows. The estimated time for this step is 0.60 seconds.
15) We do an all-AMPs JOIN step from Spool 25 (Last Use) by way of a RowHash match scan, which is joined to stage_dev.upc by way of a RowHash match scan with a condition of ("(NOT
(stage_dev.upc.category_id IS NULL )) AND (NOT
(stage_dev.upc.upc_id IS NULL ))"). Spool 25 and
stage_dev.upc are joined using a merge join, with a join condition of ("(upc_id = stage_dev.upc.upc_id) AND
((stage_dev.upc.category_id = CATEGORY_ID) AND
(stage_dev.upc.corporation_id = CORPORATION_ID ))"). The result goes into Spool 21 (all_amps) (compressed columns allowed), which is built locally on the AMPs. The size of Spool 21 is estimated with no confidence to be 342,426 rows. The estimated time for this step is 0.31 seconds.
16) We do an all-AMPs SUM step to aggregate from Spool 21 (Last Use) by way of an all-rows scan, and the grouping identifier in field 1. Aggregate Intermediate Results are computed globally, then placed in Spool 26. The size of Spool 26 is estimated with no confidence to be 308,184 rows. The estimated time for this step is 0.52 seconds.
17) We do an all-AMPs RETRIEVE step from Spool 26 (Last Use) by way of an all-rows scan into Spool 19 (all_amps), which is built locally on the AMPs. The size of Spool 19 is estimated with no confidence to be 308,184 rows. The estimated time for this step is 0.25 seconds.
18) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 19 are sent back to the user as the result of statement 1. The total estimated time is 2.29 seconds.
 
Hi zanzemaj,
it looks like some statistics are missing, because there are lots of "no confidence" estimates. And V2R6.1 is quite sensitive to missing stats...

First try a "diagnostic helpstats on for session" and check the new EXPLAIN output for recommended stats...
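
Once the recommendations show up, collecting stats is straightforward. A minimal sketch (the tables/columns below are only my guesses from the join and filter columns in your plan; stats have to go on the base tables, not views, and the real list should come from the HELPSTATS output):

-- Illustrative only: columns the plan joins/filters on and
-- estimates with "no confidence".
COLLECT STATISTICS ON time_dev.lu_day COLUMN (week_id);
COLLECT STATISTICS ON time_dev.lu_day COLUMN (d_date);
COLLECT STATISTICS ON time_dev.lu_week COLUMN (week_id);
COLLECT STATISTICS ON time_dev.lu_week COLUMN (week_end_dt);
COLLECT STATISTICS ON other_dev.deli_capital_prfrm_tgt COLUMN (week_id);
COLLECT STATISTICS ON other_dev.deli_capital_prfrm_tgt COLUMN (launch_dt);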

And it would be better to show the query itself, plus the DDL and some info about table sizes.

Dieter
 
Thank you for the response, dnoeth. How do I execute "diagnostic helpstats on for session"? Sorry, but this is my first time doing optimization. Thanks...

Here's the query:
SELECT dates.division_id,
dates.launch_dt,
dates.store_id,
upc.category_id,
SUM(CASE WHEN date_ind = 1
AND dates.week_id = last_week.week_id THEN SUM_net_amt ELSE 0 END) last_wk_sales,
SUM(CASE WHEN date_ind = 1 THEN
SUM_net_amt ELSE 0 END) launch_thru_last_wk_sales,
SUM(CASE WHEN date_ind = 2
AND dates.week_id = last_week.week_id - 100 THEN SUM_net_amt ELSE 0 END) ly_week_sales,
SUM(CASE WHEN date_ind = 2 THEN
SUM_net_amt ELSE 0 END) ly_launch_thru_last_wk_sales
FROM stage_dev.vdiv_item_day ag,
(SELECT week_id
FROM dss_dev.lu_day
WHERE d_date = date - 6) last_week(week_id),
stage_dev.cap_perf_upcs upc,
(SELECT 1,
str.corporation_id,
str.division_id,
goals.store_id,
goals.launch_dt,
goals.category_id,
dy.d_date,
dy.week_id
FROM
dss_dev.deli_capital_prfrm_tgt goals,
dss_dev.lu_day dy,
dss_dev.lu_week wk,
dss_dev.lu_store str
WHERE
goals.week_id = dy.week_id
AND dy.week_id = wk.week_id
AND wk.week_end_dt >= goals.launch_dt
AND wk.week_end_dt <= date
AND str.store_id = goals.store_id
UNION
SELECT 2,
x.corporation_id,
x.division_id,
x.store_id,
x.launch_dt,
x.category_id,
dy.d_date ly_date,
x.week_id
FROM
dss_dev.lu_day dy,
(SELECT str.corporation_id,
str.division_id,
goals.store_id,
goals.launch_dt,
goals.category_id,
wk.week_id
FROM
dss_dev.deli_capital_prfrm_tgt goals,
dss_dev.lu_day dy,
dss_dev.lu_week wk,
dss_dev.lu_store str
WHERE
goals.week_id = dy.week_id
AND dy.week_id = wk.week_id + 100
AND dy.d_date >= goals.launch_dt
AND wk.week_end_dt <= date
AND dy.d_date <= date - 6
AND str.store_id = goals.store_id
GROUP BY 1,2,3,4,5,6 ) x (corporation_id,division_id,store_id,launch_dt,category_id,week_id)
WHERE
x.week_id = dy.week_id ) dates(date_ind,corporation_id,division_id,store_id,launch_dt,category_id,d_date,week_id)
WHERE
ag.txn_dt = dates.d_date
AND ag.upc_id = upc.upc_id
AND ag.division_id = dates.division_id
AND upc.category_id = dates.category_id
AND upc.corporation_id = dates.corporation_id
GROUP BY 1,2,3,4;
 
Hi zanzemaj,
"diagnostic helpstats on for session" is a valid SQL statement, so just submit it.

Some remarks:
Replace UNION with UNION ALL (see the sketch after this list).
Try to replace the two UNION derived tables with a single access to those 4 tables; the main difference between the branches is just this year vs. last year.
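
A minimal sketch of the UNION vs. UNION ALL difference, assuming neither branch produces duplicates you actually want removed (in your query the first column is the constant 1 in one branch and 2 in the other, so rows from the two branches can never collide and UNION's duplicate-elimination sort is pure overhead):

-- UNION sorts the combined result to remove duplicates;
-- UNION ALL just concatenates the two branches.
SELECT 1 AS date_ind, 'this year' AS branch
UNION ALL   -- was: UNION
SELECT 2 AS date_ind, 'last year' AS branch;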

Btw, a "total estimated time of 2.29 seconds" doesn't sound that bad; what's the actual runtime?

Dieter
 
Ok, I will try out your remarks. Thank you so much for sharing your expertise.

Btw, I'm not sure of the actual runtime. The user just reported to me that the query takes a long time and that the runtime is not acceptable for them.
 