Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Strange results from joining different table sizes

Status
Not open for further replies.

coop207

Programmer
Apr 20, 2003
2
US
Hello,
I have a very large table (over 4.6 million records) and am joining a table with 300 rows to it and a table with 800 rows to it. All three tables have two fields and I have created a unique primary index on the two fields.
I ran the first join which took approximately 3 minutes (joining 4.6 million rows to 300 rows), then I ran the next join which took 20 seconds (joining 4.6 million rows to 800 rows).
Why would it take less time to join 800 rows than 300 rows?
This seemed strange to me so I tried joining other tables to the big table with the same results. I found a threshold of about 500 rows. After 500 rows all joins took only 20 seconds. How can I make the smaller table joins take less time? Is there a way to override this feature?
Thank you for your time,
Gina
 
I think a lot more info is needed to answer your question. If you are performing an "inner" join, the first join 300 rows to 4.6 million rows, the results are 300 rows so your second join is not 800 to 4.6 million, but 300 to 800.

Why are you running the joins one at a time? Why not put all the tables in the same query and let Teradata do the join order?

What does an Explain show?
 
Thank you for answering my question. I have just been running separate scripts for each join. I didn't know I could put all of the joins in one big script.
The explain says it is doing a product join for the 300 row table. That may be the problem. The product join may be too slow.
Thank you
 
The optimizer chooses a product join if it's supposed to be more efficient than other join types.

What statistics are defined on the tables? I'd guess, they're probably missing.
What is the Primary Index of the small tables?
Can you provide us with a Help Statistics and a Show Table for each table and the complete Explain output?

Dieter
 
If Teradata is duplicating the small table on all AMPS, a product join is the normal result.

Dieter is correct, I also believe that you may need to collect statistics on the compound indexes, and you may be better off with a non-unique primary index, then a unique primary index if the joins can be done on the AMPS without redistribution or duplication.
 
Hi,
You state in the first note.....

All three tables have two fields and I have created a unique primary index on the two fields.

again look at the explains.

EXPLAIN SELECT .....

and see what the text tells you is going on. maybe it is estimating the row counts to be completely different.

Since you have 4.6 million rows I suspect you aren't doing this the DEMO software, so contact the Teradata Global Support Center and send them a TSET bundle and have them open an incident.

That way teradata development can look at it an see why as you say there is a cut off at about 500 rows.

----
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top