Strange results from joining different table sizes

coop207 · Apr 20, 2003

Hello,
I have a very large table (over 4.6 million records) and am joining a table with 300 rows to it and a table with 800 rows to it. All three tables have two fields and I have created a unique primary index on the two fields.
I ran the first join which took approximately 3 minutes (joining 4.6 million rows to 300 rows), then I ran the next join which took 20 seconds (joining 4.6 million rows to 800 rows).
Why would it take less time to join 800 rows than 300 rows?
This seemed strange to me so I tried joining other tables to the big table with the same results. I found a threshold of about 500 rows. After 500 rows all joins took only 20 seconds. How can I make the smaller table joins take less time? Is there a way to override this feature?
Thank you for your time,
Gina

BillDHS · Apr 20, 2003

I think a lot more info is needed to answer your question. If you are performing an "inner" join, the first join 300 rows to 4.6 million rows, the results are 300 rows so your second join is not 800 to 4.6 million, but 300 to 800.

Why are you running the joins one at a time? Why not put all the tables in the same query and let Teradata do the join order?

What does an Explain show?

coop207 · Apr 21, 2003

Thank you for answering my question. I have just been running separate scripts for each join. I didn't know I could put all of the joins in one big script.
The explain says it is doing a product join for the 300 row table. That may be the problem. The product join may be too slow.
Thank you

dnoeth · Apr 21, 2003

The optimizer chooses a product join if it's supposed to be more efficient than other join types.

What statistics are defined on the tables? I'd guess, they're probably missing.
What is the Primary Index of the small tables?
Can you provide us with a Help Statistics and a Show Table for each table and the complete Explain output?

Dieter

BillDHS · Apr 21, 2003

If Teradata is duplicating the small table on all AMPS, a product join is the normal result.

Dieter is correct, I also believe that you may need to collect statistics on the compound indexes, and you may be better off with a non-unique primary index, then a unique primary index if the joins can be done on the AMPS without redistribution or duplication.

tdatgod · Apr 21, 2003

Hi,
You state in the first note.....

All three tables have two fields and I have created a unique primary index on the two fields.

again look at the explains.

EXPLAIN SELECT .....

and see what the text tells you is going on. maybe it is estimating the row counts to be completely different.

Since you have 4.6 million rows I suspect you aren't doing this the DEMO software, so contact the Teradata Global Support Center and send them a TSET bundle and have them open an incident.

That way teradata development can look at it an see why as you say there is a cut off at about 500 rows.

----

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Strange results from joining different table sizes

coop207

Programmer

BillDHS

Programmer

coop207

Programmer

dnoeth

Instructor

BillDHS

Programmer

tdatgod

Programmer

Similar threads

Part and Inventory Search

Sponsor