Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SQL Precedence

Status
Not open for further replies.

craig322

MIS
Apr 19, 2001
108
US
Does anyone know how Teradata processes its SQL statements? In other databases I have been able to find out the order that joins and where clauses are processed.
For example, some databases process statements faster depending on the order tables are ordered in the FROM statement. Also, in some databases it can make a difference in the order statements are in the WHERE clause.

TIA

Craig
 
If you put the word EXPLAIN at the beginning of the SQL statement Teradata will return a listing of job steps. The only way to control the join order is to create derived tables or subqueries where you manage the spool table creation.
 
Perhaps you can make a recommendation then. I do a daily load into a sales table. From the table that is loaded, I need to create another table. The table I am creating uses processed information (summaries, etc) from the table loaded and existing tables for its data. I currently use a series of views dependent on each other in which the end view I do an insert using a select * into my new table. Do you know of a way that may be quicker than using views?

Thanks
 
Another point about processing in the Teradata RDBMS - The parsing engine receives the sql and translates it into machine language, checks accessrights, performs session mgmt and a plethora of other things. It has an optimizer - intelligent software - that chooses a 'plan' based on its knowledge of, or lack of, the tables involved.

It is recommended that you run Statistics on key tables weekly or monthly if the table definitions, demographics etc. change, so that the Optimizer can choose a more intelligent 'plan'. It is not unusual to get a 10 hour running job down to a few minutes simply by Collecting Statistics. Keep in mind that Teradata is massively parallel and data is distributed, evenly or not so evenly, across the virtual processors - AMPs - and so the execution plan is extremely important if we are to take advantage of the architecture.

Teradude
 
A couple of thoughts;
* Drop the secondary indexes on the final table, and recreate them after the load. Existing indexes can really slow the load down. Make sure you collect statistics after the load.

* If you have a Teradata version that allows join indexes, these can speed up processing.

* Like the response above, make sure your data is well distributed.
 
Craid322, If you can you provide more information on you situation, we may be able to help more. I am a Senior Consultant for Teradata. How big is the target detail table? How many rows are you adding per period? You may not want to drop secondary indexes, load, recreate SI, and recollect stats depending on these demographics. Also the version of Teradata would be helpful. Teradata's optimizer is very good so we do re-write a lot of queries for optimal performance but the from clause only affects outer joins due to the fact that they are not transitive. If A=B, B=C, then A=C. Order matters. Teradata still re-writes Outer joins based on stats, but with different rules.
 
Craig322, Sorry hit the enter key to quickly. Also would need current processing time, # of system nodes, chip speed,
target row size, fallback or not. These are just the basics, but will get me close enough to validate your range of performance.
 
The targe table will be about 11 million rows.

I don't add any rows, I do a total refresh once a month.

I am on the most recent version of Teradata, we just did
an update about 2 weeks ago.

My current processing time seems to be now at least 14 hours.

We have 2 nodes.

I don't know the chip speed, but can find that out if you tell me how.

I am not using fallback for this table.

I am in the process of trying to do the whole process by using interim tables, but just one join in that process is taking 5 + hours.

I would appreciate any help you can offer.
 
Craig322, Your second note says that you insert daily ??? into your sales table??? What is your specific problem?
I guess your real problem may be the aggregation / joins to build the summaries. Are you rebuilding your summary tables monthly also or weekly / daily? 14 hours doesn't sound bad for a monthly process, but there is always process redesign and tunning that can be done. I'll need more info to help.
 
Actually, when I posted this note I was working on a daily load file which takes several hours, not as long as the monthly. I do both monthly and daily loads. Most of them are pretty straight forward, but a handful are more complex.

TIA
 
For Teradata:

The order of preference for set operations is:
1. intersect
2. union
3. except from left to right

however, the evaluation can be modified with parenthese.

 
It has been a long time since this thread was started, but since it still has activity I thought I would check in. I know that 11 milliom may seem like a big table, but believe me, it isn't. If you are still taking 14 hours for the processing, I think you may have one of two problems. Your data is very skewed so one processor is doing all the work, or your SQL is creating a product (Catesian) join.

Put the word explain at the beginning of your SQL and look for things like "1:1" or "product join". Also look at the row numbers and processing time at each step and see where any large increases exist.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top