
SQL Efficiency Question

Nov 15, 2004
Which is the more efficient route?

Joining a table that has 2 million records to a table that has 26 million records, and feeding the 26 million rows into an OLAP model.

OR

Having the 2 million row table join to a lookup table of about 100,000 rows, calculating the 26 million records on the fly (the computation is medium difficulty), and feeding them into the OLAP model.

Is this basically a question of CPU resources vs. I/O speed?

Is there a way of predicting the results without implementing both and comparing?
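
To make the comparison concrete, here is a rough sketch of the two options (all table and column names are placeholders, not my real schema):

[code]
-- Option 1: precompute and store the 26 million rows, then just read them
SELECT TransID, CalendarMonth, AllocatedAmount
FROM PreJoinedFacts              -- 26M rows of pure I/O, no computation

-- Option 2: join the 2M-row table to the ~100K-row lookup
-- and calculate the 26 million records on the fly
SELECT t.TransID,
       l.CalendarMonth,
       t.Amount * l.AllocationPct AS AllocatedAmount
FROM Transactions t
JOIN AllocationLookup l
  ON l.LookupKey = t.LookupKey   -- CPU does a multiplication per output row
[/code]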

Thanks for your help!
 
Probably I/O and network load. If you're sending 2.1 million records instead of 26 million, then your bottleneck would only be the CPU. If you have a strong DB server that should be OK, but killing your HD and NIC isn't cool. ;-)
 
I know very little about OLAP, but doesn't an approach that spreads the task between I/O and computation have an advantage?
-Karl

[red] Cursors, triggers, user-defined functions and dynamic SQL are an axis of evil![/red]
[green]Life's uncertain...eat dessert first...www.deerfieldbakery.com[/green]
 
Donutman,

I wasn't really sure about your comment that "dynamic SQL" is part of the axis of evil. I understand that cursors, triggers and user-defined functions are performance drains, but does this example fall under "dynamic SQL"?

I have a lookup table with Julian Dates. For each Julian Date, I have 12 possible calendar months. Each one of these calendar months has an associated allocation %.

So when my 2 million transactions look up against the table, the result is 24 million rows, with a calculated field that multiplies the amount by the allocation %.

Does this qualify as "dynamic SQL"? And going back to the initial question (now with more details), which is better, storing these values or calculating them?
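
In rough SQL terms (table and column names here are only illustrative, not my actual schema), the on-the-fly version is:

[code]
SELECT t.TransID,
       j.CalendarMonth,
       t.Amount * j.AllocationPct AS AllocatedAmount
FROM Transactions t
JOIN JulianAllocation j
  ON j.JulianDate = t.JulianDate -- 12 lookup rows per Julian date,
                                 -- so 2M transactions become 24M rows
[/code]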
 
No, that's not dynamic SQL. I'd like to know more about the inter-relationship of the Julian Dates and calendar months. If I understand what you're saying, why can't you create a cross-tab lookup table so that you have all 12 allocation %'s in one row?
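
Something along these lines, purely as a sketch (names and datatypes are guesses):

[code]
-- One row per Julian date, with all 12 allocation %'s as columns
CREATE TABLE JulianAllocXTab (
    JulianDate int NOT NULL PRIMARY KEY,
    Pct01 decimal(9,6), Pct02 decimal(9,6), Pct03 decimal(9,6),
    Pct04 decimal(9,6), Pct05 decimal(9,6), Pct06 decimal(9,6),
    Pct07 decimal(9,6), Pct08 decimal(9,6), Pct09 decimal(9,6),
    Pct10 decimal(9,6), Pct11 decimal(9,6), Pct12 decimal(9,6)
)
[/code]

A join against the 2 million transactions would then return 2 million wide rows instead of 24 million narrow ones.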
-Karl

[red] Cursors, triggers, user-defined functions and dynamic SQL are an axis of evil![/red]
[green]Life's uncertain...eat dessert first...www.deerfieldbakery.com[/green]
 
What are you doing with the resulting 24 million rows? If you are doing further calculations or selecting only a subset of the results, then those calculations/selections should be done at the server and the results passed through the network (less network traffic).

I think you need to analyze exactly what data you will need each time you go into the database, and develop queries or stored procedures that reduce the data in an order that minimizes the resulting number of rows.

For example, if you need totals for each date, first run a grouping query on your 2 million rows by date; then, on the resulting rows, which will be far fewer than 2 million, perform the calculation that multiplies the row count by 12.
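
A sketch of that two-step approach (all names assumed):

[code]
-- Step 1: collapse the 2 million transactions to one row per date
SELECT JulianDate, SUM(Amount) AS TotalAmount
INTO #DateTotals
FROM Transactions
GROUP BY JulianDate

-- Step 2: only the grouped rows, not the raw transactions, fan out x12
SELECT d.JulianDate,
       j.CalendarMonth,
       d.TotalAmount * j.AllocationPct AS AllocatedAmount
FROM #DateTotals d
JOIN JulianAllocation j
  ON j.JulianDate = d.JulianDate
[/code]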
 
With that many rows, precalculating is probably a good idea. I would tend towards the OLAP model; after all, you only have to send all the records to it once, and after that you only need to send the changed records. You can also send the records in batches to avoid timing out your server. And do you really need all the rows in your queries? With that number of rows, any way you can limit the records queried will help.
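
For example, a rough sketch of a batched incremental feed (the table names, the ModifiedDate column and the 50,000 batch size are all assumptions):

[code]
-- Send only rows changed since the last load, one batch at a time
DECLARE @LastLoad datetime, @LastKey int
SELECT @LastLoad = LastLoadDate, @LastKey = 0 FROM LoadControl

SELECT TOP 50000 TransID, JulianDate, Amount
FROM Transactions
WHERE ModifiedDate > @LastLoad
  AND TransID > @LastKey          -- advance @LastKey after each batch
ORDER BY TransID
[/code]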

Questions about posting. See faq183-874
 