Dynamic maintenance of a summary table from a detail table

azzazzello · May 4, 2011

Hi,

Suppose I have 2 tables.

One is called "detail"

Code:

date DATE
account_id int
stock varchar(10)
trades int
pnl int
PRIMARY KEY (date,account_id,stock)

This table is useful for research, however I expect that 95% of the queries will be grouping by date and account_id. Since there are thousands of stocks, aggregation would get rid of a huge number of rows. I therefore would like to create a table called "summary" to deal with the vast majority of the queries. The table would look like this

Code:

date DATE
account_id int
trades int
pnl int
PRIMARY KEY (date,account_id)

You can see that summary is the same as "detail" except that there is no "stock" column.
"detail" table will have inserts and occasional updates to it. I would like to make it so that every time there is an insert/update to "detail", a relevant insert/update takes place in the "summary" table. New entries are easy - I can create a trigger/stored procedure to update the summary table with new values. However, what can I do for updates? They will clear away old values and set new ones. How can I handle that? The only thing that comes to mind is update the "summary" table with a select from the detail one once in a while, however that means there is potentially stale/incorrect data in the summary table.
I'd love to listen to ideas

DonQuichote · May 9, 2011

Ouch. This really sounds like premature optimization. Worse, this sounds like you are building a database system on top of a database system. Instead of automagic updates to a table, you'd better add indexes to the real table. Even if you do go for the automagic aggregation table, those indexes are probably needed much more than the aggregation. But as long as you don't have any problems, I'm afraid that you are only introducing them.

That said, a trigger can be set on an update as well as on an insert. But data redundancy will remain a problem. Make sure you check the consistency of the redundant data regularly and look out for situations where triggers may not automatically run (maybe procedures or other triggers).

+++ Despite being wrong in every important aspect, that is a very good analogy +++
Hex (in Darwin's Watch)

azzazzello · May 9, 2011

The reason I am "over-engineering" in this way is because I have 50M+ records and growing and because Mysql has a documented bug in using composite indexes when the first is a range (see my previous thread). The problem that I am having is that because of this, a fairly standard query that should take sub-seconds can run for over 30 seconds.

DonQuichote · May 10, 2011

Ah. Does adding "simple" indexes help? I mean indexes for just the date column and another one for just the account_id column. MySQL might pick the most useful of those, or you can even use a "FORCE INDEX(index_name)" clause in the select queries.

+++ Despite being wrong in every important aspect, that is a very good analogy +++
Hex (in Darwin's Watch)

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Dynamic maintenance of a summary table from a detail table

azzazzello

Technical User

DonQuichote

Programmer

azzazzello

Technical User

DonQuichote

Programmer

Similar threads

Part and Inventory Search

Sponsor