Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Aggregation tables for reporting

Status
Not open for further replies.

sqlcasey

Programmer
Sep 21, 2006
150
US
Hi,

I have been tasked with creating some reports, that will be viewed on a weekly basis, meaning the data for the reports only needs to be updated once a week.

Basically my question is, is my approach correct. I don't want to run queries against live data, so the idea is to create separate tables that will be populated during off-hours with the aggregated data. Then once the tables are populated, then run the queries against the pre-aggregaged data.

Are there any FAQ's that I can refer to, or does this sound about right.

Thanks
 
See Books Online topic "reports, data warehouse".

Tables containing aggregated data and with the addition of columns representing a timeline or history in terms of weeks, months, quarters, and years are commonly used. These are useful for examining trends. Also there is performance efficiency because aggregates and recoding are done one time. These tables may not be normalized, instead they resemble flat files to simplify searches. The reports run fast and do not interfere with transactions.

Google "OLAP and OLTP database design".
 
1. Where will the 'aggregated tables' be? On the same server? If so, then where is the benefit?

2. If you will have the data on a 'reporting server', then look into transactional replication. You can replicate the data to the reporting server as often as you want, we do every 5 minutes from a busy database. Then create the reports whenever needed.

-SQLBill

Posting advice: FAQ481-4875
 
1. Where will the 'aggregated tables' be? On the same server? If so, then where is the benefit?

the benefit comes from the fact that we won't be using the same table our customers are using at the same time

the other benefit is pre-aggregated data, already stored in the table, so you don't calculate on the fly

i think these are fairly significant benefits
 
i think everybody is missing my point here

thanks anyways

:)
 
sqlcasey,

I'm with you. I suggest you read this article as a starting point. There are plenty of links within it that will lead to other informative articles.

Generally speaking, I'm not a fan of data warehousing because you don't get the most current data. I do realize the benefits though. If calculating the aggregate data will cause performance problems with the 'live' data, then it is a viable solution. Typically, it is only 'big' databases that require this type of solution, but some smaller databases may also benefit from it.

Typically, you create a job in sql server that is scheduled to run on a particular interval (nightly?). This job would run a stored procedure that calculates the aggregate data and stores it in another table. Then, when reporting on the aggregate data, you use the new tables to pull the data.

Let me give an example. I recently worked on a project that tracked vehicles with GPS locators. The hardware captures a position every 10 seconds, so with a large fleet of vehicles, the database became rather large. Since there were only about 6 reports that used historical data (positions earlier than today), I decided to create a table (or 2) for each report. Just after midnight, I would 'archive' the data to the 'report' tables. I would use the report tables for the historical reports. The net effect was that the users wouldn't have to wait a minute or 2 for the reports to print. Since I used the 'report' tables, the data for the report was returned from the database in less than a second. It certainly made for happy customers.

Again, I suggest you read the article that I mentioned at the top of this thread. It should be a good starting point for what you are trying to accomplish.

Good Luck.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
thanks george

this is a massive and confusing topic, i am discovering

as you point out, i realize that data warehouse is what i'm trying to decipher, now i'm left with a lot of unanswered questions, such as

how does Datawarehouse-ing relate to OLAP and Analysis Services... is an Analysis Services database the same thing as a data warehouse... is it just me or is this a very overwhelming topic, which would explain why i can't get a straight answer on anything, ha

case in point, from bol:

You can create a traditional OLAP design, in which Analysis Services is used to provide pre-aggregated data in a multidimensional structure. A traditional OLAP design makes understanding, navigating, and exploring prepared business data faster and easier for business users.

Or you can create a Unified Dimensional Model (UDM) design, in which Analysis Services acts as a business intelligence portal. A UDM design provides direct access to multiple, heterogeneous data sources in a unified, consistent structure and uses a single method of transport for queries, data, and metadata. For more information, see Unified Dimensional Model (SSAS).

and on and on...
 
This is a complicated subject because each 'solution' has benefits and drawbacks.

My suggestion is...

1. Create tables that accomodate the reports you need to run. Don't worry about normalizing the tables. In fact, you should de-normalize the tables to accomodate the reports.

2. Create a stored procedure that extracts the data from the live tables, calculates the aggregates, and inserts the data in to the reporting tables.

3. Schedule this job to run every night.

4. Create stored procedures to get the aggregate data from the reporting tables (for the actual reports).

5. Create a front end app that uses the stored procedures (from step 4) to return the report data and display for the users.

You still need to use common sense and good coding practices for the reporting tables. You'll need indexes because the reporting tables can get large. You'll still want to use the appropriate data types for the columns (ex. don't stores dates in the format that the report shows them, actually use the DateTime data type).

I hope this helps.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
sqlcasey,

You are right - I missed your point. I thought that you were having issues on the server (locks, etc) that were slowing down the access of data. That's why I posted my questions...if it has been a matter of accessing the data and having lots of locks, etc...then you might only get a real benefit from moving the aggregated data off the server.

-SQLBill

Posting advice: FAQ481-4875
 
Hi,

Thanks for the posts. George, we have implemented your solution exactly. I guess my question is, what would be the next step, so to speak, using SQL Server solutions, such as OLAP and Analysis Services? That is, when do you know when to use these tools?

Thanks
 
Uh... I don't bother with any of that.

Just re-write the reports to use the new tables. In many cases the reports will be a lot simpler because the aggregation of data is already done.



-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
No, we're not going to any advanced stuff for now. It was more just an informational question, mostly for myself, because I seriously doubt we would ever use this here.

:)
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top