Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

The best way to filter fact table data?

Status
Not open for further replies.

mrees

Programmer
Feb 20, 2003
197
GB
We are devloping an OLAP reporting solution with SQK2k, Analysis Services and Crystal Analysis.

Each day our fact table will be updated for each customer account with the previous days figures. Obviously when creating a cube directly from this fact table the default aggregations will be the sum of all days for all accounts. This would be misleading for our users, as generally they only want to see the previous days posistion.

Can I have some suggestions as to the best way to do this, some ideas we've had are:
1. Create a view of the fact table, returning only the previous days posistion, and build the cube from that.

2. Create a "yesterday" dimension and add that to the cube design.

3. Restrict the data in Analysis Services (MDX Query?)

4. Something in Crystal Analysis (The wrong forum, I know)

Thanks in advance

mrees
 
You said "generally they only want to see the previous days" but that implies that sometimes they want more than the previous day.

I recommend the view approach, or some other way to incrementally update the cube on a daily basis.

Also, if you put each day in a separate partition, performance for anyone querying only "yesterday" will be enhanced.

Sometimes the grass is greener on the other side because there is more manure there - original.
 
Yes, most of the time they will only want the previous days stuff. But I think that other times they may want to compare days in the previous week/month/quarter, or each month end date.

We want to give our users as much flexibility as possible, but we still want to control what they can do. I can imagine some getting confused as to where the aggregated data comes from. So if I want to provide a "daily analysis" I only want them to have yesterdays data. If I want them to have a "historical analysis" then I will provide the days necessary. I appreciate that this is removing a lot of the flexibility from cubes, but after my initial investigation into what the business want, is that this flexibility will be provided by the many dimensions that are being asked for.

Thanks for recommending the view, as this was my favourite idea.

Thanks

mrees
 
You might even want to consider two cubes, one for yesterday and one for yesterday and more.

Sometimes the grass is greener on the other side because there is more manure there - original.
 
IF users will only want to see yesterday and older data then you may want to consider a combination of approaches. First would be the use of a view to build your cube(s) from. The second would be to use cube partitions that have a data slice applied. Since users are intrested in timne based data that would be a logical choice for partition slicing. Now the difficult part is to determin at what level in time you want to partition. A couple things to consider here are 1) Data volume 2) History built into the cube. Data volume will have an impact on build time and query time. Unless your building millions of records a day a monthly partition would be good. History in the cube will have an impact when it comes to managability and builds. y history in the cube i am reffering to how many days, weeks, months or years you will have stored. If your going to have 2 or 3 complete years of data it doesn't make sense to partition at the day level or even the week level as manageability will be enourmous and when monthly or yearly analysis are performed the number of partitions hit will be numerous. My experience has been that typical analysis takes place on a daily, monthly or Quarterly basis and sometimes yearly to show an over all comparison. Again I would go with monthly.

Using the views you could dynamically create your views based upon month where the current months view is filtered between the first of the month and getdate - 1, at the end of the month you just change the filter to be >= First of the month < First of the next month. To optimize cube builds you could do incremental processing and then at the end of the month when the view filter changes do a full partition process. Incremental updates to cubes can cause fragmentation of the data in the partition so a full process helps to optimize the partition.

Using DSO and VB you can automate the building and processing of your partitions.

&quot;Shoot Me! Shoot Me NOW!!!&quot;
- Daffy Duck
 
oh also depending on your reporting tool you could create a calculated member in your time dimension that is yesterday as well.

&quot;Shoot Me! Shoot Me NOW!!!&quot;
- Daffy Duck
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top