Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Dimension hierarchies joining to fact

Status
Not open for further replies.

Chorlton

IS-IT--Management
Aug 4, 2005
2
GB
Currently the DW has standard dimensions, for customer and product these are Type 1 SCD.

Mini dimensions have been created for changing hierarchy data on these dimensions (Type 2 SCD).

E.g. For customer the mini dimension is:

Cust_bus_hier_sk, cust_sk, <Customer Acc No, Cust Group, Business Type, Business Sector Type> --these are the levels in the hierarchy, Current_Y_null, from_date, to_date.

Whenever a customer has a level changed this creates a new sk and new row in the table.

The hier_sk is on the fact table so you can query by cust_group without needing to join to the customer dimension.

To perform like-for-like querying you join via the customer dimension and only select the current hier record.

The problem is this mini dimension grows quite fast and as at any level the parent can change, this means in Oracle we cannot implement a hierarchical relationship on the dimension as it breaks the 1 child : 1 parent relationship.

The query performance is also poor due to the volume involved (5m fact rows/month) joining to 100K in the mini dimension.

I have modelled dropping the lowest level out of the hierarchy (Cust Acc No) and only maintaining the distinct combinations of the other levels with an SK.

So I now have:

dim_customer: cust_sk, ..., current_hier_sk
dim_cust_bridge: cust_sk, hier_sk, from_date, to_date
dim_cust_hier: hier_sk, Cust Group, Business Type, Business Sector Type

Can still can join hierachy to sales without joining to customer dimension.
Can use current hierarchy against historic sales by joining fact to customer then customer to hierarchy on current_sk

By maintaining only the unique combinations of the hierarchy, I have trimmed the table from 100k rows down to 1,100 rows.

This has meant when creating materialised views these are incredibly small and therefore fast.

Can anyone tell me if this is a good or bad thing to do or can forsee problems. I'm doing this purely to resolve performance issues.

I know kimball says all dimensions should be conformed and obviously by removing the customer code, product code etc from these hierarchy tables breaks this rule.

Any advise either way would be much appreciated before I look to make these changes for real.
 
Data Warehousing is almost all about performance. We trade off-hours processing (night/weekend) for increased performance during business hours. If you are getting performance gains, you are probably doing good things.

Yes, DW is also about making the data easier for the user to get at and understand.

I assume you have looked at and understand the Oracle CONNECT BY extension to SQL. You might also consider reading Celko's Trees and Hierarchies in SQL for Smarties or doing a search on that subject in online sites for Intelligent Enterprise and DM Review. Note: the Celko book is *not* light reading.

-------------------------
The reasonable man adapts himself to the world. The unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man. - George Bernard Shaw
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top