Currently the DW has standard dimensions, for customer and product these are Type 1 SCD.
Mini dimensions have been created for changing hierarchy data on these dimensions (Type 2 SCD).
E.g. For customer the mini dimension is:
Cust_bus_hier_sk, cust_sk, <Customer Acc No, Cust Group, Business Type, Business Sector Type> --these are the levels in the hierarchy, Current_Y_null, from_date, to_date.
Whenever a customer has a level changed this creates a new sk and new row in the table.
The hier_sk is on the fact table so you can query by cust_group without needing to join to the customer dimension.
To perform like-for-like querying you join via the customer dimension and only select the current hier record.
The problem is this mini dimension grows quite fast and as at any level the parent can change, this means in Oracle we cannot implement a hierarchical relationship on the dimension as it breaks the 1 child : 1 parent relationship.
The query performance is also poor due to the volume involved (5m fact rows/month) joining to 100K in the mini dimension.
I have modelled dropping the lowest level out of the hierarchy (Cust Acc No) and only maintaining the distinct combinations of the other levels with an SK.
So I now have:
dim_customer: cust_sk, ..., current_hier_sk
dim_cust_bridge: cust_sk, hier_sk, from_date, to_date
dim_cust_hier: hier_sk, Cust Group, Business Type, Business Sector Type
Can still can join hierachy to sales without joining to customer dimension.
Can use current hierarchy against historic sales by joining fact to customer then customer to hierarchy on current_sk
By maintaining only the unique combinations of the hierarchy, I have trimmed the table from 100k rows down to 1,100 rows.
This has meant when creating materialised views these are incredibly small and therefore fast.
Can anyone tell me if this is a good or bad thing to do or can forsee problems. I'm doing this purely to resolve performance issues.
I know kimball says all dimensions should be conformed and obviously by removing the customer code, product code etc from these hierarchy tables breaks this rule.
Any advise either way would be much appreciated before I look to make these changes for real.
Mini dimensions have been created for changing hierarchy data on these dimensions (Type 2 SCD).
E.g. For customer the mini dimension is:
Cust_bus_hier_sk, cust_sk, <Customer Acc No, Cust Group, Business Type, Business Sector Type> --these are the levels in the hierarchy, Current_Y_null, from_date, to_date.
Whenever a customer has a level changed this creates a new sk and new row in the table.
The hier_sk is on the fact table so you can query by cust_group without needing to join to the customer dimension.
To perform like-for-like querying you join via the customer dimension and only select the current hier record.
The problem is this mini dimension grows quite fast and as at any level the parent can change, this means in Oracle we cannot implement a hierarchical relationship on the dimension as it breaks the 1 child : 1 parent relationship.
The query performance is also poor due to the volume involved (5m fact rows/month) joining to 100K in the mini dimension.
I have modelled dropping the lowest level out of the hierarchy (Cust Acc No) and only maintaining the distinct combinations of the other levels with an SK.
So I now have:
dim_customer: cust_sk, ..., current_hier_sk
dim_cust_bridge: cust_sk, hier_sk, from_date, to_date
dim_cust_hier: hier_sk, Cust Group, Business Type, Business Sector Type
Can still can join hierachy to sales without joining to customer dimension.
Can use current hierarchy against historic sales by joining fact to customer then customer to hierarchy on current_sk
By maintaining only the unique combinations of the hierarchy, I have trimmed the table from 100k rows down to 1,100 rows.
This has meant when creating materialised views these are incredibly small and therefore fast.
Can anyone tell me if this is a good or bad thing to do or can forsee problems. I'm doing this purely to resolve performance issues.
I know kimball says all dimensions should be conformed and obviously by removing the customer code, product code etc from these hierarchy tables breaks this rule.
Any advise either way would be much appreciated before I look to make these changes for real.