A bit of a tricky design issue that I hope someone can throw some light on
We have dimensions with multiple hierarchies. All hierarchies are set up as level 1 - level 15. In the source data the different hierarchies for a single dimension are held in one table. The hierarchies are differentiated by the use of a 'structure code'. For example in the product dimension the columns in the source table are:
product_code,
structure_code,
level_01_code, level_01_desc,
level_02_code, level_02_desc,
...
level_15_code, level_15_desc
Hierarchies change over time and there is a requirement to track this change. The hierarchies are ragged and do not run from level 01 to level 15. For example the immediate parent of a product 'A' may be in level 8 whereas a product 'B' may have its immediate parent in level 6. In a different hierarchy product 'A' may have an immediate parent in level 7. Having said that I don't think that this would influence the design - but I could be wrong.
We also have dimensional attributes for product that also require change tracking over time such as 'Product Type'
product_code,
product_type
...
The changes in the hierarchies occur independently of the changes in the dimensional attributes.
The problem is how should we structure the database to join a fact table that includes the product dimension to the dimension and hierarchies given the many-to-many relationships between the various tables. We need to have SCD2 type tracking on both the hierarchies and the dimension.
We have come up with three different approaches to this and as many arguments as to the direction we should go in. I have my own preferred solution but I'd be interested to hear what the 'collective brain' thinks the best approach should be
Any thoughts?
Cheers, Tom
We have dimensions with multiple hierarchies. All hierarchies are set up as level 1 - level 15. In the source data the different hierarchies for a single dimension are held in one table. The hierarchies are differentiated by the use of a 'structure code'. For example in the product dimension the columns in the source table are:
product_code,
structure_code,
level_01_code, level_01_desc,
level_02_code, level_02_desc,
...
level_15_code, level_15_desc
Hierarchies change over time and there is a requirement to track this change. The hierarchies are ragged and do not run from level 01 to level 15. For example the immediate parent of a product 'A' may be in level 8 whereas a product 'B' may have its immediate parent in level 6. In a different hierarchy product 'A' may have an immediate parent in level 7. Having said that I don't think that this would influence the design - but I could be wrong.
We also have dimensional attributes for product that also require change tracking over time such as 'Product Type'
product_code,
product_type
...
The changes in the hierarchies occur independently of the changes in the dimensional attributes.
The problem is how should we structure the database to join a fact table that includes the product dimension to the dimension and hierarchies given the many-to-many relationships between the various tables. We need to have SCD2 type tracking on both the hierarchies and the dimension.
We have come up with three different approaches to this and as many arguments as to the direction we should go in. I have my own preferred solution but I'd be interested to hear what the 'collective brain' thinks the best approach should be
Any thoughts?
Cheers, Tom