I have a DW design question that I would like to float out to all you experts out there with the (rather selfish sorry) intention to help me come up with a suitable design with which i am struggling with.
I am required to build a warehouse, and appropriate cubes for mining, for my customer.
The operational database can be described as follows (simplified for first post):-
Product
Customer
Sale
SaleStatus
Now a customer can purchase a product. The purchase process can span a period of time (days/weeks). In the process the sale moves through a number of statuses (submitted, verified, approved, declined, sold etc). The sale can be move forward and backwards though these statuses which means the database holds multiple same statuses per sale. Status changes can happen multiple times in a single day.
My customer needs to report on the sale pipeline - which sales went through which statuses.
Product, Customer, Sale tables all can change daily and have elements that will need to track history.
All status changes are date time stamped and stored in the SaleStatus table.
My initial thoughts are a star design with the central fact driven by the events that occur within SaleStatus table.
My main concern is double counting and historic slicing.
To remove double counting I can set a flag that depicts the current sales status during the ETL and i have also thought about flagging the first and last occurrence of each status per sale.
But this does not help me when slicing by a date - what is the current status of a sale today is not what the current status was last month. Hmmm...
Hopefully that is enough info to drive a discussion but please ask if you need more.
Any help with this would find an extremely grateful person on the receiving end...
I am required to build a warehouse, and appropriate cubes for mining, for my customer.
The operational database can be described as follows (simplified for first post):-
Product
Customer
Sale
SaleStatus
Now a customer can purchase a product. The purchase process can span a period of time (days/weeks). In the process the sale moves through a number of statuses (submitted, verified, approved, declined, sold etc). The sale can be move forward and backwards though these statuses which means the database holds multiple same statuses per sale. Status changes can happen multiple times in a single day.
My customer needs to report on the sale pipeline - which sales went through which statuses.
Product, Customer, Sale tables all can change daily and have elements that will need to track history.
All status changes are date time stamped and stored in the SaleStatus table.
My initial thoughts are a star design with the central fact driven by the events that occur within SaleStatus table.
My main concern is double counting and historic slicing.
To remove double counting I can set a flag that depicts the current sales status during the ETL and i have also thought about flagging the first and last occurrence of each status per sale.
But this does not help me when slicing by a date - what is the current status of a sale today is not what the current status was last month. Hmmm...
Hopefully that is enough info to drive a discussion but please ask if you need more.
Any help with this would find an extremely grateful person on the receiving end...