We have a star schema data warehouse that has an order fact that we update nightly. We use the natural key from the source (Order Number) to locate the fact row and apply changes. The source system is preparing to purge old orders so they can reuse order numbers. They will maintain a table (PurgeTable) that has the order number and the date it was purged. Once an order is purged, no updates can be done to that particular order. However, a new order can be created with that order number.
Our plan is to add a column to our fact table the holds purge date. The ETL process will look at only those fact rows that have a null in the purge date when it processes updates and inserts from the order table. We will also run a periodic job that reads the PurgeTable and marks orders as having been purged by setting the purge date in the fact to the value found in the source table.
My question is does this follow know standards? It seems pretty straight forward to me but I would appreciate input from you guys. Also, can either of you point me in the direction of a documented standard or best practice for dealing with retaining purged data in a data warehouse?
Thanks,
Thom
Our plan is to add a column to our fact table the holds purge date. The ETL process will look at only those fact rows that have a null in the purge date when it processes updates and inserts from the order table. We will also run a periodic job that reads the PurgeTable and marks orders as having been purged by setting the purge date in the fact to the value found in the source table.
My question is does this follow know standards? It seems pretty straight forward to me but I would appreciate input from you guys. Also, can either of you point me in the direction of a documented standard or best practice for dealing with retaining purged data in a data warehouse?
Thanks,
Thom