JBroadwater
Programmer
I initially understood Load (the L part of ETL -- Extract, Transform, Load) to mean copying prepared (complete, not partial) fact+dimension tables from a stage area to a presentation area.
However, the concept of an incremental Load, appears to conflict with this understanding. I.e. you can incrementally update fact/dim tables in your staging area, but how can you then refresh your presentation area incrementally? A load of the presentation area implemented as a swap out of the most recent partition of a fact table is one way to resolve this apparent conflict, but frankly, I don't see any way other than swapping a partition (swapping an entire table isn't really incremental is it?). Maybe incrementally updating a stage area fact table, then using an RDBMS/Oracle replication facility to apply the same incremental update to the corresponding presentation area fact table?
I.e. replacing some rows in a fact or dimension table already sitting on a presentation server would seem to violate the principle of stage / presentation area separation. Adding or changing rows in a fact or dimension table sure sounds like a purely staging area activity to me.
So, to be clear, I am incrementally updating dim/fact tables in my staging area, but then I'm copying the entire tables from stage to presentation areas. And I'm wondering if other people have a different understanding or can suggest a better approach.
However, the concept of an incremental Load, appears to conflict with this understanding. I.e. you can incrementally update fact/dim tables in your staging area, but how can you then refresh your presentation area incrementally? A load of the presentation area implemented as a swap out of the most recent partition of a fact table is one way to resolve this apparent conflict, but frankly, I don't see any way other than swapping a partition (swapping an entire table isn't really incremental is it?). Maybe incrementally updating a stage area fact table, then using an RDBMS/Oracle replication facility to apply the same incremental update to the corresponding presentation area fact table?
I.e. replacing some rows in a fact or dimension table already sitting on a presentation server would seem to violate the principle of stage / presentation area separation. Adding or changing rows in a fact or dimension table sure sounds like a purely staging area activity to me.
So, to be clear, I am incrementally updating dim/fact tables in my staging area, but then I'm copying the entire tables from stage to presentation areas. And I'm wondering if other people have a different understanding or can suggest a better approach.