Hi folks,
I am designing our ETL process with DataStage. I have some big questions in mind and I can't find answers in DataStage's documentation. I need help or pointers in the right direction. Thanks in advance.
1. Does DataStage "commit" or "rollback" the result of a load automatically to keep data consistent? If not, do I need to control it in code, by creating a routine, a stage, or something else?
2. Do I need to design jobs that can rollback the whole load in case something is wrong (e.g. the source data file is wrong)?
3. If only one job fails in a sequence, can I just re-run that particular job? What is the common practice?
4. What's the difference between directing rejected data to a file versus a table?
5. I assume I need to design extra jobs to reload the rejected data afterwards. Is that right?
6. Are there any other big issues that should be addressed when designing an ETL process?
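To make question 1 concrete, this is the kind of hand-rolled transaction control I am asking whether DataStage handles for me. A minimal sketch in Python against SQLite; the table, column names, and rows are all made up for illustration:

```python
import sqlite3

def load_rows(conn, rows):
    """Load all rows in one transaction: either every row lands, or none do."""
    cur = conn.cursor()
    try:
        for row_id, val in rows:
            if val is None:  # stand-in for a bad source record
                raise ValueError(f"bad source row for id {row_id}")
            cur.execute("INSERT INTO target (id, val) VALUES (?, ?)", (row_id, val))
        conn.commit()    # whole load succeeded: make it permanent
    except Exception:
        conn.rollback()  # any failure: undo every insert from this load
        raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id TEXT, val INTEGER)")
conn.commit()

load_rows(conn, [("a", 1), ("b", 2)])          # clean load: both rows committed
try:
    load_rows(conn, [("c", 3), ("d", None)])   # bad load: neither row is kept
except ValueError:
    pass

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # → 2
```

If DataStage's target stages already give me this all-or-nothing behavior (or a configurable commit interval), I'd rather not reinvent it in routines.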
Your insights are really appreciated.
Max