In my company we are looking for the best strategy for our initial load to the datawarehouse. Will appreciate if people can share different strategies that they have used to accomplish this goal. We are moving data worth 3 years and we have something like this..
Tab1 - 21 Million
Tab2 - 28 Million
Tab3 - 122 Million
Tab4 - 65 Million
Tab5 - 65 Million
Tab 1 ( one to many ) Tab 2
Tab 2 ( One to Many ) Tab 3
Tab 3 ( many to one ) Tab 4
Tab 3 ( Many to one ) Tab 5
Main Target table is at the Tab 3 level. There are other targets fed from other tables but this target is the most difficult to load as it needs some columns from all the 5 tables.
What would be the best strategy for one time load for this target. I tried joining these 5 tables at the source database and then put a date condition on Tab 2 (where create date is between d1 and d2) types, to create smaller loads.However even one month data is taking too long. What would you recommend ?
Is sorting and merging in files is better alternative or moving the source tables to another sataging database and then having a large temp segment to merge and sort to achieve the join ?
Tab1 - 21 Million
Tab2 - 28 Million
Tab3 - 122 Million
Tab4 - 65 Million
Tab5 - 65 Million
Tab 1 ( one to many ) Tab 2
Tab 2 ( One to Many ) Tab 3
Tab 3 ( many to one ) Tab 4
Tab 3 ( Many to one ) Tab 5
Main Target table is at the Tab 3 level. There are other targets fed from other tables but this target is the most difficult to load as it needs some columns from all the 5 tables.
What would be the best strategy for one time load for this target. I tried joining these 5 tables at the source database and then put a date condition on Tab 2 (where create date is between d1 and d2) types, to create smaller loads.However even one month data is taking too long. What would you recommend ?
Is sorting and merging in files is better alternative or moving the source tables to another sataging database and then having a large temp segment to merge and sort to achieve the join ?