I need some help thinking this out...
I have a staging table that gets loaded with data from a file using ssis. The file is a rolling history file, in that every day's current file not only contains that day's data, but also data for the previous days in the month. So for example, if today is 5/3, the data might look like:
2008-05-01, col1, col2, col3, col5
2008-05-01, col1, col2, col3, col5
2008-05-02, col1, col2, col3, col5
2008-05-03, col1, col2, col3, col5
2008-05-03, col1, col2, col3, col5
2008-05-03, col1, col2, col3, col5
Next, I need to load the current day's data (5/3) from the staging table into a production table... however, I only want data from 5/3 because the data from 5/1 and 5/2 are already in the production table.
There is also a possibility that on 5/3, the powers that be may have added or corrected rows from the previous days and added to the file for 5/3... so it isn't enough to just add rows with the 5/3 date on them, because there might be a new or corrected row for 5/2, say.
My idea is to do some sort of join that checks the complete row value (i.e. all the columns) in the staging table against the rows in the production table, so that IF there is a new or corrected row for 5/2, it will insert or update that row along with all the rows for 5/3.
Thanks much!
I have a staging table that gets loaded with data from a file using ssis. The file is a rolling history file, in that every day's current file not only contains that day's data, but also data for the previous days in the month. So for example, if today is 5/3, the data might look like:
2008-05-01, col1, col2, col3, col5
2008-05-01, col1, col2, col3, col5
2008-05-02, col1, col2, col3, col5
2008-05-03, col1, col2, col3, col5
2008-05-03, col1, col2, col3, col5
2008-05-03, col1, col2, col3, col5
Next, I need to load the current day's data (5/3) from the staging table into a production table... however, I only want data from 5/3 because the data from 5/1 and 5/2 are already in the production table.
There is also a possibility that on 5/3, the powers that be may have added or corrected rows from the previous days and added to the file for 5/3... so it isn't enough to just add rows with the 5/3 date on them, because there might be a new or corrected row for 5/2, say.
My idea is to do some sort of join that checks the complete row value (i.e. all the columns) in the staging table against the rows in the production table, so that IF there is a new or corrected row for 5/2, it will insert or update that row along with all the rows for 5/3.
Thanks much!