Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Incremental Load (No primary key) 2

Status
Not open for further replies.

BTrees

IS-IT--Management
Aug 12, 2006
45
CA
Hello Everyone,

I have a scenario where I will have to load my table incrementally using a sql query. CDC not an option and no primary key in both source query and target table. Previously I was doing truncate table and simply reload using stored procedure. I want to reduce time and resources. I used MERGE statement and use combination of two filed in join (like T.c1 = S.c1 and T.c2 = s.C2) where C1 is binary data type and c2 is varchar. It didn't reduce my time. Any suggestion to achieve this?? Thanks
 
Is there a really good reason why you don't have a primary key on either the source or the destination?

Seems like a longshot but if you can't have a key in the source or the destination for some reason, it might be helpful to add them with SQL (temporarily), load up the destination, and then drop them.
 
What is the source? A flat CSV file?
What size is the data and the timing now?
And what percentage of data changes?
What is done? Inserts/Updates/Deletes? All?

If the whole source data in CSV is what you could read in a truncate/bulk load, I don't see a way to accelerate this, because you need to read through the CSV once fully anyway. Even if you could have a way to compare every CSV row to see whether it doesn't need to be read in, you need to read it to make that decision.

Time savers depend on being able to decide easily what data to not read in by just partial data, at best a datetime stamp or row version. And yes, a primary key is quite mandatory, even if its spanning two columns.

You can compare based on a checksum, but computing a checksum for all the new data also means reading through all of it, if it's not already part of the CSV file and you could decide to first just read in the two primary key related columns and that checksum and see if the target data has the same checksum at that key or it differs and you want that whole row.

Bye, Olaf.

Olaf Doschke Software Engineering
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top