Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations IamaSherpa on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Updates much slower than pure inserts

Status
Not open for further replies.

brindelin

Programmer
May 20, 2005
1
US
Hello there,

When our ETL program was developed no create/update timestamps existed on the data files, meaning that we had to pull the whole file over with every load(3 times a day).

Now that we have implemented the timestamps I tried switching the datastage insert jobs to insert new rows / update exisiting.

While I saw a 25% improvement in the extract and transform stages(from working with less records) I saw a 400% increase in the insert/update stage.

Anyone have a similiar problem? I have several ideas to fix it, primarily deleting the records to be updated first then just doing a insert like old but I'm having problems getting that work.

I was wondering if anyone else has went through this problem and found an adequate solution.

Thank you,

 
I guess I can see a number of potential reasons for this increase in processing time to write to your target. Much of it likely has to do with how you were doing your inserts before and how you're doing your insert/updates now. It also has a lot to do with the design of the target table(s).

I'm not going to sit here and tell you did something right or wrong because I don't know all of the details of your situation. However, I will suggest some things to consider...

- Were you originally doing all of your inserts in a bulk mode? This would have been very fast, but may not have included any time to rebuild any indexes.
- What size batches were you using before and what size are you using now? Commit levels can certainly impact performance.
- What kinds of indexes are defined on your target table(s)? The update may not be ultizing it appropriately. The statistics on the table may not be up-to-date.

You also may be in a unique situation where you can "blow away" your target and rebuild every time you move data. So maybe you should stick with the pure insert approach. Most people don't have this luxury since they need to keep some additional meta-data or history in their target that can't be re-created.

Hope this helps...
 
You have remarked that about 400% increased in the Insert/Update stage? An can be more, because your process has increased by the Check, conversion and validation steps to rebuild the Timestamp based on the Target/Destination table type or if the column is defined in the table as an Index or Primary key and need Conversion, Check, rebuild index and validate. However, Write or Rewrite statements take CPU time more than the Read statement.


 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top