
Performance issue in data warehouse


jisoo23 (Programmer) · Jan 27, 2004
I have a peculiar problem going on, hope someone can help with this....

I'm working with Informatica PowerCenter 7.1 with mainframe and AIX DB2 databases. Here's the problem: I had several mappings that formerly used mainframe tables as sources. I changed things so that those mainframe tables are now loaded into stage tables, and the stage tables have been plugged into the mappings as the new sources. I would think this would make the mappings run faster, but it doesn't. Gathering the source rows is not the problem; that goes very quickly. It's the updates and inserts into the target tables (also in the data warehouse) that are very, very sluggish: we're talking 10 rows per second or less. Previously, when these mappings ran against the mainframe sources, they ran at around 1,000 rows per second.

I've noticed that the problematic mappings all have lookups on the target table to decide whether rows should be updated or inserted. I tried disabling the lookup cache and/or increasing the DTM cache size with no improvement. I also changed the lookups from target lookups to source lookups; no dice either. I would expect this behavior if the lookups were querying huge tables of 500,000 rows or so, but the same problem happens on tables of just 4,000 rows or less. These mappings were also run at times when no other jobs were running. Has anyone run into this sort of issue?
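To be clear, conceptually the per-row lookup-then-write pattern boils down to something like the sketch below. This is not Informatica internals, just an illustration; sqlite3 stands in for the DB2 target, and the table and column names are made up:

```python
# Minimal sketch of a per-row "look up the key, then update or insert" pattern.
# sqlite3 is only a stand-in for the DB2 target; names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (cust_id INTEGER PRIMARY KEY, amount REAL)")

def write_row(conn, cust_id, amount):
    # One round trip to look the key up in the target ...
    exists = conn.execute(
        "SELECT 1 FROM target WHERE cust_id = ?", (cust_id,)
    ).fetchone()
    # ... then a second statement to update or insert. Two statements per
    # source row is what makes this pattern slow when the lookup isn't cached.
    if exists:
        conn.execute("UPDATE target SET amount = ? WHERE cust_id = ?", (amount, cust_id))
    else:
        conn.execute("INSERT INTO target (cust_id, amount) VALUES (?, ?)", (cust_id, amount))

for row in [(1, 10.0), (2, 20.0), (1, 15.0)]:
    write_row(conn, *row)
conn.commit()
print(conn.execute("SELECT * FROM target ORDER BY cust_id").fetchall())
# [(1, 15.0), (2, 20.0)]
```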

Thanks in advance,
jisoo23
 
Not really...
But the data-driven scenario with lookups to decide on insert/update may very well be a LOT slower than treating all rows as updates and setting the target properties to update-else-insert. Did you test the performance of that strategy?

Ties Blom
Information analyst
 
I haven't tried that strategy yet...how is that configured in the target properties? Are you talking about the "update as insert" and "update as update" checkboxes? If so, does that mean that if the row does not exist in the target table, it will automatically switch to an insert? I appreciate the ideas.

Thanks,
Jisoo23
 
Indeed it does. If the key exists, an update follows; otherwise an insert is performed. Of course, 'treat source rows as' should be set to update instead of insert.

The difference with the data-driven approach is that, prior to loading a record, the writer does not KNOW beforehand which action to perform, so it will be significantly slower than simple inserts.
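Roughly, the writer-side logic of update-else-insert works out to something like this. It is only a sketch of the semantics, not the actual PowerCenter writer; sqlite3 stands in for DB2 and the names are invented:

```python
# Minimal sketch of "update else insert": every row is treated as an update,
# and only when the key is missing does it fall back to an insert.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (cust_id INTEGER PRIMARY KEY, amount REAL)")

def update_else_insert(conn, cust_id, amount):
    cur = conn.execute(
        "UPDATE target SET amount = ? WHERE cust_id = ?", (amount, cust_id)
    )
    if cur.rowcount == 0:  # key not present -> fall back to an insert
        conn.execute(
            "INSERT INTO target (cust_id, amount) VALUES (?, ?)", (cust_id, amount)
        )

for row in [(1, 10.0), (1, 12.5), (2, 20.0)]:
    update_else_insert(conn, *row)
conn.commit()
print(conn.execute("SELECT * FROM target ORDER BY cust_id").fetchall())
# [(1, 12.5), (2, 20.0)]
```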

There is even an option to delete records if the key is not present in the load, but that option did not work properly in versions 5 and lower, at least.

Actually, the data-driven approach allows more manipulation when required, but most often what you really want is just an update-else-insert mechanism...

Ties Blom
Information analyst
 
Well, I finally figured out the problem. I did a test run using the UPDATE ELSE INSERT option with no change in performance (still 10 rows/sec or less). So I enabled "Collect Performance Data" under Performance in the Properties tab of the session. This let me see all the rows that went in and out of each transformation object in the Workflow Monitor. It showed that all rows went through all the way past the update transformation and then stopped abruptly at the target. I then changed my target in the mapping to a flat file for testing purposes. What took the old mapping 15 minutes (loading 4,700 rows) took only 32 seconds to a flat file. So it's the target table that's the problem; I've forwarded the issue to my data modeler...hey, I just load the freakin' table, I don't design it =D
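For the record, those numbers work out to roughly 4,700 rows / 900 s ≈ 5 rows/sec against the relational target versus 4,700 rows / 32 s ≈ 147 rows/sec to the flat file, about a 28x difference, which points squarely at the target table rather than the mapping logic.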

Thanks for the help blom0344.
 
Okay, but please respond again once you've found the actual cause from the data modeler. It is still interesting why the upload is so slow!

Ties Blom
Information analyst
 
Apparently the target table has a bunch of triggers associated with it. Whenever I toss a row in there, triggers execute that populate other tables, such as a history table, etc. All of that caused a lot of overhead and slowed down my session/workflow big time. The design was a nice idea but ultimately convoluted.
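For illustration, this is the kind of thing such a trigger does behind each load: every insert into the target silently fires extra DML against a history table. It is only a sketch, not the actual DB2 DDL; sqlite3 stands in for the warehouse and the names are made up:

```python
# Rough illustration of per-row trigger overhead: each insert into the target
# also writes a history row via the trigger.
import sqlite3, time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target      (cust_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE target_hist (cust_id INTEGER, amount REAL, changed_at TEXT);
    CREATE TRIGGER trg_target_hist AFTER INSERT ON target
    BEGIN
        INSERT INTO target_hist VALUES (NEW.cust_id, NEW.amount, datetime('now'));
    END;
""")

start = time.perf_counter()
conn.executemany(
    "INSERT INTO target (cust_id, amount) VALUES (?, ?)",
    [(i, float(i)) for i in range(4700)],
)
conn.commit()
hist_rows = conn.execute("SELECT COUNT(*) FROM target_hist").fetchone()[0]
print(f"loaded 4700 rows in {time.perf_counter() - start:.2f}s; "
      f"history rows written by the trigger: {hist_rows}")
```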
 
I see. Well, with all the possibilities offered by workflows, you should be able to build something 100% in INFA and avoid having to use triggers...

Ties Blom
Information analyst
 