Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Fastest UPDATE of 1 Million Record Table

Status
Not open for further replies.

ruse7013

Programmer
Apr 10, 2003
25
US
Environment:
Oracle 8i (8.1.7)

Scenario:
Three columns are retrieved -
CURSOR cur_name IS
SELECT c1, c2, c3
FROM tab_A;

Let's say the original retrieved values are equal to:
c1=old_value1, c2=old_value2, c3=old_value3

Some processing take place and we find new values of c1, c2, and c3.

Final step is to update columns c1, c2, c3 in tab_A with the new values.

Question:
What is the fastest way of doing this? (A), (B), or (C)?

Option A:
CURSOR cur_name IS
SELECT c1, c2, c3
FROM tab_A;

FOR rec IN cur_name LOOP

... processing of c1,c2,c3
... getting new values of c1,c2,c3

UPDATE tab_A
SET c1 = new_value1, c2 = new_value2, c3 = new_value3
WHERE c1 = rec.old_value1, c2 = rec.old_value2, c3 = rec.old_value3;
END LOOP;

Note: Created nonunique index on (c1,c2,c3).

Option B:
CURSOR cur_name IS
SELECT c1, c2, c3, ROWID
FROM tab_A;

FOR rec IN cur_name LOOP

... processing of c1,c2,c3
... getting new values of c1,c2,c3

UPDATE tab_A
SET c1 = new_value1, c2 = new_value2, c3 = new_value3
WHERE ROWID = rec.ROWID
END LOOP;

Note: The best elapsed time I got was 224 seconds for updating 1,021,000 records.

Option C:
any other idea?

Thank you.
<ruse7013>


 
Hi,
Make use of Bulk collect & Update.

Regards
Himanshu
 
Hi Himanshu,

I've done it. It didn't improve the elapsed time too much. However, I had a challenge of declaring the type of the bulk collection that would keep the ROWID values. Is this correct:

TYPE typ_Rowid IS VARRAY(1000) OF VARCHAR2(50);

I tried:
TYPE typ_Rowid IS VARRAY(1000) OF ROWID;

... but it didn't work.

Any idea is appreciated.

Thanks,
<ruse7013>
 
Does that "transformation" consume a lot of resources? Is it possible to make plain update?

Code:
UPDATE tab_A
   SET c1 = f1(c1,c2,c3), c2 = f2(c1,c2,c3), c3 = f3(c1,c2,c3)

Don't forget to choose an appropriate RBS

Regards, Dima
 
Hi sem,

Please take a look at:

... processing of c1,c2,c3
... getting new values of c1,c2,c3

c1,c2, and c3 are interdependent. A function is called w/ the old values as IN params. For every cursor iteration, ALL three old values are taken into account in order to determine what the three new values are. Thus, I cannot do a separate processing (i.e. f1(), f2(), etc.) for just "c1", or just "c2", etc.

Hope this provides more info.

Thanks,
<ruse7013>
 
Hi sem,

The value conversion doesn't take too much time. So, I'll test your idea and see what happens.

BTW, what's RBS?

Thanks,
<ruse7013>
 
Rollback segment. It should be large enough to avoid expanding it during execution. Use SET TRANSACTION USE ROLLBACK SEGMENT to choose it explicitly or just turn all others small segments off-line.

Regards, Dima
 
Hi sem,

In your suggestion above:

UPDATE tab_A
SET c1 = f1(c1,c2,c3), c2 = f2(c1,c2,c3), c3 = f3(c1,c2,c3)

where is the "WHERE" clause?

I guess, there should be one. Otherwise, how do we know that the update is done for a specific group of c1,c2,c3?

In other words:
UPDATE tab_A
SET c1 = f1(old_c1, old_c2, old_c3),
c2 = f2(old_c1, old_c2, old_c3),
c3 = f3(old_c1, old_c2, old_c3)
WHERE c1 = old_c1
AND c2 = old_c2
AND c3 = old_c3;

Regards,
<ruse7013>

 
Do you need to update ALL table rows? If so, what is that WHERE clause for? That statement updates ALL rows. For each row it:

1. Retreives c1, c2, c3 vaules
2. Calculates f1(c1, c2, c3),f2(c1, c2, c3),f3(c1, c2, c3)
3. Populates c1, c2, c3 fields with values calculated above.


Is that what you need?

Regards, Dima
 
Hi sem,

Yes, that's what my challenge is.

If you take a look at my original message, it's all said there.

Thanks,
<ruse7013>
 
If so I can not understand your further doubts. If you take a look at my answer it's also all said there. BTW have you tried it?

Regards, Dima
 
Hi sem,

Thank you for your idea. I tried it. The elapsed time does not look very promising - close to 10 minutes compared to my best of 224 seconds.

I need to take another avenue. Thanks.

Regards,
<ruse7013>
 
Can not beleive, did you run your test within the same environment? Load, locks, RBS? Can you publish a listing with both tests and timing on. BTW options a and b from your starting post are equivalent only where c1, c2, c3 form unique key.

Regards, Dima
 
Hi sem,

Unfortunately, I'm under the gun to finish this task and don't have time to publish what you've asked me.

The key here is - how we update the records in tab_A - one at a time or many at a time. Your idea is great, no doubt about that! Throughout the course of this discussion, both you and HimanB have helped me a lot to think ... and I came up with the following solution:

1. Found a way to create (within 5-8 seconds) a small lookup table (w/ less than 200 rows) that "translates" the unique old to new values of c1,c2,c3.

2. Once I've than that, the rest was a piece of cake.

3. I created a cursor that loops through that lookup table and updates tab_A with bulk collections.

This approach decreased the elapsed time to 120 seconds.

Furthermore, I have tried other approaches like the one that is suggested on AskTom, i.e.

CREATE TABLE new_table as select <do the update "here"> from old_table;

Unfortunately, the latter haven't delivered good results, either.

Once again, I want to thank you for your great ideas! Moreover, I need to move on to my next performance optimization project.

Thank you.
<ruse7013>
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top