
Updating a huge table


ChandruN (Programmer), Jun 17, 2003
I need to update a huge table.

Total number of records in the table: 1.5 billion
Total size occupied by the table: 470 GB
Number of rows to be updated: 0.5 billion

Can anyone suggest a good plan (a faster way to update)?

Thanks in advance,
Chandru
 
I just recovered from aborting an update on a 300 million row table. I had 1 million update records. The update ran for 20 hours and I foolishly aborted the process. The rollback took 66 hours. That was 3 days without a major table.

My recommendation: AVOID UPDATES. Use an insert/select into an empty temp table, drop the original table, and rename the temp table.

If this process does not meet your needs, make sure you DROP ALL secondary indexes before the update.
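
For illustration, a minimal sketch of that approach in Teradata SQL (mydb.big_table and mydb.big_table_new are made-up names; if your release doesn't support CREATE TABLE ... AS, build the empty copy from the SHOW TABLE DDL instead):

    -- Build an empty copy with the same definition and primary index
    CREATE TABLE mydb.big_table_new AS mydb.big_table WITH NO DATA;

    -- Copy all rows (apply the changes in the SELECT list as needed)
    INSERT INTO mydb.big_table_new
    SELECT * FROM mydb.big_table;

    -- Fast switch: drop the original and rename the copy
    DROP TABLE mydb.big_table;
    RENAME TABLE mydb.big_table_new TO mydb.big_table;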
 
Sorry, I did not notice that you have to update almost 30% of the rows in the table. I agree that using an insert/select into a temp table is definitely the best option.
 
ChandruN:

- Update
pro: easy to submit ;-)
contra: needs space for the transient journal and a loooong time to rollback in case of failure

- Insert/Select using CASE to modify the data on the fly (see the sketch after this list)
pro: probably the fastest way, and rollback in case of failure is immediate (just drop the copy). Only a read lock on the source table during the copy, then a fast switch to the new version using drop old_table; rename new_table to old_table.
contra: you need the perm space to store the copy of the table

- export all rows including the modifications and use FastLoad (script outline below)
pro: no rollback. Read lock/fast switch similar to Insert/Select. If there's not enough perm space to hold the copy, you can delete the table before fastloading.
contra: you have to export all the data (though you can avoid landing it on disk by using export -> named pipe -> FastLoad). If there's no perm space for a copy, the table is not accessible during the load.

- export the modified data plus the PK and use MLoad
pro: less data to export (compared to FastLoad)
contra: perm space is needed for work tables, and the table is only accessible with an Access lock during the load.
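
For the Insert/Select option, a minimal sketch (table and column names are placeholders, and the CASE expression stands in for whatever modification you actually need):

    INSERT INTO mydb.big_table_new
    SELECT pk_col,
           CASE
             WHEN status_cd = 'OLD' THEN 'NEW'   -- the on-the-fly modification
             ELSE status_cd
           END,
           other_col
    FROM mydb.big_table;

    -- then the fast switch:
    DROP TABLE mydb.big_table;
    RENAME TABLE mydb.big_table_new TO mydb.big_table;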

Everything is better than updating 30% of rows ;-)
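
For the FastLoad option, a rough outline of a script (logon string, delimiter, file name and columns are all placeholders; check the FastLoad reference for your release):

    LOGON tdpid/user,password;

    BEGIN LOADING mydb.big_table_new
       ERRORFILES mydb.bt_err1, mydb.bt_err2;

    SET RECORD VARTEXT "|";            -- delimited export file

    DEFINE pk_col    (VARCHAR(18)),
           status_cd (VARCHAR(3))
    FILE = export.dat;                 -- or a named pipe, to avoid landing the file

    INSERT INTO mydb.big_table_new
    VALUES (:pk_col, :status_cd);

    END LOADING;
    LOGOFF;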

BillDHS:
20 hours for 1 million updates is extremely slow. Did you modify the primary index, and/or did you have secondary indexes/foreign keys/join indexes and modify those columns?

In V2R5 updates are *much* faster now, because there's an update in place instead of delete/merge.
In your case (< 1%) this might be faster than insert/select now, but even if it is, there's still the possible rollback :-(
Rollback is probably also faster now, but I never tested it.

Dieter
 
Dieter:
I assumed that when changing roughly 0.3% of the rows I could update the table. BIG mistake! I had 12 NUSIs, some extremely non-unique. We are on V2R3, so TD still does delete/insert.

bill
 
The maintenance cost for a NUSI with lots of rows per value is higher than for one with just a few rows per value.

But which columns did you update?
If you modify a NUSI column, there's maintenance only for that NUSI.
But if you modify the PI, this results in a new RowHash, and the RowHash, as part of the RowID, is used within every secondary index -> all NUSIs must be maintained.
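
You can see that effect directly with the HASHROW function (the literals are just an example): a new PI value means a new row hash, and with it a new RowID that every NUSI entry pointing at the row must be rewritten for.

    -- same input -> same hash; a changed PI value -> a different hash
    SELECT HASHROW(1000) AS old_hash,
           HASHROW(1001) AS new_hash;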

Dieter
 
The attribute I was updating was neither a PI nor an SI column. However, when the rows were removed (deleted from the table), all the indexes had to be updated, and when the TJ rolled back, it was inserting rows that had to maintain all 13 indexes.

With V2R3 I don't believe it mattered whether I was updating an index column; the maintenance happened during the delete/insert. I think your rationale only applies if the rows are updated in place.
 