Hey everyone,
I have a table (TABLE A) with close to 9 million records. I am trying to add a new column to this table and then populate it with values from another table (TABLE B).
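For context, the new column itself is added with a plain ALTER TABLE; the name and type here are placeholders for the real ones:
Code:
-- "new_column int" stands in for the real name/type; adding the column
-- as NULLable with no default keeps the ALTER itself quick
alter table tableA add new_column int null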
I have tried two approaches so far:
1. Cursor approach: Table A already has a reference to a column in Table B. So I get a distinct list of values from the existing column in Table A, create a cursor, and then update the new column wherever the value in Table A matches the cursor value. Something like:
Code:
declare @table_A_column_value varchar(55),
        @new_column_value int

-- note: a cursor variable is declared as "declare @name cursor"
declare @cursor cursor
set @cursor = cursor for select distinct tableA_oldcolumn from tableA

open @cursor
fetch next from @cursor into @table_A_column_value
while (@@fetch_status = 0)
begin
    -- reset first so a missing match in Table B doesn't reuse the previous value
    set @new_column_value = null
    select @new_column_value = tableB_column
    from tableB where tableB_someColumn = @table_A_column_value

    update tableA set new_column = @new_column_value
    where tableA_oldcolumn = @table_A_column_value

    fetch next from @cursor into @table_A_column_value
end
close @cursor
deallocate @cursor
With this approach I can print debugging output for each set of data: how many rows need to be updated, how many actually were updated, and I can break/roll back if the numbers don't match (roughly the loop body sketched below). But the problem is that Table A has 300,000 distinct values in the old column, so the last time I ran this update it ran for over 11 hours!
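For reference, this is roughly what the body of that loop looks like with the row-count check in place (@expected and @actual are my own variable names):
Code:
-- @expected and @actual are declared up top with the other variables:
--     declare @expected int, @actual int
select @expected = count(*) from tableA
where tableA_oldcolumn = @table_A_column_value

begin transaction
update tableA set new_column = @new_column_value
where tableA_oldcolumn = @table_A_column_value
set @actual = @@rowcount

print cast(@actual as varchar(12)) + ' of '
    + cast(@expected as varchar(12)) + ' rows updated'

if @actual <> @expected
begin
    rollback transaction  -- undo this key's update and stop the loop
    break
end
commit transaction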
2. Update-join approach: The second thing I tried was to do an update-from-join, like this:
Code:
update tableA
set new_column = tableB.tableB_column
from tableA
inner join tableB on tableA.tableA_oldcolumn = tableB.tableB_someColumn
With this approach, all the records are updated in a single statement, so I have no control over or visibility into the updates as they happen. And it still takes 2 hours!
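While writing this up, I wondered whether a batched version of the same join would give back the per-batch control of the cursor approach. This is just a rough sketch; the 10,000 batch size is arbitrary, and it assumes new_column starts out NULL and Table B never supplies NULL values:
Code:
declare @batch_size int, @rows int
set @batch_size = 10000  -- arbitrary, would need tuning
set @rows = 1

while @rows > 0
begin
    -- update one slice of rows that haven't been populated yet
    update top (@batch_size) tableA
    set new_column = tableB.tableB_column
    from tableA
    inner join tableB on tableA.tableA_oldcolumn = tableB.tableB_someColumn
    where tableA.new_column is null  -- relies on Table B never supplying NULLs

    set @rows = @@rowcount
    print cast(@rows as varchar(12)) + ' rows updated in this batch'
end
Since each pass runs as its own small auto-committed transaction, I would hope the locking and log growth stay manageable, but I haven't actually tested this yet.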
My questions are:
1. Is there a better or more efficient way to update a table with such a huge amount of data?
2. Both of these queries hog the entire DB (the second one to such an extent that I had to kill it and restart the server!). Is there a way to handle that?
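For what it's worth, one more thing I plan to check before the next run is whether the join columns are even indexed. These names are placeholders, and the indexes may well already exist:
Code:
-- placeholder names; may duplicate existing indexes, so check first
create nonclustered index IX_tableB_someColumn
    on tableB (tableB_someColumn) include (tableB_column)

create nonclustered index IX_tableA_oldcolumn
    on tableA (tableA_oldcolumn)
My thinking is that a covering index on tableB_someColumn would turn each lookup into a seek instead of a scan, which should help the cursor version in particular.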
Any help is highly appreciated.