SQL 2005
Managment Studio
Hi All,
I have a 2 tables as such:
Table1
uniqueID,CC,MakeCode,ModelCode,Body,Date,MatchID,CompleteCode (First 6 columns have data, last 2 are null)
Records in Table1, around 45million
Table2
CC,MakeCode,ModelCode,Body,Date,CompleteCode (All columns have data)
Records in Table2, around 3million
What I need to do:
1.)Link by the first 5 fields between the tables, update Table1 - MatchID (with a int - this integer I assign,1 = matched on all 5 fileds), CompleteCode from Table2
2.) Continue doing the above, but each time removing one of the fields, i.e. linking on 4 then 3 etc (and the matchId is 2 then 3 etc), but ignoring the records in Table1 that have a MatchID populated (so only do nulls).
I have done this in my novice SQL way as so;
I have indexed all 5 columns on each table to hopefully speed up the query.
This works, but one of the problems is that some updates have 1.6 million records, so the update query takes rather a long time.
With my limited experience of sql knowledge, "my" logic has led me to this path, but am wondering if there is a much more efficient way of doing this? Even if it means rather inserting the results into a new table, which I have tried and works much quicker, but I am stuck at how do I ignore the ones from table1 that have already been assigned a MatchID?
This is a once off for cleansing.
As usual, I appreciate any info of assistance.
Michael
Managment Studio
Hi All,
I have a 2 tables as such:
Table1
uniqueID,CC,MakeCode,ModelCode,Body,Date,MatchID,CompleteCode (First 6 columns have data, last 2 are null)
Records in Table1, around 45million
Table2
CC,MakeCode,ModelCode,Body,Date,CompleteCode (All columns have data)
Records in Table2, around 3million
What I need to do:
1.)Link by the first 5 fields between the tables, update Table1 - MatchID (with a int - this integer I assign,1 = matched on all 5 fileds), CompleteCode from Table2
2.) Continue doing the above, but each time removing one of the fields, i.e. linking on 4 then 3 etc (and the matchId is 2 then 3 etc), but ignoring the records in Table1 that have a MatchID populated (so only do nulls).
I have done this in my novice SQL way as so;
Code:
Update table1
set Table1.MatchID = 1, Table1.CompleteCode = table2.CompleteCode
FROM Table1 INNER JOIN
table2 ON Table1.CC = table2.CC AND Table1.MakeCode = table2.MakeCode AND Table1.ModelCode = table2.ModelCode AND
Table1.Body = table2.Body AND Table1.Date = table2.Date
where table1.matchid is null
I have indexed all 5 columns on each table to hopefully speed up the query.
This works, but one of the problems is that some updates have 1.6 million records, so the update query takes rather a long time.
With my limited experience of sql knowledge, "my" logic has led me to this path, but am wondering if there is a much more efficient way of doing this? Even if it means rather inserting the results into a new table, which I have tried and works much quicker, but I am stuck at how do I ignore the ones from table1 that have already been assigned a MatchID?
This is a once off for cleansing.
As usual, I appreciate any info of assistance.
Michael