Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Selecting non-Distinct (Duplicate) Records using 2 fields

Status
Not open for further replies.

LMGroup

MIS
Apr 10, 2006
85
CA
I'm trying to select records from a table where there is no key. I'm using SQL Server 2005. Because of some data entry errors, I need to isolate all the records where they have entered StatusCode 90099 for the same JobN and then I need to change all but the last occurrence of the StatusCode to 90099 for each JobN this occurs in.

This seems to give me the records, but I can't see or change them:

Code:
SELECT     JobN, StatusCode, COUNT(StatusCode) AS StatNum
FROM         JobStatus1
WHERE     (StatusCode = 90099)
GROUP BY JobN, StatusCode
HAVING      (COUNT(StatusCode) > 1)
ORDER BY JobN

Thanks for your help!
 
then I need to change all but the last occurrence "
On what criteria would you order the records? I'm guessing that there will be a log or entry date, so MAX or TOP is going to get you the last.

soi là, soi carré
 
Sorry. There is a transaction number for each status code in every job. The status code for the highest transaction number will remain at 90099. Any other 90099s in that job (there could be many other status codes too) have to be changed to 90097.
 
OK; on the basis that Transaction number is TransactionN, I'd code something like this:
Code:
UPDATE JobStatus1
Set StatusCode = 90097
FROM JobStatus1 T1 INNER JOIN 
(Select JobN, Max(TransactionN) as MaxNo from JobStatus1 WHERE StatusCode = 90099)T2
on T1.JobN = T2.JobN and T1.TransactionNo <> T2.MaxNo
WHERE T1.StatusCode = 90099

In the words of Boris - "not tested".

Check what would be changed with a SELECT version:
Code:
Select T1.*
FROM JobStatus1 T1 INNER JOIN 
(Select JobN, Max(TransactionN) as MaxNo from JobStatus1 WHERE StatusCode = 90099)T2
on T1.JobN = T2.JobN and T1.TransactionNo <> T2.MaxNo
WHERE T1.StatusCode = 90099
Order by JobN, TransactionN

soi là, soi carré
 
I'm getting this error when I try the SELECT version. I copied the code exactly, except that I had to change TransactionNo to TransactionN.

Column 'JobStatus1.JobN' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Ambiguous column name 'JobN'.
 
D'oh - forgot the GROUP BY in the T2.
Code:
UPDATE JobStatus1
Set StatusCode = 90097
FROM JobStatus1 T1 INNER JOIN 
(Select JobN, Max(TransactionN) as MaxNo from JobStatus1 WHERE StatusCode = 90099 GROUP BY JobN)T2
on T1.JobN = T2.JobN and T1.TransactionNo <> T2.MaxNo
WHERE T1.StatusCode = 90099

In the SELECT test, last line should be ORDER BY T1.JobN, TransactionN

soi là, soi carré
 
Wow! That turned into a nightmare. The select statement showed me exactly the records I wanted to change (2362 of them). When I ran the update, it changed the entire table (562,000 plus records). I am now manually trying to restore them.

I'm trying to see the difference in my code vs yours...

Code:
UPDATE    JobStatus1
SET              StatusCode = 90097
FROM         JobStatus1 AS T1 INNER JOIN
                          (SELECT     JobN, MAX(TransactionN) AS MaxNo
                            FROM          JobStatus1 AS JobStatus1_1
                            WHERE      (StatusCode = 90099)
                            GROUP BY JobN) AS T2 ON T1.JobN = T2.JobN AND T1.TransactionN <> T2.MaxNo CROSS JOIN
                      JobStatus1
WHERE     (T1.StatusCode = 90099)
 
Looks like my compiler may have gone nuts. I didn't type JobStatus1_1. I wouldn't have even thought to name something that way. Looks like I have my answer. Still have to clean up the mess though. Thanks anyway.
 
And where did that CROSS JOIN come from? I don't even know what that's supposed to do. Sometimes I hate computers :p

How would I have set that up so I could undo or rollback the changes? Is that something I can do for everything I attempt? This is a mess.
 
How would I have set that up so I could undo or rollback the changes? Is that something I can do for everything I attempt?

With SQL Server Management Studio Tools Pack (free), it adds a BEGIN TRAN and ROLLBACK every time you open a new query window. Since you didn't have this (or do the transaction thing), you cannot easily rollback your changes. You could (potentially) restore your database to a new DB and then copy of the rows from the affected table, but that may be even messier if you had new rows between the time you did your backup and the time you ran the update query.


-George
Microsoft SQL Server MVP
My Blogs
SQLCop
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
I think your addition of "CROSS JOIN JobStatus1" is to blame for your entire table update, as such removes the matching of Job and Transaction numbers.

It's good practice, as a computer professional, to take back-ups or use ROLLBACK prior to running code on live systems.


soi là, soi carré
 
Sorry to hear about your 'goof'. Another way to make an 'emergency' copy of the data is to copy the table to a backup table before you run the code.

For example: Since you were updating JobStatus1, you could have run this:
Code:
SELECT * INTO Temp_JobStatus1
FROM JobStatus1

Then when the update was determined to be successful, you delete the temporary version.

You could also do that same trick for testing of your script. Make the temp copy and change your table name in the script to use the temp copy. If it works, then change the table names back and run it on the real table.

Only 'gotcha' with doing that is if there are constraints/triggers that will change data elsewhere...you would miss those changes.

-SQLBill

The following is part of my signature block and is only intended to be informational.
Posting advice: FAQ481-4875
 
Thanks all. I am running SQL Server 2005 but didn't seem to be able to rollback. I must not be setting it up right?

I usually work in a copy of the db to make sure things are working the way I want. I don't know why I didn't this time.

I did do the SELECT before I attempted the UPDATE. It gave me exactly what I wanted. Then I just removed the SELECT and ORDER and added the UPDATE and SET statements. I didn't notice that, when I clicked to check for syntax issues, the code changed.

The JobStatus1_1 alias was not something I added and neither was the CROSS JOIN. I wouldn't add code that I'm not familiar with. It must be like auto correct. I just cut and pasted the code from here. I know that when I click to check it, the compiler reformats the layout of the code and tosses in parenthesis and the "AS" in front of the aliases. I can't believe it added that other stuff and I didn't notice it. I didn't even see the CROSS JOIN statement until more than 3 hrs later when I was looking at it again to see what went wrong.

I've recovered everything that I could from the nightly backup. We're missing some of the status changed from yesterday but they aren't critical and will take more time to recreate than its worth.

If anyone has any recommendations on good sites relating to MS SQL Server rollback use, I'd really appreciate it.
 
By Rollback we meant

BEGIN TRANSACTION
UPDATE ....
SELECT * from tableJustUpdated
ROLLBACK TRANSACTION

If the result looks correct, then instead of last ROLLBACK you write COMMIT TRANSACTION to actually commit the changes to SQL Server.

PluralSight Learning Library
 
Thank you markros. I will be doing some playing with that on my db copy. I don't want this to happen again.

Incidentally, I did some testing with SQL Server yesterday. My previous comments were correct. It takes the SELECT code segment and runs it "as is". When I alter it to the UPDATE version of the code, it does add the JobStatus1_1 and CROSS JOIN bits. I deleted them and checked again and it added them right back. How do I get around that?
 
It changes it when I click "Verify SQL Syntax". I deleted the extra pieces and clicked it again and it adds it back again. It also reformats the layout to make it conform to the indents and spacing that it wants to see. That's no big deal. But added extra information is. I don't understand why it doesn't add it for the select statement too.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top