Delete duplicates where primary key is not an integer 2

DCSage · Apr 21, 2010

Hi.

I have the following data where the duplicates are the episode numbers. The problem is that there is a date associated with each record. I need to show the date as well as the program title in addition to the episode. I executed a delete process, but the duplicates are not removed.

Here's a sample of the data (there are thousands of records):

Code:

Air Date	Episode	                   Program 
 12/31/09 	 CMTW-09001T                 	MW
 12/31/09 	 CMTW-09001                  	MW
 12/31/09 	 CMTW-09001T                 	MW
 12/31/09 	 CMTW-09001                  	MW
 12/31/09 	 CMTW-09001                  	MW
 12/31/09 	 CRCA1044T                   	RFT
 12/31/09 	 CRCA1044                    	RFT
 12/31/09 	 CRCA1044                    	RFT
 12/31/09 	 CRCA1044T                   	RFT

I tried the following code, which did not give the right results. How can I revise my process so that the duplicate records are accurately removed?

Code:

/*Delete rows that have duplicate values for the episodes*/

BEGIN TRAN

SELECT Airdate, Episode, COUNT(Episode) as NumOccurences, Program,
FROM AsrunLogs
GROUP BY Airdate, Episode, Program
HAVING (Count(Episode) >1)

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END


/*As there are various duplicates, select the duplicate values into a holding table*/

Select Airdate, Episode, Program, COUNT(*) as NumOccur
INTO HoldKey
FROM  AsrunLogs
GROUP BY Airdate, Episode, Program
HAVING (COUNT(*) > 1)

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

/*Select the duplicate rows into a holding table, eliminating duplicates in the process*/


Select Distinct AsrunLogs.* 
Into holdups
from AsrunLogs, holdkey
WHERE AsrunLogs.Episode= holdkey.episode

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

/*Delete the duplicate rows from original table*/

Delete AsrunLogs
From AsrunLogs a, HoldKey h 
WHERE a.Episode= h.episode

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

/*Put the Unique rows back in the original table*/


Insert INTO Asrunlogs (Airdate, Episode, Program)
SELECT Airdate, Episode, Program FROM holdups

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

Commit Tran

GO

PWise · Apr 21, 2010

how about

Code:

Select distinct [Air Date],episode , Program
 into #distinctTable
From AsrunLogs

delete
From AsrunLogs 

insert into AsrunLogs ([Air Date],episode , Program)
Select distinct [Air Date],episode , Program
From #distinctTable

DCSage · Apr 21, 2010

Hi. Does this mean that there won't be duplicates with episodes? Or is this selecting unique records based on airdate?

PWise · Apr 21, 2010

If I understood right this is your data

Code:

Air Date    Episode                       Program  12/31/09      CMTW-09001T                     MW 12/31/09      CMTW-09001                      MW 12/31/09      CMTW-09001T                     MW 12/31/09      CMTW-09001                      MW 12/31/09      CMTW-09001                      MW 12/31/09      CRCA1044T                       RFT 12/31/09      CRCA1044                        RFT 12/31/09      CRCA1044                        RFT 12/31/09      CRCA1044T                       RFT

and you want this

Code:

Air Date    Episode                       Program  12/31/09      CMTW-09001T                     MW 12/31/09      CMTW-09001                      MW  12/31/09      CRCA1044                        RFT 12/31/09      CRCA1044T                       RFT

markros · Apr 21, 2010

What is your SQL Server version and which of the duplicates you'd like to keep?

Take a look here

http://social.technet.microsoft.com...l/thread/98ef87bb-6610-4223-ab20-085e4c172c06

PluralSight Learning Library

DCSage · Apr 21, 2010

PWise, you are correct. I wanted to keep

Code:

Air Date    Episode                       Program  
12/31/09      CMTW-09001T                     MW 
12/31/09      CMTW-09001                      MW  
12/31/09      CRCA1044                        RFT 
12/31/09      CRCA1044T                       RFT

Thank you.

SQLSister · Apr 21, 2010

Your problem is you do not have a primary key. It is easiest to do this if you add an identity field to the table. Otherwise, indeed you will need to copy distinct records to another table, delete all the table records and then reinsert the records. This is because you currently have no way to distiguish a dup as a specific record. If you use that method instead of adding an identity, then as soon as you finish, immediately createa a real primary key or unique index. No table should ever be without a way to uniquely identify a record.

Do not do any of these processes wihtout a good backup in place.

"NOTHING is more important in a database than integrity." ESquared

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Delete duplicates where primary key is not an integer 2

DCSage

Programmer

PWise

Programmer

DCSage

Programmer

PWise

Programmer

markros

Programmer

DCSage

Programmer

SQLSister

Programmer

Similar threads

Part and Inventory Search

Sponsor