Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations dencom on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Delete duplicates where primary key is not an integer 2

Status
Not open for further replies.

DCSage

Programmer
Mar 5, 2010
74
US
Hi.

I have the following data where the duplicates are the episode numbers. The problem is that there is a date associated with each record. I need to show the date as well as the program title in addition to the episode. I executed a delete process, but the duplicates are not removed.

Here's a sample of the data (there are thousands of records):

Code:
Air Date	Episode	                   Program 
 12/31/09 	 CMTW-09001T                 	MW
 12/31/09 	 CMTW-09001                  	MW
 12/31/09 	 CMTW-09001T                 	MW
 12/31/09 	 CMTW-09001                  	MW
 12/31/09 	 CMTW-09001                  	MW
 12/31/09 	 CRCA1044T                   	RFT
 12/31/09 	 CRCA1044                    	RFT
 12/31/09 	 CRCA1044                    	RFT
 12/31/09 	 CRCA1044T                   	RFT

I tried the following code, which did not give the right results. How can I revise my process so that the duplicate records are accurately removed?

Code:
/*Delete rows that have duplicate values for the episodes*/

BEGIN TRAN

SELECT Airdate, Episode, COUNT(Episode) as NumOccurences, Program,
FROM AsrunLogs
GROUP BY Airdate, Episode, Program
HAVING (Count(Episode) >1)

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END


/*As there are various duplicates, select the duplicate values into a holding table*/

Select Airdate, Episode, Program, COUNT(*) as NumOccur
INTO HoldKey
FROM  AsrunLogs
GROUP BY Airdate, Episode, Program
HAVING (COUNT(*) > 1)

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

/*Select the duplicate rows into a holding table, eliminating duplicates in the process*/


Select Distinct AsrunLogs.* 
Into holdups
from AsrunLogs, holdkey
WHERE AsrunLogs.Episode= holdkey.episode

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

/*Delete the duplicate rows from original table*/

Delete AsrunLogs
From AsrunLogs a, HoldKey h 
WHERE a.Episode= h.episode

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

/*Put the Unique rows back in the original table*/


Insert INTO Asrunlogs (Airdate, Episode, Program)
SELECT Airdate, Episode, Program FROM holdups

IF @@ERROR <> 0
   BEGIN
	ROLLBACK TRAN
   END

Commit Tran

GO

 
how about
Code:
Select distinct [Air Date],episode , Program
 into #distinctTable
From AsrunLogs

delete
From AsrunLogs 

insert into AsrunLogs ([Air Date],episode , Program)
Select distinct [Air Date],episode , Program
From #distinctTable
 
Hi. Does this mean that there won't be duplicates with episodes? Or is this selecting unique records based on airdate?
 
If I understood right this is your data
Code:
Air Date    Episode                       Program  12/31/09      CMTW-09001T                     MW 12/31/09      CMTW-09001                      MW 12/31/09      CMTW-09001T                     MW 12/31/09      CMTW-09001                      MW 12/31/09      CMTW-09001                      MW 12/31/09      CRCA1044T                       RFT 12/31/09      CRCA1044                        RFT 12/31/09      CRCA1044                        RFT 12/31/09      CRCA1044T                       RFT


and you want this

Code:
Air Date    Episode                       Program  12/31/09      CMTW-09001T                     MW 12/31/09      CMTW-09001                      MW  12/31/09      CRCA1044                        RFT 12/31/09      CRCA1044T                       RFT


 
PWise, you are correct. I wanted to keep

Code:
Air Date    Episode                       Program  
12/31/09      CMTW-09001T                     MW 
12/31/09      CMTW-09001                      MW  
12/31/09      CRCA1044                        RFT 
12/31/09      CRCA1044T                       RFT

Thank you.
 
Your problem is you do not have a primary key. It is easiest to do this if you add an identity field to the table. Otherwise, indeed you will need to copy distinct records to another table, delete all the table records and then reinsert the records. This is because you currently have no way to distiguish a dup as a specific record. If you use that method instead of adding an identity, then as soon as you finish, immediately createa a real primary key or unique index. No table should ever be without a way to uniquely identify a record.

Do not do any of these processes wihtout a good backup in place.

"NOTHING is more important in a database than integrity." ESquared
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top