When there is a duplicate, I want to keep 1 and delete the dups. The first field is where the dupd. The other fields might have dups as well but I don't care.
Thanks Rhys. There is no identity field in the table. If there are 2 or more identical records(I meant 2 records with same first field, I want to keep just one.
The below should do the trick but you'll need to configure the datatype for the @fstField variable to be the same as fstField in the table you're playing with.
I've used fstField for the Field Name you're trying to get to hold only distinct values, and TableName for the name of the Table
<CODE>
DECLARE @fstField fstfieldDataType
SELECT DISTINCT * INTO #tmp_Table FROM TableName
TRUNCATE TABLE TableName
DECLARE cur_Cases CURSOR FOR SELECT DISTINCT fstField FROM #tmp_Table
OPEN cur_Cases --Open the CURSOR
FETCH NEXT FROM cur_Cases INTO @fstField
WHILE @@FETCH_STATUS = 0 --As Long as we got data keep going.
BEGIN
INSERT INTO TableName SELECT TOP 1 * FROM #tmp_Table WHERE fstField = @fstField
/* Get the next value FROM the CURSOR AND put IN variable. */
FETCH NEXT FROM cur_Cases INTO @fstField
END
CLOSE cur_Cases --We no longer need CURSOR so close
DEALLOCATE cur_Cases --and free memory
DROP TABLE #tmp_Table
GO
</CODE>
Basically this makes a copy of your table to a temporartu table, then truncates the original. It then uses a cursor for the distinct values held in the field you want to make distinct, and for every distinct value it inserts back from the temporary table to the original table the TOP 1 records matching by the field value, so in 'INSERT INTO TableName SELECT TOP 1 * FROM #tmp_Table WHERE fstField = @fstField', even if there are 30 rows with the same value in fstField it'll only put the first of them back to your original.
OK?
Post back if you're confused, or to let me know how it goes...
Rhys
Be careful that the light at the end of the tunnel isn't a train coming the other way.
Ugh, don't need cursors! Temp tables or anything. THe FAQ on deleting records gave a method to delete in one step. This is the FAQ I'm referring to:
Handle duplicate records
faq183-2682
This is the code I just used to delete duplicate records from a table I'm importing. If you don't have an id field I suggest putting an identity field on the table first so it can use this method.
Delete dbo.VACharacteristics from
(Select CharacteristicID, MenuItemID, AttractionID from VACharacteristics
where exists(
Select NUll from VACharacteristics b
Where b.MEnuItemID = VACharacteristics.MenuItemID
and B.AttractionID = VACharacteristics.AttractionID
Group BY
b.MenuItemID, B.AttractionID
Having
VACHAracteristics.CharacteristicID <Max(b.CharacteristicID))) a
Where A.CharacteristicId = VACHARACTERISTICs.CharacteristicID
Suggest you run the select part first to ensure you are getting the records you want. I didn't need to return all three fields for the delete to work, I just put them in so I could check the data in the select before I deleted things.
If this is not a table you are importing from another source, I also suggest you fix the database structure and user interface to avoid having this happen again.
Thank you for the code. It seems to work but...it's kinda unusual. I ran that procedure yesterday around noon, and it is still running now. It's almost 24 hrs. I have about 1.5 mil records. Is it usual? Please advise.
I still have problem to remove the dups. Anyone can help???
Like I said, my table doesn't have pk so I use Rhys666's method. I have 150 records, each has 12 fields. It never finish running with (0 row affected) each line.
A cursor will take forever to run on 1.5 million records. (you said this earlier, although in the last post you said 150 records) No table with this number of records should EVER be without a primary key or some other unique identifier!! GO to your boss and have him overrule the idiot who won't let you put one on the table. If necessary, put an identity field on temporarily til you clean up the records and then remove it!
I hesitate to ask, but you are doing this on a development server, right? You need to test your method on development before you affect the records in a production table of this size. I'm also concerned that you don't care at all what the other fields contain, how can you be sure you are keeping the record you want to keep if the data inthe other fields is different?
Oh yeah, if it really is 150 records, you could have removed the duplicates manually by now.
Thank you SQLSister. You brink up a very good point.
Let me expain alittle bit more about this table.
Not that I don't care about other fileds. I just want to remove some dups in filed1 so I can do a COUNT for a report. This one a copy(temp) of the production table.
Data is load from outside source. Some id#s have more than 1 products, and one products can be owned by an id#. That's why they didn't a PK on the table.
This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
By continuing to use this site, you are consenting to our use of cookies.