Finding Duplicate Records 4

michhost · Sep 28, 2001

Does anyone know an easy way to find and mark for deletion duplicate records in a database? I've written a little program, but thought someone might have an easier way.

rgbean · Sep 28, 2001

The answer is greatly dependent on what you mean by "duplicate". Are the records duplicate if they have a single key field the same, or do ALL the fields need to be exactly the same. Or is it somewhere in between, where some fields in the record are the same but others may be the same or different.

The actually technique would likely also depend on what fields are indexed and possibilly on how many records there were in the table.

More details would help.

Rick

michhost · Sep 28, 2001

Thanks for the reply. Most often it would be a duplicate in one key field.

rgbean · Sep 28, 2001

One way I've used is to SELECT the duplicates, then just loop and delete these.

Code:

Select keyfield, count(*) as cnt;
  from MyTable ;
  group by keyfield ;
  having cnt > 1 ;
  into cursor mydups
SELECT MyTable
SET order to keyfield && create this if it doesn't exist
SELECT mydups
SCAN
  =SEEK(mytable.keyfield,&quot;Mytable&quot;,&quot;keyfield&quot;)
  ** Skip over first **
  SKIP +1 IN MyTable
  ** Delete rest **
  DELETE NEXT (mytable.cnt - 1) IN MyTable
ENDSCAN
USE IN mydups
USE IN MyTable

Rick

Scott24x7 · Sep 28, 2001

Michhost,
Another way that I have doen this is the past, and is very effective with "Single" field "Dups", is to do the following, which will work, even if you have several records in a row that are "Duplicate" It does however, require that you have an index on that field, or a compound index that at least starts with that field. In this example, I'll try to remove duplicate customer entries from a CUST table:

SELECT CUST
SET ORDER TO CUSTID
GO TOP
M.CUSTID = CUST.CUSTID
SKIP
DO WHILE NOT EOF()
IF CUST.CUSTID = M.CUSTID
DELETE
ENDIF
M.CUSTID = CUST.CUSTID
SKIP
ENDDO

This is handy, because it has all "CUSTID"'s in order. You simply save the "Previous" one to a memory variable, skip to the next record, and then compair it. This is very clever, works in all kinds of situations, and can be compound. The only hitch is you must have an index that at least matches the leading part of your expression. You can also use it for something like:
M.CHECKDUP = CUSTID+CUSTFNAME+CUSTLNAME
IF CUSTID+CUSTFNAME+CUSTLNAME = M.CHECKDUP
DELETE
ENDIF

If you want to use compound expressions.

Thanks,
-Scott

Please let me know if this has helped! s-)

michhost · Oct 1, 2001

Thanks for the help Rick and Scott. :

Mike

praz · Sep 26, 2002

DEFINE WINDOW wtalk FROM 0,0 TO 1,1
SET TALK WINDOW wtalk
SET TALK ON

SELE ;
cekfield,;
COUNT(*) AS dups ;
FROM filename ;
GROUP BY cekfield ;
HAVING COUNT(*)>1 ;
INTO CURSOR ganda

SET TALK OFF
RELEASE WINDOW wtalk

Foxtech · Oct 2, 2002

Hi Everyone,

This logic allows to delete the 2nd record. How can I do if I want to delete the 1st record.

thanks

mfernand1 · Oct 2, 2002

I hope help you

***************
PROCED PPL655AC
***************

SET DELETE OFF
USE PPLPDV INDEX PDVKEY
XKEY = SPACE(13)
CON = 0
DO WHILE .NOT. EOF()
KEY = STR(PMPREG,6,0)
IF XKEY <> KEY
XKEY = KEY
SKIP
LOOP
ENDIF
skip -1
DELETE
skip 2
ENDDO
PACK
SET DELETE ON
CLOSE DATA
RETURN

dgrewe · Oct 2, 2002

If you can have the DBF exclusive for a few seconds. Why do all the do while loops?

set deleted off
use DBF excludelete all
index on KeyField tag UKEY unique
set order to UKey
recall all
set order to
delete tag UKEY
pack
use
David W. Grewe
Dave@internationalbid.com

mfernand1 · Oct 3, 2002

Hi Dave, it works in foxpro 2.6 dos ? I'm trying but I can't

cayuga7 · Oct 3, 2002

The previous post from dgrewe has a typo.

The line:

"use DBF excludelete all"

Should be broken into two lines.

use dbf exclu

delete all

It should work fine in 2.6 for DOS.

Jack

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Finding Duplicate Records 4

michhost

Technical User

rgbean

Programmer

michhost

Technical User

rgbean

Programmer

Scott24x7

Programmer

michhost

Technical User

praz

Programmer

Foxtech

Programmer

mfernand1

Programmer

dgrewe

MIS

mfernand1

Programmer

cayuga7

IS-IT--Management

Similar threads

Part and Inventory Search

Sponsor