Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Finding Duplicate Records 4

Status
Not open for further replies.

michhost

Technical User
Sep 26, 2001
3
0
0
US
Does anyone know an easy way to find and mark for deletion duplicate records in a database? I've written a little program, but thought someone might have an easier way.
 
The answer is greatly dependent on what you mean by "duplicate". Are the records duplicate if they have a single key field the same, or do ALL the fields need to be exactly the same. Or is it somewhere in between, where some fields in the record are the same but others may be the same or different.

The actually technique would likely also depend on what fields are indexed and possibilly on how many records there were in the table.

More details would help.

Rick
 
Thanks for the reply. Most often it would be a duplicate in one key field.
 
One way I've used is to SELECT the duplicates, then just loop and delete these.

Code:
Select keyfield, count(*) as cnt;
  from MyTable ;
  group by keyfield ;
  having cnt > 1 ;
  into cursor mydups
SELECT MyTable
SET order to keyfield && create this if it doesn't exist
SELECT mydups
SCAN
  =SEEK(mytable.keyfield,"Mytable","keyfield")
  ** Skip over first **
  SKIP +1 IN MyTable
  ** Delete rest **
  DELETE NEXT (mytable.cnt - 1) IN MyTable
ENDSCAN
USE IN mydups
USE IN MyTable

Rick


 
Michhost,
Another way that I have doen this is the past, and is very effective with "Single" field "Dups", is to do the following, which will work, even if you have several records in a row that are "Duplicate" It does however, require that you have an index on that field, or a compound index that at least starts with that field. In this example, I'll try to remove duplicate customer entries from a CUST table:

SELECT CUST
SET ORDER TO CUSTID
GO TOP
M.CUSTID = CUST.CUSTID
SKIP
DO WHILE NOT EOF()
IF CUST.CUSTID = M.CUSTID
DELETE
ENDIF
M.CUSTID = CUST.CUSTID
SKIP
ENDDO

This is handy, because it has all "CUSTID"'s in order. You simply save the "Previous" one to a memory variable, skip to the next record, and then compair it. This is very clever, works in all kinds of situations, and can be compound. The only hitch is you must have an index that at least matches the leading part of your expression. You can also use it for something like:
M.CHECKDUP = CUSTID+CUSTFNAME+CUSTLNAME
IF CUSTID+CUSTFNAME+CUSTLNAME = M.CHECKDUP
DELETE
ENDIF

If you want to use compound expressions.

Thanks,
-Scott

Please let me know if this has helped! s-)
 
Thanks for the help Rick and Scott. ::)

Mike
 
DEFINE WINDOW wtalk FROM 0,0 TO 1,1
SET TALK WINDOW wtalk
SET TALK ON

SELE ;
cekfield,;
COUNT(*) AS dups ;
FROM filename ;
GROUP BY cekfield ;
HAVING COUNT(*)>1 ;
INTO CURSOR ganda

SET TALK OFF
RELEASE WINDOW wtalk
 
Hi Everyone,


This logic allows to delete the 2nd record. How can I do if I want to delete the 1st record.

thanks
 
I hope help you

***************
PROCED PPL655AC
***************

SET DELETE OFF
USE PPLPDV INDEX PDVKEY
XKEY = SPACE(13)
CON = 0
DO WHILE .NOT. EOF()
KEY = STR(PMPREG,6,0)
IF XKEY <> KEY
XKEY = KEY
SKIP
LOOP
ENDIF
skip -1
DELETE
skip 2
ENDDO
PACK
SET DELETE ON
CLOSE DATA
RETURN
 
If you can have the DBF exclusive for a few seconds. Why do all the do while loops?

set deleted off
use DBF excludelete all
index on KeyField tag UKEY unique
set order to UKey
recall all
set order to
delete tag UKEY
pack
use
David W. Grewe
Dave@internationalbid.com
 
Hi Dave, it works in foxpro 2.6 dos ? I'm trying but I can't
 
The previous post from dgrewe has a typo.

The line:

&quot;use DBF excludelete all&quot;

Should be broken into two lines.

use dbf exclu

delete all

It should work fine in 2.6 for DOS.

Jack
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top