Speeding up this .prg

inteleserve · Mar 31, 2004

Does anyone know how I can speed this up? Its just not cutting it anymore with 2,000,000 record lists. Its just a simple program that finds two records with identicle phone number fields and tags them both with "X" for deletion when completed. It would really help me out....

Thanks,

Ross

*********************************************************

close all
clear
use temp.dbf
index on phone tag phone
use temp.dbf order phone exclusive
set status off
set scoreboard off
set console off
set device to screen
STORE RECCOUNT() to Xtotal
STORE 0 TO nHowMany
DO WHILE !EOF()
nHowMany = nHowMany + 1
IF nHowMany % 1000 = 0 &&... every 1000 records
@12,20 say str((nHowMany / Xtotal)*100) + "% done deduping..."
ENDIF
store phone to p1
skip 1
store phone to p2
if p1=p2
replace state WITH "X"
skip -1
replace state WITH "X"
skip 1
endif
enddo

inteleserve · Mar 31, 2004

Sorry... I am using foxpro 8.0 for windows.

Peace

Mike Gagnon · Mar 31, 2004

Im not sure about speed, I have nothing to test.

Code:

SELE COUNT(phone) - 1 as Dupes,phone ;
FROM myTable ;
HAVING COUNT(phone) > 1 ;
ORDER BY 2 ;
GROUP BY phone INTO cursor myCursor

Now you can loop through your table that holds the duplicates against your original table

Mike Gagnon

If you want to get the best response to a question, please check out FAQ184-2483 first.

baltman · Mar 31, 2004

SELECT * from temp GROUP BY phone

If you specifically want the most recent value, you'd need a way to 'know' which one is later, such as an date.

Brian

inteleserve · Mar 31, 2004

What I am looking for is to have a new list with both records that had identical phone fields to be tagged a certain way. So I could:

1. know which records were duplicated

2. if a record was found more then once, they need to be tagged a certain way so all of the 5073455611 phone numbers can be taken out.

Mike Gagnon · Mar 31, 2004

inteleserve

The suggestion I made gives you a cursor of all records that are duplicates. Do you need more?

Mike Gagnon

If you want to get the best response to a question, please check out FAQ184-2483 first.

baltman · Mar 31, 2004

This code leaves you with only non-duplicates:
SELECT * from temp GROUP BY phone

To get the list of what was duplicate:
SELECT *,sum(1) as N from temp GROUP BY phone HAVING n>1

mspratt · Apr 1, 2004

Hi Ross

Change your index to structural (.cdx) and save yourself the time of indexing. I think that it also helps with Rushmore if you decide to change to select.

If you have 2,000,000 records and progress is being shown as a 0 - 100% you should only change the display every 20,000 records.

store phone to p1
skip 1
if p1=phone && Why take the time to "store" to P2?
replace state WITH "X"
skip -1
replace state WITH "X"
skip 1
endif

Test this code to see how much time you can save not doing a modulus.

tnHowMany = tnHowMany + 1
IF nHowMany = 1000 &&... every 1000 records ...20,000???
nHowMany = nHowMany + 1000 && initialize nHowMany at program start
@12,20 say str((nHowMany / Xtotal)*100) + "% done
tnHowMany = 0
deduping..."
ENDIF

Regards,

Mike

mspratt · Apr 5, 2004

Oops!
Typo.
IF tnHowMany = 1000

Regards,

Mike

PS Did you intend to mark both entries? For deletion, or what?

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Speeding up this .prg

inteleserve

IS-IT--Management

inteleserve

IS-IT--Management

Mike Gagnon

Programmer

baltman

Technical User

inteleserve

IS-IT--Management

Mike Gagnon

Programmer

baltman

Technical User

mspratt

Programmer

mspratt

Programmer

Similar threads

Part and Inventory Search

Sponsor