Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Remove Duplicates -- NEW TWIST

Status
Not open for further replies.

gregas

Programmer
May 21, 2001
22
US
My apologies for not being more clear the first time -- I figure this will change the responses this time.

The table in question is a Customer table that tracks Customers as they enter data from a Web form. I did not design the table nor the database (nor can I change the existing design). Customers enter data on a form, along with some cc info. Only the Customer data is stored in the Customer table. However, the application (cc verification, esp) has been known to throw errors when processing the form -- but not without the database firing the update.

In short, multiple records (WITH DIFFERENT NON-DISTINCT CustomerIDs) can easily appear in the Customer table for a single cc transation.

The task is to create a query/macro which can be used often to clean out the customer table and return only distinct customer records.

Thanks in advance for the help.

-aaron
 
What's the primary key of the table? If it's customer ID it must tolerate duplicates (not good).

What's the determining criteria for a good record when dups exist? (first entry, last entry, some unique field)

Dave
 
Much thanks for responding.

As you can attest, the design of the table is attrocious. Yes, the PK is CustomerID and it does allow duplicates. Not only that, but it's type double (not int). Unfortunately, that's the way this table is set up and I have to deal with it as such.

Just to let you know -- my original plan was to create a module with a series of SQL statements (the SQL for this is too long to insert into a macro).

The steps I would take are:

* Determine the duplicate records (comparing maybe 10 fields to see if ALL are identical) and the max(CustID) and group by these fields. Dump these into a temp table.

* Next, delete ALL the duplicates from the temp table.

* Insert the temp records into the Customer table.

Any thoughts on this approach?

-aaron

*
 
With the DB under exclusive control by you

back up table in question

create a temp table of records with PK and count of recs with that PK from the main table in question

walk thru that table selecting each rec with a count > 1

for each rec above, create a recordset selecting that PK from the main table in question

determine the desired rec(s?) to retain (the tough part)

delete main table recs for the selected PK

insert the desired rec

repeat for each rec with count > 1 till done


watch out for referential integrity and cascading deletes in the main table


Dave

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top