Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Delete query question - using Alias to find duplicates ? 1

Status
Not open for further replies.

huskerdon

Programmer
Aug 26, 2002
67
US
Hello,

There may be a very easy solution to this, but I've been stumped on this one. I have a table with thousands of records, and somehow, there got to be (kind of) duplicate records. Below is just a sample table using similar fields.

The HistID field is the PK, and ItemID is the number of the object that gets moved from one location to another. The IsCurrent field tells whether the record shows the current location for a particular item number.

What I need to do is find records where the ItemID, Location, and InstDate fields are IDENTICAL. There's not supposed to be any instances where these 3 fields are the same, but there are, and these duplicates are records that got added accidentally.

I can't just search for duplicate records where everything in the entire row is the same, because the HistID value and Yes/No values will always be different. If it's possible to delete the extra records within the same query that would be the best solution. If it needs to be done with 2 separate queries, that's OK also.

I've managed to come up with a list of about 500 records out of the original 20,000 that needed to be deleted, but when I try to make it a delete query and run it, is says "Could not delete from the specified tables." Then I tried using Aliases to return records where the 3 fields mentioned above were identical, but it is returning all of the records.

HistID ItemID Location InstDate IsCurrent
1 1 555 S Apple 1/1/2004 Yes
2 2 2727 W Maple 12/13/2002 No
3 3 247 W Jack Rd 4/6/2000 Yes
4 4 234 N Main 6/7/2004 Yes
5 5 2525 Oak St 2/17/2000 Yes
6 6 3579 NW 42nd 7/15/1999 Yes
7 2 2727 W Maple 5/27/2004 Yes
8 4 234 N Main 6/7/2004 No
9 9 2702 Pine St 3/15/1996 Yes
10 10 275 Park Rd 8/22/1995 Yes
11 11 1212 W 24th 10/16/1999 Yes
12 12 333 S Locust 3/3/2004 Yes
13 13 321 N 17th 6/11/1999 Yes
14 14 765 E Jones Ave 9/2/2002 Yes
15 12 333 S Locust 3/3/2004 No

In the sample table above, I would like to delete only 2 records - with HistID of 8 and 15. (The 2 records with ItemID = 2 would NOT be affected because the InstDate is different.) I hope this makes some sense - I just don't want to go thru 20,000 records and delete 500 duplicates manually.

Thanks!
 
Somethng like this ?
DELETE FROM yourTable As A INNER JOIN
(SELECT ItemID, Location, InstDate FROM yourTable
GROUP BY ItemID, Location, InstDate HAVING Count(*)>1) As B
ON (A.ItemID=B.ItemID) And (A.Location=B.Location) And (A.InstDate=B.InstDate)
WHERE A.IsCurrent="No"

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
PHV,

I think that's close to what I need. I tried your code, replacing "yourTable" with the actual table name "tblHist" and adding "tblHist.*" between "Delete" and "From".

When I try to run it, it's asking for paramter values. Do I need to specify the table name (tblHist, A, or B ??) before the 3 field names in the Select and Group by statements? I haven't had much experience using aliases.

When I do put the table name to qualify the fields, it still asks for parameter values, or says it's too complex to be evaluated. Can it be split into separate queries?

Thanks in advance for your help.
 
Create a saved query, say qryDupHist:
SELECT ItemID, Location, InstDate FROM tblHist
GROUP BY ItemID, Location, InstDate HAVING Count(*)>1;
Now we can try this delete query:
DELETE A.* FROM tblHist As A INNER JOIN qryDupHist As B
ON (A.ItemID=B.ItemID) And (A.Location=B.Location) And (A.InstDate=B.InstDate)
WHERE A.IsCurrent="No";
Obviously you do the testing in a copy of the database...

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
PHV,

Thanks again for your response. When I open the new query (based on the first query) in datasheet view - it DOES return the 2 correct records.

However, when I try to actually run the delete query, I still get the message "Could not delete from the specified tables." What causes this type of message?
 
And this ?
DELETE FROM tblHist As A
WHERE EXISTS (SELECT * FROM qryDupHist As B WHERE
A.ItemID=B.ItemID And A.Location=B.Location And A.InstDate=B.InstDate)
AND A.IsCurrent="No";


Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
PHV,

That did the trick! Have a star. I've been using Access for 8 years and have never used "Exists". Guess I need to read about that.

Thanks for your help,

Don
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top