Table Of Grouped Duplicate Records. Statistics needed

Felix18807 · Nov 23, 2011

I have a query which groups together duplicate accounts within a database. It has the following fields

GroupID, AccountIDs, AccountCreatedDate, AddressPopulation

(GroupID indicates thats the AccountIDs belong together)

What I want to be able to produce is

1) A Count of AccountIDs within a GroupID Where AccountCreatedDate is the most recent (max) within that group AND where AddressPopulation is the maximum within that group.

AND

2) A Count as above where the most recent is NOT the most populated..

The goal being to prove/disprove if the most recently produced accounts contain the largest amount of address information for the purposes of merging accounts.

I hope this makes sense...

I am either being silly and missing something basic or this involves subqueries which are not my forte

Many thanks for reading this.

Andrzejek · Nov 23, 2011

It would be helpful if you would show a sample of your data showing the problem you want to address, and the outcome you are hoping for. A picture is worth a 1000 words of explaining....

Have fun.

---- Andy

Felix18807 · Nov 23, 2011

I'm having trouble making sense of it myself.

Here is some sample data

GroupID AccountIDs AccountCreatedDate AddressPopulation
1 51223 06/09/2010 20:04:24 1
1 51600 04/11/2011 12:14:28 4
1 55460 04/11/2011 13:14:28 5
2 73353 04/11/2011 12:14:28 4
2 131134 02/09/2011 11:14:21 4
3 130881 04/11/2010 12:14:28 5
3 131480 04/11/2011 12:14:28 2
4 12331 01/01/2009 10:21:00 5
4 70741 04/11/2011 12:14:28 3
4 81433 04/11/2011 12:14:28 2
4 119614 04/11/2011 12:14:28 1
5 53359 02/03/2009 11:10:10 4
5 53366 06/09/2010 20:04:24 3
6 11138 07/10/2011 13:04:07 4
6 80751 04/11/2011 12:14:28 5
6 82076 06/09/2010 20:04:24 3
7 130974 04/11/2011 12:14:28 5
7 131162 04/11/2011 12:14:28 4

Groups 1,2,3,6 and 7 have the largest address population against the most recent date and should therefore be counted

The result should look something like this
Count of Max Pop AND Max Date = 5
Count where above isnt true = 2

I hope this helps to clarify my dilema.

My latest attempt is here (some of the fieldnames etc are slightly different as this is a work in progress and I tried to simplify the nature of the issue)

Code:

 SELECT DISTINCT [Results].GroupID, Last([Results].AccountIDs) AS SelectedAccountID
FROM (SELECT [Group Criteria] & " - " & [Group Postcode] AS GroupID, Max([Results].AddressPopulation) AS MaxPop FROM [Results] GROUP BY [Group Criteria] & " - " & [Group Postcode])  AS [Parsed Duplicates MaxPop] INNER JOIN ((SELECT [Group Criteria] & " - " & [Group Postcode] AS GroupID, Max([Results].[Account Modified]) AS LastDate FROM [Results] GROUP BY [Group Criteria] & " - " & [Group Postcode])  AS [Parsed Duplicates LastDate] INNER JOIN [Results] ON ([Parsed Duplicates LastDate].GroupID = [Results].GroupID) AND ([Parsed Duplicates LastDate].LastDate = [Results].[Account Modified])) ON ([Parsed Duplicates MaxPop].GroupID = [Results].GroupID) AND ([Parsed Duplicates MaxPop].MaxPop = [Results].AddressPopulation)
GROUP BY [Results].GroupID
ORDER BY [Results].GroupID;

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Table Of Grouped Duplicate Records. Statistics needed

Felix18807

Programmer

Andrzejek

Programmer

Felix18807

Programmer

Similar threads

Part and Inventory Search

Sponsor