Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations derfloh on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Select Distinct: How to include fields that are not being grouped

Status
Not open for further replies.

montypython1

Technical User
Jan 12, 2005
187
US
Greetings,

How do I display fields that are not being grouped together using the "Select Distinct" SQL command?

My database is called "EmplTest" and it contains the following fields:
Emp_Key SSN Emplast Updated
------- ----------- ----------- ---------
4223 111-11-1111 ABLE 04/15/2001
4505 111-11-2222 BAKER 01/01/2002
4700 111-11-2222 BAKER 12/31/2005
etc.

I simply need to select the maximum "Emp_Key" [ MAX(Emp_Key) ] for all identical "SSN" records, but I need to display all the fields, not just the fields being grouped.

My desired output would look like this (notice Emp_Key "4505" does not display):
Emp_Key SSN Emplast Updated
------- ----------- ----------- ---------
4223 111-11-1111 ABLE 04/15/2001
4700 111-11-2222 BAKER 12/31/2005

When I use the following code, it gives me the maximum "Emp_Key" value and the "SSN", but does not display the associated fields:
SELECT DISTINCT ssn, MAX(Emp_Key) FROM EmplTest GROUP BY ssn ORDER BY ssn

When I use the following code, it generates an error message, since I am not grouping the additional fields:
SELECT DISTINCT ssn, MAX(Emp_Key), EmpLast, Updated FROM EmplTest GROUP BY ssn ORDER BY ssn

Before posting the question, I reviewed Tek-Tips thread184-1411960, and I initially thought that I should be using the "Select Distinct" SQL command, but maybe I'm barking up the wrong tree.

Any suggestions?

Thanks,
Dave Higgins

 
First you do not need DISTINCT in your SQL.

I am assuming that Emp_Key is a unique primary key.

IF that is the case this should work.

select ssn, Emp_Key, EmpLast, Updated from EmplTest ;
where Emp_Key in (SELECT MAX(Emp_Key) FROM EmplTest GROUP BY ssn) order by ssn




Alan
 
The DISTINCT becomes redundant when you use the GROUP BY option.

You might try:
SELECT ssn,;
EmpLast,;
Updated ,;
MAX(Emp_Key) As MaxKey;
FROM EmplTest ;
GROUP BY ssn, EmpLast, Updated;
ORDER BY ssn

Since you want the other fields to be distinct, include them in the GROUP BY expression and they will, by the nature of "GROUP BY", be distinct.

Good Luck,
JRB-Bldr
 
JRB-Bldr, you code will not work unless you "SET ENGINEBEHAVIOR 70".

But if you do that their is no guarantee what values will be returned in the columns not in the group by and I'll bet he wants the correct emplast and updated values for the record with the max key.

Alan
 
JRB-Bldr, oops miss read your post and their does not seem to be an edit here.

Their is still an issue as two users could be in the table having edited the SSN so your statement could return multiple rows for a given SSN.

Alan
 
"as two users could be in the table having edited the SSN so your statement could return multiple rows for a given SSN."

Alan - And the problem would be ???

If that were to occur, then the edits would result in 2 distinctly different SSN's and, as a result you would not - return multiple rows for a given SSN. - since the SSN's would now be different and no longer the same given SSN.

And, based on what Dave first asked - "maximum "Emp_Key" [ MAX(Emp_Key) ] for all identical "SSN" records" - you should result in 2 distinct rows for the 2 newly edited different SSN's.

Good Luck,
JRB-Bldr
 
Hi Alan and JRB-Bldr,

Thank you both for your ideas.

I just tested both suggestions and it appears that Alan's suggestion will do exactly what I need.

JRB-Bldr's code was close to what I was looking for, but generated duplicate records (since the field 'updated' contains unique values if an employee was hired, then left the company, and then was rehired).

I'm amazed at both of your select statements: they are short and sweet. Nothing like my earlier attempts trying all sorts of different and lengthy code (complete with SCAN/ENDSCAN loops).

Again, thank you both for taking the time to help me figure out a solution.

Thanks,
Dave Higgins
 
JRB-Bldr,

My apologies ... I should have clarified that I wanted to suppress all duplicate SSNs, and show only the most recent record for each SSN (which is represented by the highest value in the Emp_Key field). I would not need to see the other duplicate records, other than the most recent.

Thanks,
Dave
 
At first look:
Code:
Select * From EmplTest t1 ;
  INNER Join (Select  Max(Emp_Key) As Emp_Key From EmplTest Group By SSN) t2 ;
  ON t1.Emp_Key = t2.Emp_Key

However it fits into this one logically:
Code:
Select ;
 Max(Emp_Key) As Emp_Key,;
 SSN,;
 Max(Emplast) As Emplast, ;
 Max(Updated) As 'updated' ;
  FROM EmplTest ;
  GROUP By SSN

which makes me think that there is something wrong with this data design. Probably you need to normalize this table.

Cetin Basoz
MS Foxpro MVP, MCP
 
I've been programming long enough to see that this was some kind of log table for edits.

emp_key was the PK
SSN the SSN that was edited /updated
emplast the employee ID of the person doing the edit
updated the date of the edit.

Getting the record with the max emp_key gives him the last record which shows him the user that did the last edit and when they did it.

this sql
Select ;
Max(Emp_Key) As Emp_Key,;
SSN,;
Max(Emplast) As Emplast, ;
Max(Updated) As 'updated' ;
FROM EmplTest ;
GROUP By SSN

would not have helped him as the MAX's won't always line up on the same record. Your 1st SQL with the sub select would generate the correct result. It would return the same result as the one I posted earlier that montypython1 said gives the required result.

Alan
 
I have been programming for some time either and that doesn't look like as you described to me:)
If it is such a log table:
-Where is the reference to the updated table(s) and to the primary key of that table? SSN might be a primary key but debatable if it is a bad design or not.
-What is the point of having last update as a date and not a datetime?
-Why would the results be based on max(empKey) and not max(update) per SSN?


Cetin Basoz
MS Foxpro MVP, MCP
 
You could write it to use the max update and based on real data that could be required. The OP would be the only one that would know I just followed what he was doing and wrote a query that would consistently work based on that need.

At this point its moot because he's got queries that get him his result.






Alan
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top