Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations gkittelson on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Eliminate rows with duplicate values in specific fields

Status
Not open for further replies.

wizard1923

Technical User
Dec 29, 2010
4
US
newbie with rudimentary knowledge of SQL queries here...trying to eliminate erroneous records with specific criteria. here is what i'm trying to accomplish:

select
update_datetime
,control_number
,record_type
from table1 inner join table2
on table1.field = table2.field
where update_datetime between 'x' and 'y'

results would look like:

update_datetime control_number record_type
2010-12-29-9.30.10 123456 1
2010-12-29-9.45.03 123456 2
2010-12-29-10.15.00 456789 1
2010-12-29-11.30.00 456789 1

what i need to do is:

1. if two records have the same control_number and one of them has a record_type of 2 - eliminate both records

2. if two records have the same control_number and both have the same record_type - keep only the record with latest update_datetime

Hope I've explained this thoroughly. Any help would be much appreciated.
 
SQL 2005+ solution:
Code:
;with cte as (select 
 update_datetime
,control_number
,record_type,
count(*) over (partition by Control_Number) as DupsCnt,
sum(case when Record_Type = 2 then 1 else 0 end) over (partition by Control_Number) as TypeCheck,
row_number() over (partition by Control_Number order by Update_Datetime Desc) as Row

from table1 inner join table2
on table1.field = table2.field
where update_datetime between 'x' and 'y')

select * from cte where DupsCnt = 1 or (DupsCnt > 1 and TypeCheck = 0 and Row = 1)


PluralSight Learning Library
 
Excellent! Thank you very much. However I forgot to include an additional clause in my question. There are numerous records that do not have a control_number or record_type and I want to keep all of these records. i.e.

update_datetime control_number record_type
2010-12-29-9.30.10 123456 1
2010-12-29-9.45.03 123456 2
2010-12-29-10.15.00 456789 1
2010-12-29-11.30.00 456789 1
2010-12-29-12.15.00
2010-12-29-12.30.00

results should eliminate those from the previous question but retain all records (in this case 2) without a control_number or record_type


Thank you again!
 
One possibility is to use the cte above with added condition
len(Control_Number) > 0 and Len(Record_Type) > 0 and then union all with the
Code:
select Fields from cte where ...
UNION ALL
select Fields from myTables where len(Control_Number) = 0 or len(Record_Type) = 0

PluralSight Learning Library
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top