how to create distinct on 3 columns 1

varadha72 · Nov 10, 2014

Hi

I have 6 column data, of which I need to validate 3 columns to check if they are same in the subsequent rows having same ID, if Yes then the output should show it as one row only, if not display it as how it is.

Data-
ID...Name...Row...Exp...time...rowcount
1.....A......50....0.....17:00...1
1.....A......50....0.....17:01...2
2.....B......10....2.....17:02...1
2.....B......20....10....17:03...2
2.....B......20....10....17:04...3

output data:
ID...Name...Row...Exp...time...rowcount
1.....A......50....0.....17:00...1
2.....B......10....2.....17:02...1
2.....B......20....10.....17:03...2

In 1st line of output, since ID (not unique), name, row, and exp are same then I need to display only the first row where rowcount=1. (1.....A......50....0.....17:00...1)
In last 3 lines of output, since first two of last lines are not same (row and exp not same), then this line is displayed. (2.....B......10....2.....17:02...1)
But in last two lines of output, since ID , name, row, and exp are same then I need to display only the first row of these two rows where rowcount=2. (2.....B......20....10.....17:03...2)

I tried using a distinct query for the three columns, but because the time and rowcount were not distinct, it was returning all rows. Can someone help me here please.

djj55 · Nov 10, 2014

Look at something like:

Code:

SELECT ID, Name, Row, Exp, MIN(TIME)
FROM MYDATA 
GROUP BY ID, Name, Row, Exp

djj
The Lord is my shepherd (Psalm 23) - I need someone to lead me!

varadha72 · Nov 10, 2014

djj55, the above query is not working. Please note that I have to display the row_counter as well, and because of this the explicit rows are not being distinct. I tried something like this.

select srd.file_name, row_count, exception_count, row_number() over (partition by file_name order by run_date) as row_counter, file_name, row_count, exception_count,min(convert(char(5), run_date, 108)) time, run_date
from run_data srd
where run_date>=dateadd(day, datediff(day,0,getdate())-4,0)
and run_date<dateadd(day, datediff(day,0,getdate())-3,0) and result =0
and convert(char(5), run_date, 108)>'16:00'
group by srd.file_name,row_count, exception_count, run_date
order by srd.file_name,run_date asc

gk53 · Nov 10, 2014

for row_counter you need just

SELECT ID, Name, Row, Exp, MIN(TIME), count(*) as row_counter
FROM MYDATA
GROUP BY ID, Name, Row, Exp

varadha72 · Nov 10, 2014

Hi GK53, I tried with your query its not working.
-------------------------
select row_number() over (partition by file_name order by run_date) as row_counter, file_name, row_count, exception_count, MIN(convert(char(5), run_date, 108)) time, run_date, Count(*) as RowcounterCheck
FROM source_run_data
WHERE run_date>=dateadd(day, datediff(day,0,getdate())-5,0)
AND run_date<dateadd(day, datediff(day,0,getdate())-4,0) and result =0
AND convert(char(5), run_date, 108)>'16:00'
GROUP BY FILE_NAME, ROW_COUNT, EXCEPTION_COUNT, RUN_DATE
ORDER BY file_name,run_date ASC
-----------------------------------------
The output needs to show as below, but with above query I am being returned all rows. There is no change in the output.
NOTE: I need to check if row_counter=2, then check to see if (name, row, exp) in row 2 is equal to row 1, then display only first row, if not retain both rows.

ID...Name...Row...Exp...time...rowcount..datetime
1.....A......50....0.....17:00...1......11/06/2014 17:00:00
2.....B......10....2.....17:02...1......11/06/2014 17:02:00
2.....B......20....10.....17:03...2.....11/06/2014 17:03:00

gmmastros · Nov 10, 2014

The following query takes your sample data, loads it to a table variable, and then runs a query against it.

Code:

Declare @Temp Table(Id Int, Name VarChar(20), Row Int, [Exp] Int, Time DateTime, [RowCount] Int)

Insert Into @Temp Values(1,'A',50,0,'17:00',1)
Insert Into @Temp Values(1,'A',50,0,'17:01',2)
Insert Into @Temp Values(2,'B',10,2,'17:02',1)
Insert Into @Temp Values(2,'B',20,10,'17:03',2)
Insert Into @Temp Values(2,'B',20,10,'17:04',3)

; With Data As
(
  Select  *, 
          Row_Number() Over (Partition By Id, Name, Row, [exp] ORder By [RowCount]) As NewRowId
  From    @Temp
)
Select  *
From    Data
Where   NewRowId = 1

Please note that this query produces your expected results. I know this works with your sample data. If it does not work with your real data, please post more sample data that it does not work for.

-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

how to create distinct on 3 columns 1

varadha72

Programmer

djj55

Programmer

varadha72

Programmer

gk53

Programmer

varadha72

Programmer

gmmastros

Programmer

Similar threads

Part and Inventory Search

Sponsor