Help on SQL Server - Selecting only the unique records

sivi · Apr 7, 2003

My query has 5 output fields, combination of 4 of these fields has duplicate entries in the output.

I need to select only the unique combination of these 4 fields.

PLease could someone help me with this, how to select only the unique ones, any help will be grately appreciated.

Many thanks
Sivi

TomKane · Apr 7, 2003

Maybe you could try to SELECT DISTINCT - I got this out of the online help:

Eliminating Duplicates with DISTINCT
The DISTINCT keyword eliminates duplicate rows from the results of a SELECT statement. If DISTINCT is not specified, all rows are returned, including duplicates. For example, if you select all the author IDs in titleauthor without DISTINCT, the following rows are returned (with some duplicate listings):

USE pubs

SELECT au_id

FROM titleauthor

Here is the result set:

au_id

-----------

172-32-1176

213-46-8915

213-46-8915

238-95-7766

267-41-2394

267-41-2394

274-80-9391

409-56-7008

427-17-2319

472-27-2349

486-29-1786

486-29-1786

648-92-1872

672-71-3249

712-45-1867

722-51-5454

724-80-9391

724-80-9391

756-30-7391

807-91-6654

846-92-7186

899-46-2035

899-46-2035

998-72-3567

998-72-3567

(25 row(s) affected)

With DISTINCT, you can eliminate duplicates and see only the unique author IDs:

USE pubs

SELECT DISTINCT au_id

FROM titleauthor

Here is the result set:

au_id

-----------

172-32-1176

213-46-8915

238-95-7766

267-41-2394

274-80-9391

409-56-7008

427-17-2319

472-27-2349

486-29-1786

648-92-1872

672-71-3249

712-45-1867

722-51-5454

724-80-9391

756-30-7391

807-91-6654

846-92-7186

899-46-2035

998-72-3567

(19 row(s) affected)

--------------------------------------------------------------------------------

Important The output for statements involving DISTINCT depends on the sort order chosen during installation and an ORDER BY clause. For information about the effects of different sort orders, see Sort Order.

--------------------------------------------------------------------------------

For the DISTINCT keyword, null values are considered to be duplicates of each other. When DISTINCT is included in a SELECT statement, only one NULL is returned in the results, regardless of how many null values are encountered.

--------------------------------------------------------------------------------

Note For compatibility with the SQL-92 standard and other implementations of Microsoft® SQL Server™, the ALL keyword can explicitly ask for all rows. However, there is no need to specify ALL because it is the default.

JamesLean · Apr 7, 2003

You can use the DISTINCT keyword to remove duplicates but it operates over the entire row - you would have to get rid of the column which was not duplicate.

If you do need that column, how do you know which value you want to keep? If it doesn't really matter you can use MAX or MIN on that column and group by the others:

Code:

SELECT col1, col2, col3, col4, MAX(col5)
FROM t1
GROUP BY col1, col2, col3, col4

--James

sugarthan · Apr 7, 2003

Go for the distinct fields.thats the solution

select distinct field1,field2,field3,field4,field5
from table1 ....

Sugarthan

JamesLean · Apr 7, 2003

sugarthan,
that won't work if the combination of all five columns is not unique. As I said, you would need to remove the non-unique column from the select list:

Code:

SELECT DISTINCT col1, col2, col3, col4
FROM t1

--James

sivi · Apr 7, 2003

Tom, Sugarthan, James
Many thanks for your prompt reply & help. I tried distinct, group by (with MAX()). But unfortunately I did not get the results I want.

I think its due to the complexity of my query. 3 columns out of 4 are formatted (compound) columns using fields from various tables. Here is my query.

I would appreciate if you could give some more help on this please.

Thanks
Sivi

set ansi_nulls off
select distinct
col1=c.an,
col2=dateadd(d,-day((cast(right(replicate('0',2)+convert(varchar(2),d.mID),2)+'/'+'01'+'/'+substring(b.cmonth,2,4) as datetime))),dateadd(m,1,(cast(right(replicate('0',2)+convert(varchar(2),d.mID),2)+'/'+'01'+'/'+substring(b.cmonth,2,4) as datetime) ))),
col3= case a.f when 'O' then b.rdate else ' ' end,
col4=convert(datetime, getdate(),112),
col5=case a.f when 'F' then b.rdate else ' ' end,

from cmast as b
join eq as a on b.ccode=a.ocode
join cli as c on b.ccode=c.clcod
join mcode as d on substring(b.cmonth,1,1)=d.contractmonthletter
where c.an is not NULL
and b.rdate>= dateadd(mm,-2,convert(char(8),getdate(),112))
and b.rdate<= dateadd(yy,1,convert(char(8),getdate(),112))

The results have to be unique on col 1,2,3 &5

JamesLean · Apr 7, 2003

In your query, Col4 is going to be the same for all rows anyway so the DISTINCT keyword should do what you want. Are you actually getting duplicate rows returned? --James

sivi · Apr 8, 2003

I had some test fields in the query. Thats why DISTINCT was not working. Its OK now.

Thanks again

Sivi

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Help on SQL Server - Selecting only the unique records

sivi

IS-IT--Management

TomKane

Programmer

JamesLean

Programmer

sugarthan

Programmer

JamesLean

Programmer

sivi

IS-IT--Management

JamesLean

Programmer

sivi

IS-IT--Management

Similar threads

Part and Inventory Search

Sponsor