Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Help Joining 2 tables on a distinct query

Status
Not open for further replies.

bmacbmac

IS-IT--Management
Jan 26, 2006
392
US
Hi... I am trying to select rows from tblpagerange based on a distinct query in tbldocumentinfo using the following statement:

SELECT tblpagerange.*
FROM tblPageRange
join tbldocumentinfo
on tbldocumentinfo.imageid = tblpagerange.imageid
where tbldocumentinfo.sdocumentnosort = (select distinct sdocumentnosort from tbldocumentinfo where sdocumentnosort between '6587490' and '6587491')

I get the following error:
Server: Msg 512, Level 16, State 1, Line 1
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.


I think my query is close, but something just isn't correct in my 'where' statement. Can anyone help me narrow this down?
 
Code:
SELECT tblpagerange.*
       FROM tblPageRange
       join tbldocumentinfo on tbldocumentinfo.imageid =
                               tblpagerange.imageid
where tbldocumentinfo.sdocumentnosort between '6587490' and '6587491'

No need of subquery in Where clause, you will get exact the same records.

Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
 
Just as an idea, change:

where tbldocumentinfo.sdocumentnosort = (select distinct sdocumentnosort from tbldocumentinfo where sdocumentnosort between '6587490' and '6587491')

To:

where tbldocumentinfo.sdocumentnosort IN (select distinct sdocumentnosort from tbldocumentinfo where sdocumentnosort between '6587490' and '6587491')

I'm not sure if that is what you want, but you can't use the = sign when the subquery returns more than 1 result
 
Thank you Borislav... That statement will work, however, I need only the Distinct entries. Many of the sdocumentnosort entries are duplicated.
 
Gixonita - I tried using IN, but it gave me all entries.. not the distinct ones.
 
I prefer using Group By clauses instead of Distinct, they tend to be more efficient and allow me more control over what data is actually being "thrown out"

Try the following:

Code:
SELECT tblpagerange.*
FROM tblPageRange
JOIN tbldocumentinfo
ON tbldocumentinfo.imageid = tblpagerange.imageid
where tbldocumentinfo.sdocumentnosort in 
(SELECT sdocumentnosort FROM tbldocumentinfo WHERE sdocumentnosort BETWEEN '6587490' AND'6587491' GROUP BY sdocumentnosort)

or an even better way, assuming you have indexes is to do what borislav suggested, but to remove the duplicates, use the GROUP BY and don't use "*" to select all.
 
oh, and looking at the query i just noticed that you are passing numbers as strings?

Code:
'6587490' AND '6587491'

is sdocumentnosort really string data types that you are trying to evaluate as numeric/int? or are they numeric/int datatypes in the database? If so you don't need the single quotes around them.
 
Thanks Jhall01.. That is a good idea, however the query still returns ALL entries. There are about 150,000 rows in my database, but only 84,000 are distinct sdocumentnosort.

My "*" is to select all columns in tblpagerange for the 84,000 distinct tbldocumentinfo.sdocumentnosort entries.

If I just run a select in tbldocumentinfo:

select distinct(sdocumentnosort) from tbldocumentinfo
where sdocumentnosort in (select distinct sdocumentnosort from tbldocumentinfo where sdocumentnosort between '6587490' and '6587491')

The results are great.. 84,000 rows. I just can't find a way to incorporate that into a JOIN with another table.
 
This query ALWAYS will returns ALL results.
Let's say you have this table:
Code:
MyTable
Fld1  Fld2    Fld3 UniqFld
int   int     int    int
-----------------------
1      1       1     1
1      1       1     2
1      1       1     3
1      1       1     4
2      1       1     5
2      1       1     6
3      1       1     7
3      1       1     8

Here your query:
Code:
SELECT * FROM MyTable WHERE Fld1 IN (SELECT DISTINCT Fld1 FROM MyTable WHERE Fld1 BETWEEN 1 AND 2)
--- That means select all records that have Fld1 = 1 or 2
--- BUT ALL Records have Fld1 1 or 2 (except last 2)

My query:
Code:
SELECT * FROM MyTable WHERE Fld1 WHERE Fld1 BETWEEN 1 AND 2

--- That means select all records that have Fld1 = 1 or 2
--- wich is the same as your, except that it not use subquery :-)

If you want disctinct values, youmust first deside wich fields you need and than do something like:
Code:
SELECT DISTINCT Fld1, Fld2, Fld3 FROM MyTable WHERE Fld1 BETWEN 1 AND 2

That will returns you two records:
1 1 1
2 1 1

But If you include the last field
Code:
SELECT DISTINCT Fld1, Fld2, Fld3, Fld4 FROM MyTable WHERE Fld1 BETWEN 1 AND 2

that will returns you again all records from 1 to 6

So, first you must deside wich fields you need and then create query to get only these fields with no duplicate rows based on that fields.



Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
 
ok.. I think I'm getting it. So how do I handle this scenario:

my tbldocumentinfo has 6 columns:
irecordid
sdocumentno
sbookno
sdocumentnosort
sbooknosort
imageid

I need my statement results in the order of these columns and want my distinct function based ONLY on the sdocumentnosort.

if i do:

select distinct irecordid, sdocumentno, sbookno, sodocumentnosort, sbooknosort, imageid
from tbldocumentinfo
where sdocumentnosort between '6587490' AND'6587491'

I will get ALL records because each irecordid is unique.


if i do:

select distinct(sdocumentnosort), irecordid, sdocumentno, sbookno, sbooknosort, imageid
from tbldocumentinfo
where sdocumentnosort between '6587490' AND'6587491'

The correct results come up, but the columns are not in the right places.

Is there a way to get
select irecordid, sdocumentno, sbookno, distinct(sdocumentnosort), sbooknosort, imageid?

I think this is the original reason I was trying to use DISTINCT in my WHERE statement.
 
If you didn't care about values of other columns, you could use:
Code:
SELECT MAX(irecordid)   AS irecordid,
       MAX(sdocumentno) AS sdocumentno,
       MAX(sbookno)     AS sbookno,
       sdocumentnosort
       MAX(sbooknosort) AS sbooknosort,
       MAX(imageid)     AS imageid
FROM MyTable
GROUP BY sdocumentnosort

BUT if you want some Full records you must decide wich ones. I don't know how to help, this is only theoretical conversatrion :) If you post some data and show what result you want from that data, I am sure somebody could help you.



Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
 
Good idea bborissove. Yes, unfortunately I will care about the values in other columns as they tie to other tables. Not only that, but the query gave me about 350,000 hits.

As a data example, If I run the following:

Code:
select irecordid, sdocumentno, sbookno, sdocumentnosort, sbooknosort  
from tbldocumentinfo
where sdocumentnosort between '6566148' and '6566155'

I get the following results:

354536 566148 1197 6566148 41197
354537 566149 1197 6566149 41197
354538 566149 1197 6566149 41197
354539 566150 1197 6566150 41197
354540 566150 1197 6566150 41197
620806 566151 1197 6566151 41197
437263 566152 1197 6566152 41197
437270 566153 1197 6566153 41197
437251 566154 1197 6566154 41197
437253 566155 1197 6566155 41197

Documentnosort is duplicated a few times. I would like these exact results, but with a distinct sdocumentnosort. For example:

354536 566148 1197 6566148 41197
354538 566149 1197 6566149 41197
354540 566150 1197 6566150 41197
620806 566151 1197 6566151 41197
437263 566152 1197 6566152 41197
437270 566153 1197 6566153 41197
437251 566154 1197 6566154 41197
437253 566155 1197 6566155 41197
 
Code:
DECLARE @cTest TABLE (irecordid    int,
                      sdocumentno  int,
                      sbookno      int,
                      sdocumentnosort int,
                      sbooknosort int)
INSERT INTO @cTest VALUES(354536,566148,1197,6566148,41197)
INSERT INTO @cTest VALUES(354537,566149,1197,6566149,41197)
INSERT INTO @cTest VALUES(354538,566149,1197,6566149,41197)
INSERT INTO @cTest VALUES(354539,566150,1197,6566150,41197)
INSERT INTO @cTest VALUES(354540,566150,1197,6566150,41197)
INSERT INTO @cTest VALUES(620806,566151,1197,6566151,41197)
INSERT INTO @cTest VALUES(437263,566152,1197,6566152,41197)
INSERT INTO @cTest VALUES(437270,566153,1197,6566153,41197)
INSERT INTO @cTest VALUES(437251,566154,1197,6566154,41197)
INSERT INTO @cTest VALUES(437253,566155,1197,6566155,41197)

SELECT cTest.*
       FROM @cTest cTest
INNER JOIN (SELECT MAX(irecordid) AS irecordid,
                   sdocumentnosort
            FROM @cTest
                 GROUP BY sdocumentnosort) cTest1
ON cTest.irecordid = cTest1.irecordid

Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
 
Thanks Borislav. I have about 150,000 rows of data so manually inserting values will not be an option.
 
bmacbmac,

I think you missed one subtle (but important) ascpect of Borislav's most recent post. You see... he took that sample data you posted and inserted it in to a table variable (@cTest). Then he wrote a query that uses the sample data.

You should copy/paste the code the he presented in to Query Analyzer. Smash the 'F5' to test it. If you get the results you expect based on the sample data, then you should remove the insert stuff, replace @cTest with your real table name, and then mash the 'F5' key again.

With any luck, you'll get the results you were expecting.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 


Just a question for jhall01:

>I prefer using Group By clauses instead of Distinct, they tend to be more efficient

Is this really true for ms sql server?!

I always experienced in other dbmss that "Group By" is much much(!) slower than "Distinct" when just wanting to filter out duplicates.
And just to mention, when using "Distinct" with an "inner join" with a "1 to n" relation, using "Exists" is much faster than using "Distinct".

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top