Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Query problem 1

Status
Not open for further replies.

Jocsta

Programmer
Jun 14, 2002
204
GB
I am trying to duplicate a query from sqlserver which returns a single row from a one to many join.

E.g. (rough example) you have a table of people, this is linked to a table with their phone numbers (of which they may have many). The query needs to return all the people, and just one (doesnt matter which really for now) of their number (the list is intended to be expanded later - which is why we only want one row).

So people table :-
ID, NAME
1,tom
2,dick
3,harry
...
And the numbers :-
UID, OWNER, NUMBER
1, 1, 1234567
2, 1, 1231231
3, 2, 11223344
...

What I want is a query that gets
tom, 1234567
dick, 11223344
harry,
...


In SQLServer this has been done kinda like :-
SELECT People.Name, Phone.Number
FROM People
LEFT OUTER JOIN Phone on People.id=Phone.owner
and (Phone.UID=(select top 1 UID from Phone WHERE Owner=People.ID))

But I seem to be unable to do something similar in Oracle.
Suggestions welcome :)
Thanks,

<< JOC >>
 

Sice TOP function won't give you certain rows of results,
so it seems you don't care which phone number you get from the phone table in the case that one people has multiple numbers in the phone table. Then try this:

SELECT People.Name, Ph.Number
FROM People
LEFT OUTER JOIN (select owner, max(number) number from Phone group by owner ) ph
on People.id = ph.owner
 
Thanks for your suggestion, and you are correct in saying we dont care about the order, its just more an indication there is something - then the user is given an option to show everything as a more detailed version.

The problem is slightly more complicated than I let on in my made up example, and the data being returned from the joined table might not be just the one row - rather than phone, make it an address table with multiple columns, so its the row from the joined table that is important rather than an individual column.
 
akhhh.... I meant "the data being returned from the joined table might not be just the one COLUMN" .....
 
You have a couple of options:

Code:
select people.name, phone.number
from people
left outer join
     (select owner, number
         from phone p1
         where uid = 
         (select max(p2.uid) from phone p2
          where p2.owner = p1.owner)) phone
on People.id = ph.owner

select name, number
from 
(select name, number,
row_Number() over (partition by ph.owner order by ph.uid) as rn
from people
LEFT OUTER JOIN phone
on People.id = ph.owner)
where rn=1
 
Thanks for the reply - your first example works for me (I think your second is for sqlserver2005 tho? ).
It does seem slower than the sqlserver equivalent, but allows me to carry on working for now and benchmark it properly later.
I do feel pretty stupid now as its a fairly obvious answer, I just got kinda tied up trying to use rownum - doh.

Thanks again
 
The second example will work on Oracle Enterprise Edition, providing you have 9i or greater. I'm not sure what you mean by "sqlserver2005" - I assumed you were using Oracle.
 
Sorry for the missunderstanding ....
I am working on Oracle 10g at the moment (but in a multi database environment - originally sqlserver2000 tho).

Its just that earlier today I was working on SQLServer2005 looking at speeding up our paged results code, and came across the "new" Microsoft feature "row_Number() over (order by 1,2,...)" - it was the first time I saw it, and the "row_nember()" bit stuck in my mind, didnt notice the differences elsewhere....

Should have known Oracle probably had it first tho :D
 
Just as an FYI -
The performance of the suggested solutions was unfortunately unusable in our environment, so I had to give up on using the condition in the from clause, and had to put it in the where -

select people.name, phone.number
from people PE
left outer join phone PH on PE.id=PH.owner
where (PH.uid is null or PH.uid=
(select min(PH2.UID) from phone PH2 where PE.id=PH2.owner)

This was massively faster, and seems a bit old fashioned to me as I hate having to have the null check, but hey it works good, and thats what matters. New and shiny is not always best :)


<< JOC >>
 
I'm surprised there is much difference. I'd be interested in comparing the explain plans for the two queries.

Did you try the row_number version ? If you're on 10g, I would expect that to work.
 
Im kinda curious myself why such a difference, im not a database guy, just a programmer with enough SQL knowlege to get me into trouble (and a company too tight to hire a DBA, but doesnt worry about getting a fleet of BMW's for the sales team....grrrr) :)

Today, the other queries do seem to be returning faster, but are still significantly slower than the new one. The plans seem a lot more expensive to me too ( using the "bigger number = bad" school of optimizing :D ) and of course theres still the probability that I totally missunderstood you.


Anyway, I'll try and provide as much relevant info as possible......
The 3 tables involved are -

CREATE TABLE "SEARCH_RESULTS"
( "SEARCH_ID" NUMBER(16,0), "ENTITY_ID" NUMBER(16,0), "HIGHLIGHT" NUMBER(10,0),
"WEIGHTING" NUMBER(10,0), "AVAILABILITY" NUMBER(10,0)
)
CREATE UNIQUE INDEX "PK_SEARCH_RESULTS" ON "SEARCH_RESULTS" ("SEARCH_ID", "ENTITY_ID")

- this table holds all the results of previous searches .... only really interested in the first 2 columns here tho. It contains 3.5 million rows atm, and an average SEARCH_ID might have 10-10000 entity_id rows

CREATE TABLE "PROP_PERSON_GEN"
( "BISUNIQUEID" NUMBER(16,0), "REFERENCE" NUMBER(16,0), "TITLE" NUMBER(16,0),
"FIRST_NAME" NVARCHAR2(30), "MIDDLE_NAME" NVARCHAR2(30), "LAST_NAME" NVARCHAR2(30),
"FULLNAME" NVARCHAR2(100), "SALUTATION" NVARCHAR2(30), "PERSON_ID" NUMBER(16,0)
)
CREATE UNIQUE INDEX "SYS_C0014235" ON "PROP_PERSON_GEN" ("BISUNIQUEID")
CREATE INDEX "IX_PERSON_GEN_REFERENCE" ON "CONFIGDOMAIN1"."PROP_PERSON_GEN" ("REFERENCE")
CREATE INDEX "IX_PERSON_GEN_FIRST_NAME" ON "CONFIGDOMAIN1"."PROP_PERSON_GEN" ("FIRST_NAME")
CREATE INDEX "IX_PERSON_GEN_FULLNAME" ON "CONFIGDOMAIN1"."PROP_PERSON_GEN" ("FULLNAME")
CREATE INDEX "IX_PERSON_GEN_LAST_NAME" ON "CONFIGDOMAIN1"."PROP_PERSON_GEN" ("LAST_NAME")
CREATE INDEX "IX_PERSON_GEN_PERSON_ID" ON "CONFIGDOMAIN1"."PROP_PERSON_GEN" ("PERSON_ID")

- This holds name details for people. Currently holds 440,000 rows. bisuniqueid IS SET BY A TRIGGER/SEQUENCE


CREATE TABLE "PROP_ADDRESS"
( "BISUNIQUEID" NUMBER(16,0), "OCC_ID" NUMBER(16,0), "REFERENCE" NUMBER(16,0), "STREET1" NVARCHAR2(50),
"STREET2" NVARCHAR2(30), "LOCALITY" NVARCHAR2(30), "TOWN" NVARCHAR2(30), "COUNTY" NVARCHAR2(30),
"POST_CODE" NVARCHAR2(8), "COUNTRY" NUMBER(16,0), "MAP" NVARCHAR2(100)
)

CREATE UNIQUE INDEX "SYS_C0014196" ON "CONFIGDOMAIN1"."PROP_ADDRESS" ("BISUNIQUEID")
CREATE INDEX "IX_ADDRESS_REFERENCE" ON "CONFIGDOMAIN1"."PROP_ADDRESS" ("REFERENCE")

-- This holds addresses :) - currently has 500,000 rows

So thats the tables, all pretty basic.....



Here is the *actual* test query im using that works fast... page of results returned in 0.017 seconds :-

select * from ( select A.*, rownum rnum from (
SELECT "ROOT"."ENTITY_ID" AS "E01ID", PROP_PERSON_GEN.FULLNAME, PROP_ADDRESS.STREET1, PROP_ADDRESS.TOWN
FROM "SEARCH_RESULTS" "ROOT"
LEFT OUTER JOIN "PROP_PERSON_GEN" "PROP_PERSON_GEN" ON "ROOT"."ENTITY_ID"="PROP_PERSON_GEN"."REFERENCE"
LEFT OUTER JOIN PROP_ADDRESS ON "ROOT"."ENTITY_ID"="PROP_ADDRESS"."REFERENCE"
WHERE "ROOT"."SEARCH_ID"=10027
AND (PROP_ADDRESS.BISUNIQUEID is null OR PROP_ADDRESS.BISUNIQUEID =
(select min(A.BISUNIQUEID) from PROP_ADDRESS A WHERE A.REFERENCE=ROOT.ENTITY_ID))
order by E01ID
) A WHERE rownum < 20 ) WHERE rnum > 10


And the plan (CSV format - kinda new to this so am using SQLDeveloper to dump it)-

"Optimizer","Cost","Cardinality","Bytes","Partition Start","Partition Stop","Partition Id","ACCESS PREDICATES","FILTER PREDICATES"
"SELECT STATEMENT","ALL_ROWS","99","19","4028","","","","",""
"VIEW","","99","19","4028","","""RNUM"">10","","",""
"COUNT(STOPKEY)","","","","","","ROWNUM<20","","",""
"VIEW","","99","19","3781","","","","",""
"FILTER","","","","","","""PROP_ADDRESS"".""BISUNIQUEID"" IS NULL OR ""PROP_ADDRESS"".""BISUNIQUEID""= (SELECT /*+ */ MIN(""A"".""BISUNIQUEID"") FROM ""PROP_ADDRESS"" ""A"" WHERE ""A"".""REFERENCE""=:B1)","","",""
"NESTED LOOPS(OUTER)","","99","19","2508","","","","",""
"NESTED LOOPS(OUTER)","","42","19","1311","","","","",""
"INDEX(RANGE SCAN) CONFIGDOMAIN1.PK_SEARCH_RESULTS","ANALYZED","3","19","190","""ROOT"".""SEARCH_ID""=10027","","","",""
"TABLE ACCESS(BY INDEX ROWID) CONFIGDOMAIN1.PROP_PERSON_GEN","ANALYZED","3","1","59","","","","",""
"INDEX(RANGE SCAN) CONFIGDOMAIN1.IX_PERSON_GEN_REFERENCE","ANALYZED","2","1","","""ROOT"".""ENTITY_ID""=""PROP_PERSON_GEN"".""REFERENCE""(+)","","","",""
"TABLE ACCESS(BY INDEX ROWID) CONFIGDOMAIN1.PROP_ADDRESS","ANALYZED","3","1","63","","","","",""
"INDEX(RANGE SCAN) CONFIGDOMAIN1.IX_ADDRESS_REFERENCE","ANALYZED","2","1","","""ROOT"".""ENTITY_ID""=""PROP_ADDRESS"".""REFERENCE""(+)","","","",""
"SORT(AGGREGATE)","","","1","11","","","","",""
"TABLE ACCESS(BY INDEX ROWID) CONFIGDOMAIN1.PROP_ADDRESS","ANALYZED","4","1","11","","","","",""
"INDEX(RANGE SCAN) CONFIGDOMAIN1.IX_ADDRESS_REFERENCE","ANALYZED","3","1","","""A"".""REFERENCE""=:B1","","","",""




and the one without the analytic function - page of results returned in 2.82 seconds :-

select * from ( select A.*, rownum rnum from (
SELECT ROOT.ENTITY_ID AS E01ID, PROP_PERSON_GEN.FULLNAME, PROP_ADDRESS.STREET1, PROP_ADDRESS.TOWN
FROM SEARCH_RESULTS ROOT
LEFT OUTER JOIN PROP_PERSON_GEN PROP_PERSON_GEN ON ROOT.ENTITY_ID=PROP_PERSON_GEN.REFERENCE
LEFT OUTER JOIN (SELECT * from PROP_ADDRESS ADD1
WHERE BISUNIQUEID=(SELECT MAX(ADD2.BISUNIQUEID)
FROM PROP_ADDRESS ADD2 WHERE ADD1.REFERENCE=ADD2.REFERENCE)
) PROP_ADDRESS ON ROOT.ENTITY_ID=PROP_ADDRESS.REFERENCE

WHERE ROOT.SEARCH_ID=10027
order by E01ID
) A WHERE rownum < 20 ) WHERE rnum > 10

The plan :-

"Optimizer","Cost","Cardinality","Bytes","Partition Start","Partition Stop","Partition Id","ACCESS PREDICATES","FILTER PREDICATES"
"SELECT STATEMENT","ALL_ROWS","8730","19","4028","","","","",""
"VIEW","","8730","19","4028","","","","","""RNUM"">10"
"COUNT(STOPKEY)","","","","","","","","","ROWNUM<20"
"VIEW","","8730","19","3781","","","","",""
"MERGE JOIN(OUTER)","","8730","19","3021","","","","",""
"NESTED LOOPS(OUTER)","","42","19","1311","","","","",""
"INDEX(RANGE SCAN) CONFIGDOMAIN1.PK_SEARCH_RESULTS","ANALYZED","3","19","190","","","","""ROOT"".""SEARCH_ID""=10027",""
"TABLE ACCESS(BY INDEX ROWID) CONFIGDOMAIN1.PROP_PERSON_GEN","ANALYZED","3","1","59","","","","",""
"INDEX(RANGE SCAN) CONFIGDOMAIN1.IX_PERSON_GEN_REFERENCE","ANALYZED","2","1","","","","","""ROOT"".""ENTITY_ID""=""PROP_PERSON_GEN"".""REFERENCE""(+)",""
"SORT(JOIN)","","8688","1","90","","","","""ROOT"".""ENTITY_ID""=""PROP_ADDRESS"".""REFERENCE""(+)","""ROOT"".""ENTITY_ID""=""PROP_ADDRESS"".""REFERENCE""(+)"
"VIEW","","8688","1","90","","","","",""
"FILTER","","","","","","","","","""ADD1"".""BISUNIQUEID""=MAX(""ADD2"".""BISUNIQUEID"")"
"HASH(GROUP BY)","","8688","1","148","","","","",""
"HASH JOIN","","8613","501400","74207200","","","","""ADD1"".""REFERENCE""=""ADD2"".""REFERENCE""",""
"TABLE ACCESS(FULL) CONFIGDOMAIN1.PROP_ADDRESS","ANALYZED","2243","501400","5515400","","","","",""
"TABLE ACCESS(FULL) CONFIGDOMAIN1.PROP_ADDRESS","ANALYZED","2257","501400","68691800","","","","",""





and finally the query using the analytic function - page of results returned in 5.2 seconds :-

select * from ( select A.*, rownum rnum from (

SELECT "ROOT"."ENTITY_ID" AS "E01ID", PROP_PERSON_GEN.FULLNAME, PROP_ADDRESS.STREET1, PROP_ADDRESS.TOWN
FROM SEARCH_RESULTS ROOT

LEFT OUTER JOIN PROP_PERSON_GEN PROP_PERSON_GEN ON ROOT.ENTITY_ID=PROP_PERSON_GEN.REFERENCE
LEFT OUTER JOIN ( SELECT * FROM
( SELECT REFERENCE, BISUNIQUEID, STREET1, STREET2, LOCALITY, TOWN, COUNTY, POST_CODE, COUNTRY,
row_Number() over (partition by reference order by BISUNIQUEID) as rn
from PROP_ADDRESS ) WHERE rn < 2) PROP_ADDRESS
ON ROOT.ENTITY_ID=PROP_ADDRESS.REFERENCE

WHERE ROOT.SEARCH_ID=10027
order by E01ID

) A WHERE rownum < 20 ) WHERE rnum > 10


And the plan for that :-

"Optimizer","Cost","Cardinality","Bytes","Partition Start","Partition Stop","Partition Id","ACCESS PREDICATES","FILTER PREDICATES"
"SELECT STATEMENT","ALL_ROWS","30274","19","4028","","","","",""
"VIEW","","30274","19","4028","","","","","""RNUM"">10"
"COUNT(STOPKEY)","","","","","","","","","ROWNUM<20"
"VIEW","","30274","303550","60406450","","","","",""
"SORT(ORDER BY STOPKEY)","","30274","303550","52210600","","","","","ROWNUM<20"
"HASH JOIN(OUTER)","","18738","303550","52210600","","","","""ROOT"".""ENTITY_ID""=""from$_subquery$_007"".""REFERENCE""(+)",""
"HASH JOIN(OUTER)","","4894","303550","20944950","","","","""ROOT"".""ENTITY_ID""=""PROP_PERSON_GEN"".""REFERENCE""(+)",""
"INDEX(RANGE SCAN) CONFIGDOMAIN1.PK_SEARCH_RESULTS","ANALYZED","1200","303550","3035500","","","","""ROOT"".""SEARCH_ID""=10027",""
"TABLE ACCESS(FULL) CONFIGDOMAIN1.PROP_PERSON_GEN","ANALYZED","1909","430000","25370000","","","","",""
"VIEW","","9923","501400","51644200","","","","","""RN""(+)<2"
"WINDOW(SORT PUSHED RANK)","","9923","501400","31588200","","","","","ROW_NUMBER() OVER ( PARTITION BY ""REFERENCE"" ORDER BY ""BISUNIQUEID"")<2"
"TABLE ACCESS(FULL) CONFIGDOMAIN1.PROP_ADDRESS","ANALYZED","2255","501400","31588200","","","","",""


Enjoy :D

<< JOC >>
 
The format of your queries implies that you are mainly interested in getting the first few rows quickly rather than the execution time for the whole query. Queries involving OLAP functions would normally work better if you want the whole data set, because they involve a sort.

The second version of the query is doing a merge and a hash join. These are both operations which also involve sorts and aren't really therefore conducive to getting a quick response time for the first few rows.

The first version does entirely nested loops, which means that it grabs a row from the driving table and then uses indexes to find the matching rows on the other tables. This is generally a faster plan if you just want to see a few rows, as there is no sorting operation. However, if you were to execute the whole query from end to end, it's possible it could take longer than the other approaches.

You might be able to get the second query to run in a similar way to the first one by adding a hint such as:

Code:
select /*+ FIRST_ROWS(20) */ * from ( select A.*, rownum rnum from (
SELECT  "ROOT"."ENTITY_ID" AS "E01ID", PROP_PERSON_GEN.FULLNAME, PROP_ADDRESS.STREET1, PROP_ADDRESS.TOWN
  FROM SEARCH_RESULTS ROOT 

  LEFT OUTER JOIN PROP_PERSON_GEN PROP_PERSON_GEN ON ROOT.ENTITY_ID=PROP_PERSON_GEN.REFERENCE 
  LEFT OUTER JOIN ( SELECT * FROM 
                        ( SELECT REFERENCE, BISUNIQUEID, STREET1, STREET2, LOCALITY, TOWN, COUNTY, POST_CODE, COUNTRY, 
                          row_Number() over (partition by reference order by BISUNIQUEID) as rn
                         from PROP_ADDRESS ) WHERE rn < 2) PROP_ADDRESS
                          ON ROOT.ENTITY_ID=PROP_ADDRESS.REFERENCE

    WHERE ROOT.SEARCH_ID=10027
    order by E01ID 

  ) A WHERE rownum < 20 ) WHERE rnum > 10

or 

select * from ( select A.*, rownum rnum from (
    SELECT  /*+ USE_NL(ROOT PROP_PERSON_GEN PROP_ADDRESS */ "ROOT"."ENTITY_ID" AS "E01ID", PROP_PERSON_GEN.FULLNAME, PROP_ADDRESS.STREET1, PROP_ADDRESS.TOWN
    FROM "SEARCH_RESULTS" "ROOT" 
        LEFT OUTER JOIN "PROP_PERSON_GEN" "PROP_PERSON_GEN" ON "ROOT"."ENTITY_ID"="PROP_PERSON_GEN"."REFERENCE" 
        LEFT OUTER JOIN PROP_ADDRESS ON "ROOT"."ENTITY_ID"="PROP_ADDRESS"."REFERENCE"
    WHERE "ROOT"."SEARCH_ID"=10027
        AND (PROP_ADDRESS.BISUNIQUEID is null OR PROP_ADDRESS.BISUNIQUEID = 
            (select min(A.BISUNIQUEID) from PROP_ADDRESS A WHERE A.REFERENCE=ROOT.ENTITY_ID))
    order by E01ID 
  ) A WHERE rownum < 20 ) WHERE rnum > 10
 
The hints didnt seem to make any noticable difference to the overally time of the query, and in a way im glad, as none of these queries will be "hard coded" and are all completely dynamic, so a "pure" query relying on the optimizer would be prefered to a hand tuned one.
The one area I hadnt explored just yet was sorting the rows returned (which is a requirement) this has slowed things down quite a bit ( using ORDER BY PROP_PERSON_GEN.FULLNAME, PROP_ADDRESS.STREET1,E01ID rather then just E01ID ) - but its still under 2 seconds at the moment on my desktop machine, so thats no great worry atm)

Once again tho, thanks for your help, time and knowledge

<< JOC >>
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top