Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to join one to many accurately?

Status
Not open for further replies.

hanchilicious

Programmer
May 23, 2002
156
GB
Hi,

How do I join these two tables accurately?

I have three records in the FORECAST table:
Code:
Customer  Product  Wk04  Wk05  Wk06
-----------------------------------
TANDY     PLY001      4     3     2
ULYSSES   FEV999      5     2     1
NEWGUY    RUF101                  9
and I want to join them to the correct corresponding records in the SALES table:
Code:
Customer  Product  Area  Agent  Wk01  Wk02  Wk03
------------------------------------------------
TANDY     PLY001   GR31  10001    10    11    12
TANDY     PLY001   SP20  12302     0     0     0
ULYSSES   FEV999   BG27   9999     0     0     0
ULYSSES   FEV999   BG27  88882    15    15    15
OLDGUY    OLD001   ED11  10101     1     1     1
Here're the rules:
(a) I never want to assign any records from FORECAST to SALES where the Area = SP20.

(b) I have to support a FULL OUTER JOIN as there may be independent records in either FORECAST or SALES which need to be represented.

(c) In the case where there are multiple Agents per Customer and Product, only one of the Agents will have values in the SALES.WK* columns, and that is the row that should have the FORECAST row linked to it.


The end result should look something like this:
Code:
Customer  Product  Area  Agent  Wk01  Wk02  Wk03  Wk04  Wk05  Wk06
------------------------------------------------------------------
TANDY     PLY001   GR31  10001    10    11    12     4     3     2
TANDY     PLY001   SP20  12302     0     0     0  null  null  null
ULYSSES   FEV999   BG27   9999     0     0     0  null  null  null
ULYSSES   FEV999   BG27  88882    15    15    15     5     2     1
NEWGUY    RUF101   null  null   null  null  null  null  null     9
OLDGUY    OLD001   ED11  10101     1     1     1  null  null  null

I don't seem to be able to support all three rules at the same time without losing some records from the forecast table... I'm trying full outer joins with subqueries to alienate the agent I need, but either the subqueries complain that I'm returning more than one value, or I'm missing records with multiple agents or records that only exist in one table.

Can someone point me in the right direction?

Thanks a lot!
 
<off-topic>

If I, too, was "hanchilicious", what would I look like or be like?

</off-topic>

Could you please post the code that has brought you closest to the results you want?

Also, in cases where you receive an error that complains about more than one row returning from a query where only one should return, I have found that using a MAX() or MIN() function on the returning data forces a single-row result.

So, let us know what you have.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
 
Hiya Mufasa!

<off-topic>

I'm afraid that right now, the answer to that is: frazzled

</off-topic>

This is what I started with, in trying to insert the combined data into a target table:
Code:
FULL OUTER JOIN FORECAST
ON  (SALES.CUSTOMER = FORECAST.CUSTOMER) 
AND (SALES.PRODUCT = FORECAST.PRODUCT)
AND (SALES.AGENT = 
       (select max(agent) from sales 
        where ts is not null and ts <> 0
        and area <> 'SP20'
        group by customer,product,area))
-- *ts is a total wk to date column capturing all sales.wks
but I still get indistinct results errors.

Then, it evolved into something like this: (forgive me, I'm not at a terminal right now, but you can get the idea hopefully):
Code:
FROM SALES
FULL OUTER JOIN 
(select forecast.customer,
    forecast.product,
    forecast.area, --(a function derives and populates this field from SALES.area values)
    avg(wk01) as week01,
    avg(wk02) as week02,
    avg(wk03) as week03,
    avg(wk04) as week04,
    avg(wk05) as week05,
    avg(wk06) as week06,
    avg(wk07) as week07,
    avg(wk08) as week08,
    avg(wk09) as week09,
    avg(wk10) as week10,
    avg(wk11) as week11,
    avg(wk12) as week12,
    avg(wk13) as week13,
    avg(wk14) as week14,
    (select max(agent) from sales 
        where ts is not null and ts <> 0
        and area <> 'SP20'
        group by customer,product,area) as fake_agent
    from forecast
      full outer join sales on
          sales.product  = forecast.product and
          sales.customer = forecast.customer
    group by 
          forecast.customer,
          forecast.product,
          forecast.area) tableB
on sales.product = tableB.product and
sales.customer = tableB.customer and
sales.agent = tableB.fake_agent
I have the sneaky suspicion that this second way might be the equivalent of using a mallet to crack an egg. Plus, it doesn't seem to respect a proper full outer join as I end up missing rows from either table.

I know I should be doing something in the join to capture what I'm after, but I've no idea how to do it. :/
 
Hanchi, I'm certain that we can come up with solution for you. My distraction is that I'm just trying to get out the door for a 300-mile/4-hour drive from Salt Lake City to St. George, Utah, for my last daughter's wedding.

When I get to my destination, I shall put together code that should do what you want.

In the meantime, if other distinguished colleagues want to contribute, please have at it. I'll try to be back on-line at about 10 p.m. (Mountain Standard Time, GMT-7).

Cheers,

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[I provide low-cost, remote Database Administration services: www.dasages.com]
 
Thank you very much, Mufasa. Congratulations and good luck with the wedding!
 
Sorry, I was a bit too excitable with the [Submit Post] button just now...

I wanted to post additional details to my problem spec. Where I say:

"(c) In the case where there are multiple Agents per Customer and Product, only one of the Agents will have values in the SALES.WK* columns, and that is the row that should have the FORECAST row linked to it."

That's not turned out to be true for 100% of the cases. In a very few cases, one of the agents will have a very small quantity of sales, while the other agent will have a higher quantity. So, code which is driven on only one of the agents having values in the Wk* columns would not capture these scenarios. I guess I need to latch the FORECAST record to the max(sales.ts)* or highest sum of the same.

*(SALES.TS being the aggregation of all SALES.Wk* columns.)

Thanks again.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top