Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Duplicates when matching 1

Status
Not open for further replies.

tshad

Programmer
Jul 15, 2004
386
US
I have a couple of tables that I am trying to join together and I get duplicates, which I understand.

I am importing information (staging) that I use to add a PO to the shipping table (ship). Both tables have the same shipping number that I am joining on. I also use the quantities from both tables to determine which record the PO goes to.

If the quantities are the same, it doesn't matter which record the PO goes on as long as each record has a different PO on it.

I have 6 rows in each file and expect to get 6 rows back but I get 14. If there are no duplicate quantities it work correctly.

Here is the schema and query.

Code:
If OBJECT_ID('tempdb..#Stage') IS NOT NULL
	DROP TABLE #Stage


If OBJECT_ID('tempdb..#Ship') IS NOT NULL
	DROP TABLE #Ship


CREATE TABLE #Stage
(
	StageID int,
	ShipNumber varchar(10),
	PONumber varchar(10),
	iQty int
)

CREATE TABLE #Ship
(
	ShipID int,
	ShipmentNumber varchar(10),
	Units int
)

INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (20, 'MZ12345', 'A4735', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (22, 'MZ12345', 'B7236', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (23, 'MZ12345', 'L4432', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (35, 'MZ12345', 'J3321', 40)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (36, 'MZ12345', 'U8911', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (32, 'MZ12345', 'V3199', 10)

Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (250, 'MZ12345', 40)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (251, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (252, 'MZ12345', 25)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (253, 'MZ12345', 25)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (255, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (300, 'MZ12345', 10)

SELECT StageID, ShipID, PONumber, iQty, Units
FROM #Stage st
JOIN #Ship s
ON st.ShipNumber = s.ShipmentNumber
AND st.iQty = s.Units

The results are:

Code:
StageID	ShipID	PONumber	iQty	Units
35	250	J3321	40	40
20	251	A4735	10	10
23	251	L4432	10	10
32	251	V3199	10	10
22	252	B7236	25	25
36	252	U8911	25	25
22	253	B7236	25	25
36	253	U8911	25	25
20	255	A4735	10	10
23	255	L4432	10	10
32	255	V3199	10	10
20	300	A4735	10	10
23	300	L4432	10	10
32	300	V3199	10	10

Anyone know how I could change this query to get only 6 rows with the POs in the right place.

Thanks,

Tom
 
Maybe do something with ROW_NUMBER() to assign unique numbers to the rows where shipment number and units are the same, and then join on that, too. I'm thinking something like this:

Code:
With StageNumbered AS 
  (SELECT StageID, ShipNumber, PONumber, iQty,
          RECORD_NUMBER OVER (PARTITION BY PONumber, iQty) AS RNum
     FROM #Stage), 

ShipNumbered AS 
   (SELECT ShipID, ShipmentNumber, Units, 
           RECORD_NUMBER OVER (PARTITION BY ShipmentNumber, Units) AS RNum)

SELECT StageID, ShipID, PONumber, iQty, Units
FROM StageNumbered st
JOIN ShipNumbered s
ON st.ShipNumber = s.ShipmentNumber
AND st.iQty = s.Units 
AND st.RNum = s.RNum

Untested, but should give the idea.

Tamar
 
I tried that on my 2005 machine and got the errors:

Msg 156, Level 15, State 1, Line 46
Incorrect syntax near the keyword 'OVER'.
Msg 156, Level 15, State 1, Line 51
Incorrect syntax near the keyword 'OVER'.


I thought maybe it wouldn't work on 2005, but it also didn't work on 2008R2.

Can you see what might be off on the query?

Thanks,

Tom
 
Did you mean ROW_NUMBER() instead of Record number?

I changed it to also have a from clause that I believe was missing.

It also needs an order by if ROW_NUMBER is used.

Here is what I changed it to:

Code:
If OBJECT_ID('tempdb..#Stage') IS NOT NULL
	DROP TABLE #Stage


If OBJECT_ID('tempdb..#Ship') IS NOT NULL
	DROP TABLE #Ship


CREATE TABLE #Stage
(
	StageID int,
	ShipNumber varchar(10),
	PONumber varchar(10),
	iQty int
)

CREATE TABLE #Ship
(
	ShipID int,
	ShipmentNumber varchar(10),
	Units int
)

INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (20, 'MZ12345', 'A4735', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (22, 'MZ12345', 'B7236', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (23, 'MZ12345', 'L4432', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (35, 'MZ12345', 'J3321', 40)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (36, 'MZ12345', 'U8911', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (32, 'MZ12345', 'V3199', 10)

Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (250, 'MZ12345', 40)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (251, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (252, 'MZ12345', 25)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (253, 'MZ12345', 40)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (255, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (300, 'MZ12345', 10)

SELECT StageID, ShipID, PONumber, iQty, Units
FROM #Stage st
JOIN #Ship s
ON st.ShipNumber = s.ShipmentNumber
AND st.iQty = s.Units

;With StageNumbered AS 
  (SELECT StageID, ShipNumber, PONumber, iQty,
          ROW_NUMBER() OVER (PARTITION BY PONumber, iQty) AS RNum
     FROM #Stage), 

ShipNumbered AS 
   (SELECT ShipID, ShipmentNumber, Units, 
           ROW_NUMBER() OVER (PARTITION BY ShipmentNumber, Units) AS RNum FROM #Ship)

SELECT StageID, ShipID, PONumber, iQty, Units
FROM StageNumbered st
JOIN ShipNumbered s
ON st.ShipNumber = s.ShipmentNumber
AND st.iQty = s.Units 
AND st.RNum = s.RNum

What would I change the partition to and what would go with the Order by clause?

Thanks,

Tom
 
Got it and it seemed to work correctly.

Not sure why. I am not sure why I need to have the order by, but that is necessary with the ROW_NUMBER.

I am not sure why the ROW_NUMBER makes it work, but it does.

Here is the final query (not sure if it matters what is in the ORDER BY, but I used the ShipNumber and ShipmentNumber)

Code:
If OBJECT_ID('tempdb..#Stage') IS NOT NULL
	DROP TABLE #Stage


If OBJECT_ID('tempdb..#Ship') IS NOT NULL
	DROP TABLE #Ship


CREATE TABLE #Stage
(
	StageID int,
	ShipNumber varchar(10),
	PONumber varchar(10),
	iQty int
)

CREATE TABLE #Ship
(
	ShipID int,
	ShipmentNumber varchar(10),
	Units int
)

INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (20, 'MZ12345', 'A4735', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (22, 'MZ12345', 'B7236', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (23, 'MZ12345', 'L4432', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (35, 'MZ12345', 'J3321', 40)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (36, 'MZ12345', 'U8911', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (32, 'MZ12345', 'V3199', 10)

Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (250, 'MZ12345', 40)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (251, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (252, 'MZ12345', 25)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (253, 'MZ12345', 40)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (255, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (300, 'MZ12345', 10)

SELECT StageID, ShipID, PONumber, iQty, Units
FROM #Stage st
JOIN #Ship s
ON st.ShipNumber = s.ShipmentNumber
AND st.iQty = s.Units

;With StageNumbered AS 
  (SELECT StageID, ShipNumber, PONumber, iQty,
          ROW_NUMBER() OVER (PARTITION BY PONumber, iQty ORDER BY ShipNumber) AS RNum
     FROM #Stage), 

ShipNumbered AS 
   (SELECT ShipID, ShipmentNumber, Units, 
           ROW_NUMBER() OVER (PARTITION BY ShipmentNumber, Units ORDER BY ShipmentNumber) AS RNum FROM #Ship)

SELECT StageID, ShipID, PONumber, iQty, Units
FROM StageNumbered st
JOIN ShipNumbered s
ON st.ShipNumber = s.ShipmentNumber
AND st.iQty = s.Units 
AND st.RNum = s.RNum

Thanks,

Tom
 
OK.

I printed out the results of the two CTE tables that are being joined:

StageNumbered
Code:
StageID	ShipNumber	PONumber	iQty	RNum
20	MZ12345	        A4735	        10	1
22	MZ12345	        B7236	        25	1
35	MZ12345	        J3321	        40	1
23	MZ12345	        L4432	        10	1
36	MZ12345	        U8911	        25	1
32	MZ12345	        V3199	        10	1

ShipNumbered
Code:
ShipID	ShipmentNumber	Units	RNum
251	MZ12345	        10	1
255	MZ12345	        10	2
300	MZ12345	        10	3
252	MZ12345	        25	1
253	MZ12345	        40	1
250	MZ12345	        40	2

What I don't understand is that since the StageNumberd.RNum is always 1, I would assume that only 3 would be selected as there is no 2 or 3 in the StageNumbered table.

Also, how does the SELECT know to put different PONumbers where the quantities are the same.

Thanks,

Tom
 
Slight issue. Just noticed it. It almost works. The results are:

Code:
StageID	ShipID	PONumber	iQty	Units
20	251	A4735	        10	10
23	251	L4432	        10	10
32	251	V3199	        10	10
36	252	U8911	        25	25
22	252	B7236	        25	25
35	253	J3321	        40	40

As you can see the PONumbers are correct but the ship ID's are not.

I tried sticking the ShipID in the Partition but that just gave me 13 rows again. How would I fix that column?

Thanks,

Tom
 
First, sorry for giving you Record_number rather than Row_number. Looks like the remaining problem is also a mistake I made. In the first CTE, you need ShipNumber, not PONumber. Here's the code that works for me:

Code:
If OBJECT_ID('tempdb..#Stage') IS NOT NULL
	DROP TABLE #Stage


If OBJECT_ID('tempdb..#Ship') IS NOT NULL
	DROP TABLE #Ship


CREATE TABLE #Stage
(
	StageID int,
	ShipNumber varchar(10),
	PONumber varchar(10),
	iQty int
)

CREATE TABLE #Ship
(
	ShipID int,
	ShipmentNumber varchar(10),
	Units int
)

INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (20, 'MZ12345', 'A4735', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (22, 'MZ12345', 'B7236', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (23, 'MZ12345', 'L4432', 10)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (35, 'MZ12345', 'J3321', 40)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (36, 'MZ12345', 'U8911', 25)
INSERT #Stage (StageID, ShipNumber, PONumber, iQty) VALUES (32, 'MZ12345', 'V3199', 10)

Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (250, 'MZ12345', 40)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (251, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (252, 'MZ12345', 25)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (253, 'MZ12345', 40)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (255, 'MZ12345', 10)
Insert #Ship (ShipID, ShipmentNumber, Units) VALUES (300, 'MZ12345', 10)

--SELECT StageID, ShipID, PONumber, iQty, Units
--FROM #Stage st
--JOIN #Ship s
--ON st.ShipNumber = s.ShipmentNumber
--AND st.iQty = s.Units

;With StageNumbered AS 
  (SELECT StageID, ShipNumber, PONumber, iQty,
          ROW_NUMBER() OVER (PARTITION BY ShipNumber, iQty ORDER BY ShipNumber) AS RNum
     FROM #Stage), 

ShipNumbered AS 
   (SELECT ShipID, ShipmentNumber, Units, 
           ROW_NUMBER() OVER (PARTITION BY ShipmentNumber, Units ORDER BY ShipmentNumber) AS RNum FROM #Ship)

SELECT StageID, ShipID, PONumber, iQty, Units
FROM StageNumbered st
JOIN ShipNumbered s
ON st.ShipNumber = s.ShipmentNumber
AND st.iQty = s.Units 
AND st.RNum = s.RNum

Tamar
 
That worked and solved my problem.

This was just an example of what I was trying to do. I can use this to solve my issues where there is another field that contains a reference number. The problem is that the reference number is coming from an import file where the reference number is one of 3 different types of numbers that I have to keep separate. But this should work by creating 2 other CTE's such that I have three different StageNumbered CTE's or I could combine them in the first above CTE (not sure yet).

But the issue was the duplicates and that seems to be solved here.

Thanks,

Tom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top