Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Westi on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Select Distinct still returning duplicates... 1

Status
Not open for further replies.

theharmonyguy

Programmer
Jul 13, 2004
7
US
OK, this one has me stumped.

Here's the data I'm dealing with. A Release table contains various releases with job numbers. This table has a one-to-many with a Drawings table; each drawing is associated with one release. Consequently there's a Bill of Materials table, and every part in the BOM table is also associated with one drawing (one-to-many again).

I'm trying to pull all of the parts (BOM table) that are associated with a particular job (i.e. parts associated with drawings that are associated with releases that have a given job number).

To do so I have a stored procedure that takes in various inputs, mainly the job number. Here is the gist of the original stored procedure SQL:

SELECT
SUBSTRING(Dwg.fldChild,1,CASE WHEN SUBSTRING(Dwg.fldChild,1,1) = 'S' THEN 10 ELSE 7 END) fldDrawing,
Dwg.fldChildExtension,
BOM.fldPart,
BOM.fldQuantity,
BOM.fldUnitOfMeasure,
BOM.fldDescriptionOne,
BOM.fldDescriptionTwo,
BOM.fldTask,
BOM.fldReplacementCost,
BOM.fldReplacementUOM
FROM Releases, BillsOfMaterial AS BOM, Drawings AS Dwg
WHERE
BOM.fldDrawingPK = Dwg.fldDrawingPK
AND
Release.fldReleasePK = Dwg.fldReleasePK
AND
Release.fldJobNum = @prmJobNum

Problem is, there are lots of parts (BOM table) that are identical in every field except for their primary key. That is, there are two releases with the same job number, each associated with two different drawings, each of which are associated with two different parts - but the parts differ only in the primary key, nothing else.

I figured if I tacked on a DISTINCT, I would get only get one record for such duplicated parts, since the primary key is not actually part of the FROM clause. But alas, it doesn't work.

Then I realized the first field in the FROM clause is generated based on data from the Drawings table. Now in actuality this ends up being the same for such parts (the referenced data depends on the job, not the individual drawing - I know, I didn't write the DB structure :) ). So I tried doing a SELECT DISTINCT * FROM the above SELECT statement.

No deal. Every time I end up with a recordset that contains duplicate rows. None of the fields in these rows are different in any way; I've done an export to CSV from the recordset to check things over.

I've tried replacing the joins with subqueries (WHERE fldDrawingPL IN / NOT IN), but I keep ending up with timeouts. I just can't seem to get my head around what SQL is needed here, and I don't understand why my attempts at DISTINCT have failed. I even tried creating a second stored procedure which loaded the results of this stored procedure into a temporary table then did a SELECT DISTINCT * on that temp table, and I still ended up with duplicates.

I'm clueless. Any ideas?
 
> I figured if I tacked on a DISTINCT, I would get only get one record for such duplicated parts, since the primary key is not actually part of the FROM clause. But alas, it doesn't work.

So... above query with DISTINCT returns some dupes. Are you absolutely sure? All column values are indentical?

------
[small]select stuff(stuff(replicate('<P> <B> ', 14), 109, 0, '<.'), 112, 0, '/')[/small]
[banghead]
 
Can you provide us some examples of your data?

-SQLBill

Posting advice: FAQ481-4875
 
Well, now that you ask... I found the problem. :) I had been looking at the CSV export in Excel, so the "duplicate" rows appeared to be identical, when in fact one would have a NULL for one particular field, while the supposed dupe had a blank space for that field. (This data needs some cleanup.)

Anyway, thanks for the help!
 
btw, for any other users that encounter this problem, here was my band-aid solution (ideally the data should be cleaned up, but that's beyond my control at the moment). Instead of SELECTing BOM.fldReplacementUOM (which was the field that had the blanks and nulls), I selected:

(CASE WHEN BOM.fldReplacementUOM = ' ' THEN NULL ELSE BOM.fldReplacementUOM END) fldReplacementUOM

Works like a charm!
 
Look into using ISNULL.

ISNULL('field', 'replacement_value')

-SQLBill

Posting advice: FAQ481-4875
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top