Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

How to avoid duplicate rows with a joined table 1

Status
Not open for further replies.

fsteeman

Programmer
Jul 17, 2002
103
DK
Hello all,

This is one of these things that I am uncertain can be solved with ordinary SQL, but any suggestions are appreciated!


I have two tables with a one-to-many relationship. One record in the main table can be linked with none, one or more records in the secondary table.

For most main records there are no corresponding secondary records, for a few there is one, and for one in particular, there are two. This latter record is what it is all about.

In my query I have a LEFT JOIN which (naturally) duplicates the record in the main table in order to include both records from the secondary table.

However, I only want a single record from the main table and a value from the first record in the secondary table. Duplicate main table rows give me problems in my overlying application when enforcing constraints.

How do I accomplish this if it is at all possible? I have depleted the possibilities with different kinds of JOINs. I wonder whether it can be solved with a GROUP BY statement.







Fedor Steeman
Geological Museum Copenhagen
Denmark
 
It depends what you mean with First record, which record in secondary table is first?, If you don't bother about what values you want from secondary table you could use MAX() function for all secondary table fields and GROUP BY that query. But the information you get with such query is total mess, because you will never know from what record the value of the field is. If you post some example data and desired result I think we could figured out what query you need :).
(I ask for that because I don't know how you define FIRST record)

Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
MVP VFP
 
Hi Borislav,

Well, I cannot really show any sample data, because in reality there are many more tables involved in a complex relationship. It is only the relationship between two tables in particular that yields problems.

With FIRST record I just mean any record from the secondary table. So it does not really matter, just the first one in sequence depending on the sorting.

I understand what you are saying with the 'mess', but for the purposes of the involved query it does not really matter at all from what secondary record the field value comes. Just that it is indicated that there is at least one corresponding record in the secondary table.





Fedor Steeman
Geological Museum Copenhagen
Denmark
 
What I meant with mess is that you could get values from different records if you use MAX() function.
Here an example how you could get some record of the secondary table:
Code:
DECLARE @Table1 TABLE (Id int, SomeField varchar(200))
DECLARE @Table2 TABLE (Id int, FK int, SomeField varchar(200))
INSERT INTO @Table1 VALUES (1, 'aaaaaaaaaaa')
INSERT INTO @Table1 VALUES (2, 'bbbbbbbbbbb')

INSERT INTO @Table2 VALUES (1, 1, 'cccccccccccc')
INSERT INTO @Table2 VALUES (2, 1, 'dddddddddddd')
INSERT INTO @Table2 VALUES (3, 1, 'eeeeeeeeeeee')

SELECT *
       FROM @Table1 Tbl1
LEFT JOIN @Table2 Tbl2 ON Tbl1.Id = Tbl2.Fk
--- With the SELECT above you will get the result with
--- duplicate rows
SELECT *
       FROM @Table1 Tbl1
LEFT JOIN (SELECT Tbl3.* FROM @Table2 Tbl3
                    INNER JOIN (SELECT MAX(Id) AS Id, FK
                                       FROM @Table2
                                       GROUP BY Fk) Tbl
                    ON Tbl3.Fk = Tbl.Fk AND
                       Tbl3.Id = Tbl.Id) Tbl2  
ON Tbl1.Id = Tbl2.Fk
Run this code in QA and you will see the difference.








Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
MVP VFP
 
Thanks, that is some complicated coding!

Unfortunately, I cannot figure out how to apply it in my case, so I will try to give some more details. To simplify matters I will ignore the other fields in the main table and all the other related tables.

The main table, called Specimens, has a number of fields including a primary key column with the name ID. The secondary table has the following fields:

ImageID
SpecimenID
Image
ImageCreator
ImageCreated
ImageFilename

It is really a view combining two tables (Images and SpecimenImages), but never mind that for now. The SpecimenID refers to the ID column in the Specimens table and this is where the LEFT OUTER JOIN occurs.

I find it very hard to apply your example, because the total query is really complex and there is no possibility for declaring variables. I will keep on trying, but here is in any case the full query if you are interested...

Code:
SELECT     TOP (100) PERCENT petrander.Specimens.ID, 
                      '' + petrander.Institutions.InstitutionCode + CASE WHEN Collections.IncludeCollCode = 1 THEN ' ' + Collections.CollectionCode ELSE '' END + ' ' + LTRIM(RTRIM(ISNULL(STR(petrander.Specimens.AccessionYear)
                       + '.', ''))) + LTRIM(RTRIM(STR(petrander.Specimens.SerialNumber))) + ' ' + ISNULL(petrander.Specimens.SubNumber, '') + '' AS SpecimenNumber, 
                      CASE TaxonIdentifications.IdentificationQualifier WHEN 3 THEN Genera.GenusName + ' sp.' WHEN 0 THEN 'undetermined ' + HigherTaxa.HigherTaxonName
                       ELSE CASE WHEN Genera_1.Genusname IS NULL 
                      THEN Genera.GenusName + ' ' + Species.SpeciesName ELSE Genera.GenusName + ' ' + Genera_1.GenusName + ' ' + Species.SpeciesName END END
                       AS ScientificName, petrander.Specimens.Institution, petrander.Specimens.Collection, petrander.TaxonIdentifications.HigherTaxon, 
                      CASE WHEN Localities.LocalityParent > 0 THEN ParentLocalities.LocalityNameDK + ': ' + Localities.LocalityNameDK ELSE Localities.LocalityNameDK END
                       AS Locality, petrander.Stratigraphy.StratigraphyTextEN, petrander.ViewTaxonomy.Taxon, petrander.ViewTaxonomy.TaxonID, 
                      petrander.ViewTaxonomy.ParentID1, petrander.ViewTaxonomy.ParentID2, petrander.ViewTaxonomy.ParentID3, petrander.ViewTaxonomy.ParentID4, 
                      petrander.ViewTaxonomy.ParentID5, petrander.ViewTaxonomy.ParentID6, petrander.ViewTaxonomy.ParentID7, petrander.ViewTaxonomy.ParentID8, 
                      petrander.GeochronologyTerms.geoc_ID AS age_id, petrander.GeochronologyTerms.GeochronologyTermEN AS Age, 
                      petrander.GeochronologyTerms.Eon, petrander.GeochronologyTerms.Era, petrander.GeochronologyTerms.Period, 
                      petrander.GeochronologyTerms.Epoch, petrander.GeochronologyTerms.Subepoch, petrander.TypeStatus.TypeStatusName AS Type, 
                      petrander.Specimens.SerialNumber, petrander.Stratigraphy.StratigraphyTextEN AS Stratigraphy, petrander.Specimens.SubNumber, 
                      petrander.Collections.IncludeCollCode, petrander.Specimens.AccessionYear, petrander.Specimens.Sagsnummer, petrander.Specimens.TypeStatus, 
                      petrander.ImageView.Image
FROM         petrander.Institutions RIGHT OUTER JOIN
                      petrander.Specimens INNER JOIN
                      petrander.Collections ON petrander.Specimens.Collection = petrander.Collections.collc_ID ON 
                      petrander.Institutions.instc_ID = petrander.Collections.CollectionInstitution AND 
                      petrander.Institutions.instc_ID = petrander.Specimens.Institution LEFT OUTER JOIN
                      petrander.TypeStatus ON petrander.Specimens.TypeStatus = petrander.TypeStatus.type_ID LEFT OUTER JOIN
                      petrander.Stratigraphy LEFT OUTER JOIN
                      petrander.GeochronologyTerms ON petrander.Stratigraphy.GeochronologyTerm = petrander.GeochronologyTerms.geoc_ID LEFT OUTER JOIN
                      petrander.Localities AS ParentLocalities RIGHT OUTER JOIN
                      petrander.Localities ON ParentLocalities.loc_ID = petrander.Localities.LocalityParent ON 
                      petrander.Stratigraphy.Locality = petrander.Localities.loc_ID ON petrander.Specimens.Stratigraphy = petrander.Stratigraphy.str_ID LEFT OUTER JOIN
                      petrander.Species RIGHT OUTER JOIN
                      petrander.Genera AS Genera_1 RIGHT OUTER JOIN
                      petrander.HigherTaxa RIGHT OUTER JOIN
                      petrander.TaxonIdentifications ON petrander.HigherTaxa.hitax_ID = petrander.TaxonIdentifications.HigherTaxon ON 
                      Genera_1.genus_ID = petrander.TaxonIdentifications.Subgenus LEFT OUTER JOIN
                      petrander.ViewTaxonomy RIGHT OUTER JOIN
                      petrander.Genera ON petrander.ViewTaxonomy.TaxonID = petrander.Genera.ParentTaxon ON 
                      petrander.TaxonIdentifications.Genus = petrander.Genera.genus_ID ON petrander.Species.spec_ID = petrander.TaxonIdentifications.Species ON 
                      petrander.Specimens.TaxonIdentified = petrander.TaxonIdentifications.taxid_ID LEFT OUTER JOIN
                      petrander.ImageView ON petrander.Specimens.ID = petrander.ImageView.SpecimenID
WHERE     (petrander.Genera.GenusName LIKE 'Scaphites')
ORDER BY petrander.Specimens.Institution, petrander.Specimens.AccessionYear, petrander.Specimens.Sagsnummer, petrander.Specimens.SerialNumber, 
                      petrander.Specimens.SubNumber


Fedor Steeman
Geological Museum Copenhagen
Denmark
 
I am sorry I really cannot figure out how to write this join. I can't see where e.g. Tbl1 is referring to and where Tbl3.FK comes from...

This is the best that I could make of it

Code:
LEFT JOIN (SELECT Tbl3.* FROM ImageView Tbl3
                    INNER JOIN (SELECT MAX(SpecimenID) AS spId, FK FROM ImageView GROUP BY Fk) Tbl
                    ON Tbl3.Fk = Tbl.Fk AND
                       Tbl3.spId = Tbl.spId) Tbl2  
ON Tbl1.Id = Tbl2.Fk

Fedor Steeman
Geological Museum Copenhagen
Denmark
 
From what I see (correct me if I am not right), You want only ONE field from that file (I suppose the fiel is ImageView), If it is so change the last join to:
Code:
SELECT ...............
       ImageView.[Image] --- No DB name in-front of table
FROM .........
.....
LEFT JOIN (SELECT SpecimenID, MAX([Image]) AS [Image])
                  FROM petrander.ImageView
                  GROUP BY SpecimenID)ImageView
ON petrander.Specimens.ID = ImageView.SpecimenID

Borislav Borissov
VFP9 SP1, SQL Server 2000/2005.
MVP VFP
 
You are correct and your suggested LEFT JOIN (after some minor modifications) really works !

Code:
LEFT OUTER JOIN (SELECT SpecimenID, MAX(ImageID) AS imgID FROM petrander.ImageView AS ImageView_1 GROUP BY SpecimenID) AS ImageView ON petrander.Specimens.ID = ImageView.SpecimenID

Gosh I am so glad I got past this! Once again you saved my day, Borislav! I really still have to become a little more skilled with this SQL...

Fedor Steeman
Geological Museum Copenhagen
Denmark
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top