Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Need To Compare Multiple Fields to verify no Duplicate 2

Status
Not open for further replies.

kjv1611

New member
Jul 9, 2003
10,758
US
I previously was checking some data, based on 2 fields (since no one single field can be an unique ID in this data to itself) to check for duplicates.

However, I just recently realized that I actually need to check for at least 5 fields, b/c it is possible (not very common, but does occur) that there are some instances where this does occur.

So, I'm coming here, b/c I REALLY do not want to build a procedure with 5 stinking DLookup statements embedded within each other, and I'm not particularly sure of an alternative SQL statement I might could use.

For the sake of argument, if nothing else, can anyone take any shots at a way to do what I'm talking about? I'll just create some fake data, and put below, and then maybe some of the rest of you folks will have some ideas:

A little more info on the real world usage. I'm importing data into one table, running some various processes on the data, and then importing it into a "historical" table in order to verify that the data set I get the next instance does not duplicate this data (in other words, I've already had MUCH duplicates given in the past, and I need to filter them out each time). Makes sense, I'm sure! [wink]

[tt]
----------------------------------------------------
-----------------SAMPLE DATA------------------------
----------------------------------------------------
Tables
-------------
Table1 Name = [blue]tblData[/blue] - The processed data table.
Table2 Name = [blue]tblHist[/blue] - Historical table for comparing "the next time", and the next..
----------------------------------------------------
Fields (identical to both tables)
-------------
- I have a couple additional fields in the history table, but I can figure out my dealings with those on my own. One is an autonumber field.

Field Name - Format
---------------------
[blue]Date[/blue] - Date
[blue]File No[/blue] - Text
[blue]LastName[/blue] - Text
[blue]MI[/blue] - Text
[blue]Firstname[/blue] - Text
----------------------------------------------------
----------------------------------------------------
[/tt]

And if it's something simple, that's even better. I'd pull my hair out if I had any! [SMILE]

I am kind of hoping that I'm just forgetting something very simple, and that one of you will prove it.

Or else, I may be back to the drawing board with 3 embedded "If Dlookup" statements - SCAAAARY!

I don't care if whether it's using an Access Query, a SQL statement, or some other VBA usage. I'm kind of thinking that I'd be better off doing a SQL statment with that much comparing, but hopefully someone can direct me either way.

Thanks in advance for anything,

kjv

--

"If to err is human, then I must be some kind of human!" -Me
 
Sorry, I gave the table/field definitions, but no data. So, I'll post that next:
[tt]
Field Name - Value
---------------------

Rec#1:
Date - 1/1/2009
File No - CC-09-393949
LastName - Smith
MI - M
Firstname - Joe
---------------------
Rec#2:
Date - 1/1/2009
File No - CC-09-393949
LastName - Smith
MI - M
Firstname - Joe
---------------------
Rec#3:
Date - 1/1/2009
File No - CC-09-393949
LastName - Smith
MI - M
Firstname - Joe
---------------------
Rec#4:
Date - 1/1/2009
File No - CC-09-393949
LastName - Smith
MI - R
Firstname - Susie
---------------------
Rec#5:
Date - 1/1/2009
File No - CC-09-393949
LastName - Smith
MI - L
Firstname - Joe
---------------------
Rec#6:
Date - 1/1/2009
File No - CC-09-393949
LastName - Jetson
MI - L
Firstname - George
---------------------
Rec#7:
Date - 1/4/2009
File No - CD-09-303949
LastName - Jones
MI - N
Firstname - Matthew

[/tt]

Oh, and with the duplicate deal, it's not likely (although I can't say it's impossible) that there would be entire duplicates within the "new" data each time. The duplicates occur when they occasionally send me data which is exact duplicates of what I've already gotten.

Hmm, maybe this will help as well:
Specifically, this past year, every month, I'd get a big workbook containing all the data "to date." So if I already got data from them the last month, then this month's data also contains what they sent me the last month.

I hope this isn't sounding like a raving lunatic. [wink]

--

"If to err is human, then I must be some kind of human!" -Me
 
Oh, and this is something I'm doing for work if no one can guess by the field names. And I'm about to check out for the day. However, if I can find the time at all this evening, I'll be checking back in (between things outside my full-time job), b/c I really want to nip this one in the bud.

Thanks again for any possible help/advice.

--

"If to err is human, then I must be some kind of human!" -Me
 
Without digging too far into it this early in the morning, is it something where you could join the tables, then do a SELECT DISTINCT?



Just my 2¢

"What the captain doesn't realize is that we've secretly replaced his Dilithium Crystals with new Folger's Crystals."

--Greg
 
If I can somehow do a SELECT DISTINCT to where it's DISTINCT based on 5 fields, not just one, then I think so.

How would I go about that, if that's the case.

So, it's sort of 2 pieces of one problem:
[OL][LI]The "Process" table fields are not, or else should not already be in the "History" table before the query.[/LI]
[LI]To make sure the Process fields are not in the History table, I really need to compare against 5 fields.[/LI]
[/OL]

I ended up posting some SQL code I was trying on temp tables on SQL Server over in the SQL Programming forum here: thread183-1537897 in case that's easier to read.

I'm terribly sorry I put so much info down. Just had a hard time shortening the whole thing at the moment...

--

"If to err is human, then I must be some kind of human!" -Me
 
Well, assuming I'll be able to use the SQL within my VBA, I did get a solution over at: thread183-1537897

Feel free to read it if you dare. [wink] Or if you want to read it in short, read [blue]TheBugSlayer[/blue]'s response, or specifically just look at his SQL code.

--

"If to err is human, then I must be some kind of human!" -Me
 
Hi kjv1611,

There is an easier solution than using any SQL or VBA.

I do something similar but with four fields. Make all five fields in your table a Primary key. Because you cannot have a duplicate primary key it automatically checks for duplicates. Any of the fields on its own can be duplicated but not as a group.

Hennie.
 
henniec,

How do I make them all primary keys? Are you talking about creating a new field and putting them altogether, or are you talking about somehow grouping the fields?

--

"If to err is human, then I must be some kind of human!" -Me
 
In the table designer, select all the fields you wish to be unique (Select one, hold down ctrl and select the others) and click the Primary Key option.

hth

Ben

----------------------------------------------
Ben O'Hara
David W. Fenton said:
We could be confused in exactly the same way, but confusion might be like Nulls, and not comparable.
 
Ok, What I did once... many moons ago.
I concatenated 50+ fields into a single string field; then did a simple query to compare. In my case, I did not need to know which field was different; just that the record failed. Since the exceptions would be very few; users offerred to manually compare records that failed.


Steve Medvid
IT Consultant & Web Master

Chester County, PA Residents
Please Show Your Support...
 
I'll try your suggestion, henniec and oharab.

Will post back shortly.

Thanks.

--

"If to err is human, then I must be some kind of human!" -Me
 
While I'm doing this, I keep wondering - are you SURE that this doesn't place a Primary ID on EACH field, rather than the combination thereof?

I know I've done it the way smedvid mentioned, but for the particular data, the field could end up being quite large.

I've never done it the way of just selecting multiple fields and setting as primary key.

So, you're 100% sure that it works like this?
Field1+Field2+Field3+Field4+Field5 is now the "Primary ID"... Therefore if one of those individual fields is duplicated, that's okay, but the combination of all 5 fields cannot be duplicated.

Is that what you guys are saying? If so, do you know of any references that point along that line?

--

"If to err is human, then I must be some kind of human!" -Me
 
Actually, what am I thinking?

The other method, as posted over in the SQL forum will surely be the BEST method, regardless of whether easiest.. [smile] I must've totally forgotten over the weekend.

Anyway, I'll finish up with the other, and if I can't get that to work in VBA, then I'll revisit this multiple primary ID idea.

And I'm curious about it anyway, as I'd like to verify whether that's the case or not.

Will post back ASAP.

--

"If to err is human, then I must be some kind of human!" -Me
 
That's exactly what we're saying :)

It's a fairly standard SQL construct and is known as a composite key.

Ben

----------------------------------------------
Ben O'Hara
David W. Fenton said:
We could be confused in exactly the same way, but confusion might be like Nulls, and not comparable.
 
Why would the other way be better?
Creating a complex surrogate key is no better & some may argue worse than a composite key! Setting a composite key and using the actual database has to be an improvement over creating a piece of VBA to loop through all your records to insert the new key!

Ben

----------------------------------------------
Ben O'Hara
David W. Fenton said:
We could be confused in exactly the same way, but confusion might be like Nulls, and not comparable.
 
Yes as you metioned in your second last reply field1+field2+filed3+field4+field5 combined become one primary key. You must select them all and then press the primary key to show you the primary key symbol next to each one. Four out of the five can have duplicate but when all five have a duplicate it will not accept the result.

This method will tell you immediately after you pressed enter if you have a duplicate entry. No need to wait for a query or some code to run before you know. If you run a query the record may be accepted before the query is run.

Hennie
 
Thanks. Composite key - that's what I needed. I'll definitely look at using that as well.

Yes, using a VBA loop would indeed take longer. But I'm not using a VBA loop, but rather going to stick a SQL Query into VBA which will do the comparisons with a WHERE clause rather than keeping up with a Composite Key.

I'll do some looking into that though. It does seem to me (just in trying to understand it) that the whole composite key idea would end up being more system resource intensive down the road rather than one query each time the deal is run.

But thanks for any info, again. I didn't even know what a Composite Key was before this. [smile]

--

"If to err is human, then I must be some kind of human!" -Me
 
Hmm, the SQL code seemed to work 100% perfect on SQL Server, but not within VBA. Can anyone look at this snippet of code, and tell me why it will not work in Access? Or if there is something I've got incorrect that I'm just not seeing?

Code:
Public Sub UpdateHist()
On Error GoTo ErrHandle
    Dim strSQL As String

    strSQL = "[BLUE]INSERT INTO[/BLUE] tblHist_Data([Date] ,[CaseNum] " & Chr(13) & Chr(9) & _
                    ",[LName] ,[MI] ,[FName] ,[State] " & Chr(13) & Chr(9) & _
                    "[BLUE]SELECT[/BLUE]  i.[Date] ,i.[CaseNum] " & Chr(13) & Chr(9) & _
                    ",i.[LName] ,i.[MI] ,i.[FName] ,i.[State]) " & Chr(13) & Chr(9) & _
                    "[BLUE]FROM[/BLUE]   tblImport i " & Chr(13) & _
                    "[BLUE]WHERE[/BLUE] [HIGHLIGHT WHITE]Not Exists[/HIGHLIGHT]  " & Chr(13) & Chr(9) & _
                    "([BLUE]SELECT TOP 1 0 FROM[/BLUE] tblHist_Data  " & Chr(13) & Chr(9) & _
                    "[BLUE]WHERE[/BLUE]  (i.[Date] = tblHist_Data.[Date]) [GRAY]AND[/GRAY]  " & Chr(13) & Chr(9) & _
                    "(i.[CaseNum] = tblHist_Data.[CaseNum]) [GRAY]AND[/GRAY]  " & Chr(13) & Chr(9) & _
                    "(i.[LName] = tblHist_Data.[LName]) [GRAY]AND[/GRAY]  " & Chr(13) & Chr(9) & _
                    "(i.[MI] = tblHist_Data.[MI]) [GRAY]AND[/GRAY]  " & Chr(13) & Chr(9) & _
                    "(i.[FName] = tblHist_Data.[FName])  " & Chr(13) & Chr(9) & _
                    ")  " & Chr(13) & _
                    "GO" & Chr(13)

    DoCmd.SetWarnings False
    DoCmd.RunSQL strSQL, True
    DoCmd.SetWarnings True

ExitSub:
On Error Resume Next
    Exit Sub
    
ErrHandle:
    ErrTalk (Forms!frmImport!txtImportID)
    Resume ExitSub

End Sub


A Debug.Print on the strSQL variable gives me this, in case it helps any:
Code:
INSERT INTO tblHist_Data([Date] ,[CaseNum] 
    ,[LName] ,[MI] ,[FName] ,[State]) 
    SELECT  i.[Date] ,i.[CaseNum] 
    ,i.[LName] ,i.[MI] ,i.[FName] ,i.[State] 
    FROM   tblImport i 
WHERE Not Exists  
    (SELECT TOP 1 0 FROM tblHist_Data  
    WHERE  (i.[Date] = tblHist_Data.[Date]) AND  
    (i.[CaseNum] = tblHist_Data.[CaseNum]) AND  
    (i.[LName] = tblHist_Data.[LName]) AND  
    (i.[MI] = tblHist_Data.[MI]) AND  
    (i.[FName] = tblHist_Data.[FName])  
    )  
GO

And currently that SQL runs fine on SQL Server 2005, but in Access VBA, running as a SQL statement it does not run correctly.

The error information follows:
ErrNumber = 3075
ErrDescription = Syntax error (missing operator) in query expression
[gray]'Not Exists
(SELECT TOP 1 0 FROM tblHist_Data
WHERE (i.[Date] = tblHist_Data.[Date]) AND
(i.[Case Number] = tblHist_Data.[Case Number]) AND
(i.[Plaintiff Last Name] = tblHist_Data.[Plaintiff Last Name]) AND
(i.[MI] = tblHist_Data.[MI]) AND '.[/gray]


So, given all of that information, is it safe to say that the "Not Exists" construct does not work from within VBA instances of SQL, or am I just overlooking some really bad typo?

Thanks for any picks/advice on this.

--

"If to err is human, then I must be some kind of human!" -Me
 
Try fixing the typo:
Code:
SELECT TOP [b]10[/b] FROM tblHist

B.

----------------------------------------------
Ben O'Hara
David W. Fenton said:
We could be confused in exactly the same way, but confusion might be like Nulls, and not comparable.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top