Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chris Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Need To Compare Multiple Fields to verify no Duplicate 2

Status
Not open for further replies.

kjv1611

New member
Jul 9, 2003
10,758
US
I previously was checking some data, based on 2 fields (since no one single field can be an unique ID in this data to itself) to check for duplicates.

However, I just recently realized that I actually need to check for at least 5 fields, b/c it is possible (not very common, but does occur) that there are some instances where this does occur.

So, I'm coming here, b/c I REALLY do not want to build a procedure with 5 stinking DLookup statements embedded within each other, and I'm not particularly sure of an alternative SQL statement I might could use.

For the sake of argument, if nothing else, can anyone take any shots at a way to do what I'm talking about? I'll just create some fake data, and put below, and then maybe some of the rest of you folks will have some ideas:

A little more info on the real world usage. I'm importing data into one table, running some various processes on the data, and then importing it into a "historical" table in order to verify that the data set I get the next instance does not duplicate this data (in other words, I've already had MUCH duplicates given in the past, and I need to filter them out each time). Makes sense, I'm sure! [wink]

[tt]
----------------------------------------------------
-----------------SAMPLE DATA------------------------
----------------------------------------------------
Tables
-------------
Table1 Name = [blue]tblData[/blue] - The processed data table.
Table2 Name = [blue]tblHist[/blue] - Historical table for comparing "the next time", and the next..
----------------------------------------------------
Fields (identical to both tables)
-------------
- I have a couple additional fields in the history table, but I can figure out my dealings with those on my own. One is an autonumber field.

Field Name - Format
---------------------
[blue]Date[/blue] - Date
[blue]File No[/blue] - Text
[blue]LastName[/blue] - Text
[blue]MI[/blue] - Text
[blue]Firstname[/blue] - Text
----------------------------------------------------
----------------------------------------------------
[/tt]

And if it's something simple, that's even better. I'd pull my hair out if I had any! [SMILE]

I am kind of hoping that I'm just forgetting something very simple, and that one of you will prove it.

Or else, I may be back to the drawing board with 3 embedded "If Dlookup" statements - SCAAAARY!

I don't care if whether it's using an Access Query, a SQL statement, or some other VBA usage. I'm kind of thinking that I'd be better off doing a SQL statment with that much comparing, but hopefully someone can direct me either way.

Thanks in advance for anything,

kjv

--

"If to err is human, then I must be some kind of human!" -Me
 
Should you go for the primary composite key you may find that it would not work as it may tell you there is duplicate records. Run the find Duplicate query and delete all the duplicates and try again. If no duplicates are found it would create the composite and no future dupicates will be entered.

I dont think it uses more resource memory as any VBA or SQL.

Hennie
 
oharab said:
Try fixing the typo
Thanks, but according to how it ran in SQL, it is not a typo. Frankly, I tried to ask and see why they did a "TOP 0 1" from TheBugSlayer in the SQL forum thread, but never got a response on that. A TOP 10 construct wouldn't work, I know. A TOP 1 I understand, but I have no idea why the 0 is there. And of course, if I try to Google it, Google gives me results for "TOP 10" just like you were thinking. [pullhair] though I don't have any hair. [wink]

henniec,
Yeah, regardless of which uses the most resources or not, I may end up just going with that one anyway. Actually, I've got just an empty table for now so far for this year. I didn't get anything for the year until recently, so I don't have to worry about existing duplicates anyway. I'm just trying to nip it in the bud in advance, since I know how their system works now. [wink] And given the specific data, I think it is probably most beneficial that they do provide 100% of the data each time (for some people), in case there is any missed along the way.

--

"If to err is human, then I must be some kind of human!" -Me
 
What about this ?
Code:
strSQL = "INSERT INTO tblHist_Data([Date],CaseNum,LName,MI,FName,State)" & vbCrLf _
 & "SELECT i.Date,i.CaseNum,i.LName,i.MI,i.FName,i.State" & vbCrLf _
 & "FROM tblImport i WHERE Not Exists (SELECT * FROM tblHist_Data h" & vbCrLf _
 & "WHERE i.Date=h.Date AND i.CaseNum=h.CaseNum AND i.LName=h.LName AND i.MI=h.MI AND i.FName=h.FName)"

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
PHV,

I do think it would work without the "TOP.." bit.

Do you have any thoughts on what the "TOP 1 0" construct does? It worked perfectly over on SQL Server 2005.

Thanks for any reference/advice/info.

I'll try it as you stated, and post back.

--

"If to err is human, then I must be some kind of human!" -Me
 
Thanks, PHV.

I tried that method, but it's giving me the same syntax error:

Run-time error 3075, Syntax error in query expression 'Not Exists(SEELCT TOP *FROM.....

--

"If to err is human, then I must be some kind of human!" -Me
 
OH MY............

Well, I found the problem, and am I embarrassed! [blush]

Apparently, you can't use the GO statement in SQL statements within VBA. I just ran a test of it with taking that out, and it ran flawlessly - well the one without the WHERE clause...

Well, it did RUN, but now it's not updating all the records, since I did the "Composite ID" as suggested with the table -I was able to run the SQL with and without the where clause.

So next I'll remove the Composite ID....

AHHHHHHHHHHHHHH! Yes!

It ran PERFECTLY without the Composite ID, and even though I had already put in about a third of the records, when I ran it without the composite ID AND included the WHERE SQL clause, it ran perfectly and smoothly - no errors, and the exact data I was wanting it to return! Yippee!!

Thanks all for your suggestions and attempts.

By the way, (I don't know why the composite ID doesn't work in this situation), the Composite ID usage did run much slower than just using the SQL statement with Where clause. I'm sure that running the SQL statement would be no different than running a query, so it's definite a performance difference (unless it just HAPPENED to be slower/faster at the particular moment - I doubt it, as it's just on my local PC, no network involved at the moment.

--

"If to err is human, then I must be some kind of human!" -Me
 
Now I would still like to know what this does:
Code:
SELECT TOP 1 O

Any thoughts here? I'll see if I can get a response in the SQL thread as well...

--

"If to err is human, then I must be some kind of human!" -Me
 
kjv,
1) my suggestion got rid of the GO statement.
2) SELECT TOP 1 0 FROM ... is fairly easy to understand:
SELECT 0 FROM will return a single column with value 0 for each row meeting the WHERE clause, and thus the TOP 1 predicate will only return a single row whenever the criteria is meet.

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
 
Thanks for that clarification, PHV. Oh, and I missed the missing GO part earlier - would've saved a few hairs, I suppose. [wink]

So, when it pulls a value of "0", the way it's pulling in this situation, it wouldn't put the data anywhere, since it's just in the WHERE clause. I like that.

--

"If to err is human, then I must be some kind of human!" -Me
 
I did end up needing to add a DISTINCT to the query for some reason. I didn't think it needed it, but it seems to make a difference after several trial runs in the last little while.

--

"If to err is human, then I must be some kind of human!" -Me
 
CRUD! I was looking at the wrong table! It's still overwriting duplicates for some reason! Good Grief! This is just RE-TAR-DED!

I'm going to pinch myself, 'cause surely it's just a bad dream! [hairpull]

--

"If to err is human, then I must be some kind of human!" -Me
 
Well, I tried the LEFT JOIN idea from the SQL forum thread, and that gave the same results. I'm getting duplicates even after all this work.

Anybody got any other thoughts?

The composite ID did not work, as it is not allowing duplicates, but it is also not allowing some values which are NOT duplicates based on all 5 fields. That leads me to believe that the composite ID is actually comparing the fields one at a time, rather than a combination.

Goodness, it seems so simple, and I thought it was working correctly before, but now I'm floating out in limbo!

Argh!

Perhaps I need to step back, post the original problem as a new thread, and start on this again, perhaps bring in some fresh ideas..

--

"If to err is human, then I must be some kind of human!" -Me
 
I found the problem in the query using the NOT EXISTS statement.

The query worked perfectly as is on SQL Server 2005. However, it did NOT work with jet SQL.

The reason is that I used an alias all the way through from the main query into the sub query. That is a NO NO. I just wasn't thinking.

So, I corrected that, and all is well! [smile]

--

"If to err is human, then I must be some kind of human!" -Me
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top