Query that will exclude duplicate records

theniteowl · Apr 24, 2012

I have an unusual problem.
One of our applications produces csv files with daily job information. Every month a script concatenates all of those daily csv files into one large one and ftps it to a network share.
I have an Access database that runs queries against the csv file as a linked table to generate monthly metrics.

The problem is the application went nuts for a week or so and created between 140 and 162 duplicate entries for every job that ran through the server that week so the numbers are WAY off.
Is there any easy way for me to eliminate the duplicate records in the existing queries or will I have to setup something to process and write out new data files without the duplicates first?
As they are linked files I cannot save changes to them and they are way too large for me modify in Excel as they greatly exceed the max number of lines Excel can handle.

Any thoughts?
My alternative is to write a VBS script that will parse the file line by line looking for the duplicate randmom number ID each entry has and each duplicate has identically and write each new non-duplicate out as a text file under a new name.

Thanks.

At my age I still learn something new every day, but I forget two others.

PHV · Apr 24, 2012

Perhaps a SELECT DISTINCT query ?

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886

theniteowl · Apr 25, 2012

DISTINCT would not work for me. Apparently whatever happened to the app the first 10 or so fields in each record matched but the rest varied greatly.
I ended up modifying each days csv file manually in Excel to eliminate duplicates based on the random number field the database assigned as those always matched duplicated records.

I had hoped to be able to modify the existing Access query to just disregard the duplicates whenever it ran so that if this happens in the future I do not have to worry about the duplicates as I may not even know if they had ocurred.

This is a very messed up system going to csv files that then get concatendated at the end of the month to one big csv and linked to Access. I plan to setup a database to read the data into and then have a web app for generating reports at will. Right now it is too much hands on and manual manipulation of data.

At my age I still learn something new every day, but I forget two others.

lameid · Apr 25, 2012

theniteowl,

You have not supplied enough information to help.

What is an example of your situation of data you want to keep or lose?

On the surface it sounds like you can import the data adding an autonumber field and using a combination of In and Select Min(ID) in a subquery you should be able to find the version you want to keep.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Query that will exclude duplicate records

theniteowl

Programmer

PHV

MIS

theniteowl

Programmer

lameid

Programmer

Similar threads

Part and Inventory Search

Sponsor