Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations strongm on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Checking for duplicate data 1

Status
Not open for further replies.

Ligia

Technical User
Jul 3, 2001
4
GB
I wonder if there's a formula that would work out duplicate data on a spreadsheet? What I have is a spreadsheet with roughly 41,000 rows. Each row is a record, containing Name, and Surname. I know that there are quite a few of these records (rows) which have the same surname and firstname. How can I work out a formula which will highlight all the surnames which are the same? i.e. 2 Jones or 3 Smiths? If there's something which would only make it highlight if the first name was the same i.e. It would highlight two John Smiths, but not Mark Smith (if only 1 Mark Smith).

Any help would be appreciated.

Many thanks.

Ligia
 
You could do it with VBA code but . . .

1) Combine Name and Surname into one column.
2) Sort ALL data by new column.
3) Write formula in yet another column that checks the prior row. If the same, return a string like "duplicate".
Example : =IF(C10=C11,"Duplicate","OK")
4) Turn on Data Filter, starting at the upper left corner of your data. Filter for the word Duplicate.
5) Delete all rows found.
6) Turn off filter and have a look.

You'll be done in 10 minutes.
 
I know this is going to make me sound incredibly stupid, but how do I merge both cells into one. I'm using Excel 97.

Thanks yet again.

Ligia
 
In cell C10, type this formula :
=A10&B10 or B10&A10 (by surname)
and then copy for all records.

You are only using this column to combine name and surname so that duplicates can be found. When finished, column C and the formula column D can be deleted.

Sorry about the word combine. I should have said 'concatenate', but I was too lazy to look up how to spell it.
 
I just had the same problem and worked around it using an array formula. I copied the column with all my data in it, so that i had 2 identical columns. Then in a 3rd column i typed in the following formula-

(assume im using columns a,b,c with 10 rows)

=COUNT(IF(A1=$B$1:$B$10,))

As its an array formula you need to enter it by pressing CTRL+SHIFT+ENTER, it will then add {} around the formula.

I then copied this all the way down the 3rd column. Basically what it does is to count how many times a value in column 1 appears in column 2. You can then use the autofilter on the Data menu to find what you want.
 
Phil,

I've noticed that no one has responded to your posting, and I'm expecting that the reason is because the originator of the original posting has indirectly answered your question (Did you find a solution?) - by issuing JVFriederick a "STAR".

These STARS are issued by clicking on the "Click here to mark this post as a helpful or expert post!" - located at the bottom-left-corner of the contributor's posting. It is the accepted method of showing appreciation and recognition for having submitted a helpful solution.

Because the solution "appears" to have already been provided, no one appears anxious to "jump in" with other suggestions, especially given that you have not clearly identified how your problem differs from the original problem.

My recommendation would be that you submit a TOTALLY NEW posting, and describe your own specific situation.

Hope this helps "get the ball rolling".

Regards, ...Dale Watson dwatson@bsi.gov.mb.ca
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top