Duplicate records

hugheskbh · Jan 5, 2004

How can I find duplicate records in a file using Unix? For instance, I have a file of employee information. How can I find the records with duplicate employee numbers?

Thanks

Ken

Salem · Jan 5, 2004

> How can I find duplicate records in a file using Unix?

Code:

sort -u

sorts a file and eliminates duplicate records.

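So if you make one copy of the file with sort and another with sort -u, you can use diff on the two to find the duplicate records: the lines that appear only in the plain sorted copy are the extra copies.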
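A minimal sketch of the sort/diff idea above, assuming a made-up file employees.txt with one record per line. The file name, the sample records, and the variable names are only for illustration. It relies on diff's default output format, where lines found only in the first file are prefixed with "< ".

```shell
# Hypothetical sample data: one employee record per line.
workdir=$(mktemp -d)
printf '%s\n' '1002 Jones' '1001 Smith' '1002 Jones' \
              '1003 Brown' '1001 Smith' '1002 Jones' > "$workdir/employees.txt"

# One fully sorted copy, one sorted copy with duplicates removed.
sort    "$workdir/employees.txt" > "$workdir/sorted.txt"
sort -u "$workdir/employees.txt" > "$workdir/unique.txt"

# diff exits with status 1 when the files differ, which is expected here.
diff "$workdir/sorted.txt" "$workdir/unique.txt" > "$workdir/diff.txt" || true

# Lines starting with "< " exist only in sorted.txt: the extra copies.
extra_copies=$(sed -n 's/^< //p' "$workdir/diff.txt")
printf '%s\n' "$extra_copies"

rm -rf "$workdir"
```

Note that a record appearing three times shows up twice in this output, because sort -u keeps one copy of it. Piping the result through sort -u would list each duplicated record once.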

--

derekludwig · Jan 6, 2004

If the records are sorted (for example, in place with sort -o file file), then I suggest uniq:

uniq -d file

will give you a single instance of every duplicated (and triplicated, quadruplicated, etc.) record. If you need to know how many times each duplicated record occurs:

uniq -dc file

Some versions of uniq (GNU uniq, for one) support the -D option, which prints every copy of each duplicated record rather than a single instance.
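Since the original question was about duplicate employee numbers rather than whole records, here is a rough sketch that applies uniq to just the key. It assumes the employee number is the first space-delimited field, which the thread does not actually say. The file name and sample data are made up as well. The counts use uniq -c with an awk filter instead of uniq -dc, only because I am not sure every uniq accepts -d and -c together.

```shell
# Hypothetical sample data: employee number, then name.
workdir=$(mktemp -d)
printf '%s\n' '1002 Jones' '1001 Smith' '1002 Baker' \
              '1003 Brown' '1001 Smith' > "$workdir/employees.txt"

# Whole-record duplicates. uniq only compares adjacent lines, so sort first.
dup_records=$(sort "$workdir/employees.txt" | uniq -d)

# Duplicate employee numbers: cut out the key (assumed to be field 1),
# then sort and let uniq -d report each repeated key once.
dup_numbers=$(cut -d' ' -f1 "$workdir/employees.txt" | sort | uniq -d)

# Same keys with occurrence counts, printed as "number: count".
dup_counts=$(cut -d' ' -f1 "$workdir/employees.txt" | sort | uniq -c |
             awk '$1 > 1 { print $2 ": " $1 }')

printf 'duplicate records:\n%s\n' "$dup_records"
printf 'duplicate employee numbers:\n%s\n' "$dup_numbers"
printf 'occurrences:\n%s\n' "$dup_counts"

rm -rf "$workdir"
```

In this sample, 1002 is reported as a duplicate employee number even though the two records differ (Jones and Baker), which is the case a whole-record comparison misses.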

----------
Derek
