
Duplicate records

Status
Not open for further replies.

hugheskbh

Programmer
Dec 18, 2002
US
How can I find duplicate records in a file using Unix? For instance, I have a file with employee information. How can I find the records with duplicate employee numbers in that file?

Thanks

Ken
 
> How can I find duplicate records in a file using Unix.
Code:
sort -u
sorts a file and eliminates duplicate records.

So if you use diff to compare the output of plain sort with the output of sort -u, the lines that differ are the extra copies of the duplicated records.
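
A minimal sketch of that comparison, assuming a small hypothetical file of space-separated records (the file names and sample data are invented for illustration, not taken from the question):

```shell
# Hypothetical sample data: one record per line.
tmpdir=$(mktemp -d)
printf '%s\n' '1001 Ann' '1002 Bob' '1001 Ann' '1003 Cy' '1002 Bob' '1001 Ann' \
    > "$tmpdir/employees.txt"

# Plain sort keeps every copy; sort -u keeps one copy of each record.
sort    "$tmpdir/employees.txt" > "$tmpdir/sorted.txt"
sort -u "$tmpdir/employees.txt" > "$tmpdir/unique.txt"

# diff exits with status 1 when the files differ, which is expected here.
diff "$tmpdir/sorted.txt" "$tmpdir/unique.txt" > "$tmpdir/diff.txt" || true

# Lines starting with "< " exist only in sorted.txt: the extra copies.
extra_copies=$(sed -n 's/^< //p' "$tmpdir/diff.txt")
printf '%s\n' "$extra_copies"

rm -rf "$tmpdir"
```

Note that this lists only the surplus copies, so a record that appears three times shows up twice.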

--
 
If the records are sorted (for example, in place with sort -o file file),
then I suggest uniq, which only compares adjacent lines:

uniq -d file

will give you a single instance of every duplicated (and triplicated, quadruplicated, etc.) record. If you need to know how many times each of those records occurs:

uniq -dc file
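
As a rough illustration of both commands, here is a sketch on hypothetical sample data (the records are invented). The input is piped through sort first, since uniq only notices duplicates that sit on adjacent lines:

```shell
# Hypothetical sample data: one record per line, deliberately unsorted.
tmpfile=$(mktemp)
printf '%s\n' '1001 Ann' '1002 Bob' '1001 Ann' '1003 Cy' '1002 Bob' '1001 Ann' \
    > "$tmpfile"

# One instance of each record that occurs more than once.
dups=$(sort "$tmpfile" | uniq -d)
printf '%s\n' "$dups"

# The same records, each prefixed with its number of occurrences.
# The width of the count column differs between uniq implementations.
counts=$(sort "$tmpfile" | uniq -dc)
printf '%s\n' "$counts"

rm -f "$tmpfile"
```

The count printed by -c is the total number of occurrences, so a record present three times is reported with 3.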

Some versions of uniq (GNU uniq, for example) also support the -D option, which prints every copy of each duplicated record instead of a single instance.
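
The original question is about duplicate employee numbers rather than fully identical lines. The thread does not say how the file is laid out, so the following is only a sketch under the assumption that records are comma-separated with the employee number in the first field; the sample data is hypothetical. It uses cut to isolate the key before sort and uniq -d, then a two-pass awk to print the full records.

```shell
# Hypothetical sample data: employee number, name, department.
tmpfile=$(mktemp)
printf '%s\n' '1001,Ann,Sales' '1002,Bob,IT' '1001,Anne,Sales' '1003,Cy,HR' \
    > "$tmpfile"

# Employee numbers that occur on more than one record.
dup_ids=$(cut -d, -f1 "$tmpfile" | sort | uniq -d)
printf '%s\n' "$dup_ids"

# Full records carrying a duplicated number. awk reads the file twice:
# the first pass counts each key, the second prints records whose key
# was seen more than once.
dup_records=$(awk -F, 'NR == FNR { seen[$1]++; next } seen[$1] > 1' \
    "$tmpfile" "$tmpfile")
printf '%s\n' "$dup_records"

rm -f "$tmpfile"
```

For a different delimiter or key position, the -d and -f arguments to cut and the -F and $1 in awk would need to change to match.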

----------
Derek
 