How to check for duplicate entries in a file and remove the duplicate?

JTan · Oct 20, 2005

As above.

TIA!

PHV · Oct 20, 2005

See man sort, in particular the -k (sort key) and -u (unique) options.

Hope This Helps, PH.
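
A minimal sketch of the two options PHV mentions, assuming a small made-up sample file (the data and variable names are illustrative, not from the thread). It shows that -u on its own compares whole lines, while adding -k1,1 makes sort compare only the first field.

```shell
# Hypothetical sample data written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'pear 3' 'apple 1' 'pear 3' 'apple 2' > "$tmp"

# -u alone: drop lines that are identical in full.
whole_line=$(sort -u "$tmp")

# -k1,1 with -u: compare only the first field, so lines sharing
# that field collapse to a single line.
first_field=$(sort -k1,1 -u "$tmp")

printf 'whole-line:\n%s\n' "$whole_line"
printf 'first field only:\n%s\n' "$first_field"
rm -f "$tmp"
```

With a key, which of the lines sharing that key survives is left to the sort implementation, so check the result if the other fields matter.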

TrojanWarBlade · Oct 20, 2005

sort will, of course, reorder your file.
If that is not a problem, all well and good. If it is, you have two possible solutions:
1) Write a script that removes dupes, using an associative array of some kind to check whether the record about to be output has been seen before.
2) Use "seq" to create an index and attach it to each record in the file ("paste" does this; "join" needs a key field that both inputs already share). Then use "sort" to resequence AND remove dupes, comparing only the record and not the index. Next, sort again on the index we created (to get back to the original sequence). Finally, "cut" the index off again. Hey presto, your file is de-duped but not resequenced.

Trojan.
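
The second approach can be sketched roughly as below. This is one possible reading, not Trojan's exact commands: it swaps seq and join for "cat -n" to number the lines, and it uses "uniq -f1" rather than "sort -u" so that the earliest copy of each record is the one kept. The function name dedupe_keep_order and the sample data are hypothetical.

```shell
# Hypothetical helper: de-duplicate a file while keeping the original order.
dedupe_keep_order() {
    tab=$(printf '\t')
    # 1) cat -n prefixes each line with its line number and a tab.
    # 2) sort groups identical records, lowest line number first in each group.
    # 3) uniq -f1 skips the number field and keeps the first line of each group.
    # 4) sort -n puts the survivors back in their original sequence.
    # 5) cut -f2- strips the line number again.
    cat -n "$1" |
        sort -t "$tab" -k2 -k1,1n |
        uniq -f1 |
        sort -n |
        cut -f2-
}

# Illustrative data: 'b' and 'a' each appear twice.
tmp=$(mktemp)
printf '%s\n' b a b c a > "$tmp"
deduped=$(dedupe_keep_order "$tmp")
printf '%s\n' "$deduped"
rm -f "$tmp"
```

Each record keeps the position of its first occurrence, so the sample comes out as b, a, c.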

Ogzilal · Oct 21, 2005

Hi,

See this thread:

http://www.tek-tips.com/viewthread.cfm?qid=1126100&page=3

Code:

awk '!a[$1]++' /path/to/your_file
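
Note that the one-liner above keys on $1, so two lines count as duplicates when their first field matches. If the whole line should match instead, $0 can be used as the key. A small sketch of the difference, assuming made-up sample data:

```shell
# Hypothetical sample data written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'pear 3' 'apple 1' 'pear 3' 'apple 2' > "$tmp"

# Key on the first field: 'apple 2' is dropped because 'apple' was seen already.
by_field=$(awk '!a[$1]++' "$tmp")

# Key on the whole line: only the exact repeat of 'pear 3' is dropped.
by_line=$(awk '!a[$0]++' "$tmp")

printf 'by first field:\n%s\n' "$by_field"
printf 'by whole line:\n%s\n' "$by_line"
rm -f "$tmp"
```

Either way the first occurrence is kept and the original order is preserved, which is the associative-array approach Trojan describes.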
