Deleting all duplicates but the last found...

tpbjr · Oct 28, 2004

OK, I now have my files sorted the way I need them. Below is a sample of the data in that order. The first field (the part before the |) determines whether rows are duplicates. The data is sorted based on the names of the files it came from, and this order must be preserved.

Z100035|USF Holland Test Inc
Z100036|Meijer Test Corp
Z100035|USF Reddway Test Inc
Z100038|USP Corp
Z100039|Frontier Inc
Z100038|USP Employee Corp
Z100040|Test1
Z100040|Test2
Z100040|Foster Inc
Z100041|Kentucky Inc
Z100042|TEST3
Z100043|KPMG Inc

Here is the result that I am looking for:

Z100036|Meijer Test Corp
Z100035|USF Reddway Test Inc
Z100039|Frontier Inc
Z100038|USP Employee Corp
Z100040|Foster Inc
Z100041|Kentucky Inc
Z100042|TEST3
Z100043|KPMG Inc

Notice that only the last row of each set of duplicates is left. This row comes from the most recent file (the file the user created when they made an update or an addition).

Thank you for your help.
I am new to the Unix/Oracle world, so I do appreciate all your help. And again, if you need help with Visual Basic or MS Access, I can help.

Thank you for all your help

http://www.besware.com

Tom

scriptdan · Oct 28, 2004

# The awk command in the for loop's list extracts the unique keys (field 1).
for id in $(awk -F"|" '{print $1}' yourfile | sort -u)
do
    # Anchor the match to the start of the line so the key only matches field 1.
    grep "^$id|" yourfile | tail -n 1
done

Result (note that the rows come out in sorted key order, not the original file order):

Z100035|USF Reddway Test Inc
Z100036|Meijer Test Corp
Z100038|USP Employee Corp
Z100039|Frontier Inc
Z100040|Foster Inc
Z100041|Kentucky Inc
Z100042|TEST3
Z100043|KPMG Inc
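
The original post asks for the surviving rows to stay in their original file order, which a loop over sorted keys does not do. One possible alternative, offered as a minimal sketch and not as anything posted in this thread, is to read the file twice with awk: the first pass remembers the line number of the last row for each key, and the second pass prints only the rows at those line numbers. The function name keep_last and the temporary sample file are assumptions made for illustration.

```shell
#!/bin/sh
# keep_last FILE  (hypothetical helper name, not from the thread)
# Pass 1 (NR == FNR): record the line number of the last row seen for each key.
# Pass 2: print a row only if it sits at the recorded line number for its key,
# so the surviving rows keep their original relative order.
keep_last() {
    awk -F'|' 'NR == FNR { last[$1] = FNR; next }
               last[$1] == FNR' "$1" "$1"
}

# Sample rows copied from the original post, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Z100035|USF Holland Test Inc
Z100036|Meijer Test Corp
Z100035|USF Reddway Test Inc
Z100038|USP Corp
Z100039|Frontier Inc
Z100038|USP Employee Corp
Z100040|Test1
Z100040|Test2
Z100040|Foster Inc
Z100041|Kentucky Inc
Z100042|TEST3
Z100043|KPMG Inc
EOF

keep_last "$sample"
```

On the sample rows this should print the eight rows from the requested result, in the same order. The file is read twice, so it has to be a regular file and not a pipe, and the key-to-line-number table is held in memory, which may matter for very large inputs.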
