Removing duplicate words in a file

ianicr · Jan 14, 2004

I have a file whose records look like this:
"MR","I","SMITH","1","NEW STREET","LONDON","LONDON"

I wish to remove the first "LONDON" but leave the second. Is there an easy way to do this?

SteveR77 · Jan 14, 2004

Question -

Given that the data has this format -

"MR","I","SMITH","1","NEW STREET","LONDON","LONDON"

You would want to keep all the data in the case of a record like this one -

"MR","R","JONES","1","NEW STREET","GREENWICH","LONDON"

Correct?

ianicr · Jan 14, 2004

Yep. That's just what I need. I'm not too bothered what goes into the field, as long as it's not the same as the field after.

Ygor · Jan 14, 2004

Perhaps use sed....

sed 's/\(,"[^"]*"\)\1/\1/g' file1 > file2
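To see what a back-reference substitution of this kind does, here is a minimal sketch run against the two sample records from this thread. The function name dedup_adjacent is only an illustrative label, and the sketch assumes every field is double-quoted and contains no embedded quotes.

```shell
# \(,"[^"]*"\) captures a comma plus one quoted field; \1 demands that the
# same text repeats immediately, and the pair is replaced by a single copy.
dedup_adjacent() {
  sed 's/\(,"[^"]*"\)\1/\1/g'
}

printf '%s\n' \
  '"MR","I","SMITH","1","NEW STREET","LONDON","LONDON"' \
  '"MR","R","JONES","1","NEW STREET","GREENWICH","LONDON"' |
  dedup_adjacent
# "MR","I","SMITH","1","NEW STREET","LONDON"
# "MR","R","JONES","1","NEW STREET","GREENWICH","LONDON"
```

Note that the pattern is not tied to fields 6 and 7: any two identical neighbouring fields are collapsed, and the record ends up one field shorter.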

aigles · Jan 14, 2004

If you want to remove field 6 ("," separator) if it is the same as field 7 ...

with sed :

sed -e 's/^\(\([^,]*,\)\{5\}\)\([^,]*\),\3$/\1,\3/' input >output

with awk :

awk 'BEGIN {FS="," ; OFS=","} $6==$7 {$6=""} {print $0}' input >output

Jean Pierre.
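As a quick check of the field 6 / field 7 approach, the sketch below runs a positional sed and an awk equivalent over the two sample records from this thread. The helper names sample, by_sed and by_awk are hypothetical, and both forms assume that no field contains an embedded comma.

```shell
# The two sample records quoted earlier in the thread.
sample() {
  printf '%s\n' \
    '"MR","I","SMITH","1","NEW STREET","LONDON","LONDON"' \
    '"MR","R","JONES","1","NEW STREET","GREENWICH","LONDON"'
}

# sed: \1 = first five fields (with commas), \3 = field 6, which must
# equal field 7 at end of line; field 6 is then emptied.
by_sed() {
  sed -e 's/^\(\([^,]*,\)\{5\}\)\([^,]*\),\3$/\1,\3/'
}

# awk: empty field 6 when it equals field 7, and print every record.
by_awk() {
  awk 'BEGIN { FS = OFS = "," } $6 == $7 { $6 = "" } { print }'
}

sample | by_sed
sample | by_awk
# Each command prints:
# "MR","I","SMITH","1","NEW STREET",,"LONDON"
# "MR","R","JONES","1","NEW STREET","GREENWICH","LONDON"
```

One difference worth knowing: the sed pattern is anchored at the end of the line, so it only touches records with exactly seven fields, while the awk test ignores anything after field 7.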

SteveR77 · Jan 14, 2004

If you need to preserve the field place holder but not the value that it contains you could do the following -

sed 's/\(,"[^"]*"\)\1/,""\1/g' file1 > file2

PHV · Jan 14, 2004

Try this:

Code:

awk -F, -v OFS=, '
$6==$7{$6="\"\""}
{print}
' </path/to/inputfile

Hope this helps
PH.
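One detail that is easy to miss with awk: assigning to a field makes awk rebuild the whole record using OFS, which defaults to a single space. The sketch below, using a sample record from this thread (the variable name record is just for illustration), shows the effect of leaving OFS unset versus setting it to a comma.

```shell
record='"MR","I","SMITH","1","NEW STREET","LONDON","LONDON"'

# OFS left at its default: the rebuilt record is joined with spaces.
printf '%s\n' "$record" | awk -F, '$6==$7{$6="\"\""} {print}'
# "MR" "I" "SMITH" "1" "NEW STREET" "" "LONDON"

# OFS set to a comma: the rebuilt record keeps its comma-separated shape.
printf '%s\n' "$record" | awk -F, -v OFS=, '$6==$7{$6="\"\""} {print}'
# "MR","I","SMITH","1","NEW STREET","","LONDON"
```

Records where fields 6 and 7 differ are never rebuilt, so they pass through unchanged in either case.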
