Listing Duplicates 3

Status
Not open for further replies.

percent

IS-IT--Management
Apr 27, 2004
176
US
Here's what I'd like to do. I have a list that is about 50 lines in length (this number varies); an example is below:

243 bob
340 sean
292 sam
250 sue
390 tom
295 sam
242 bob

So far I am using |sort -d -k2
in order to sort the list according to the usernames.
I know you can remove duplicates with |sort -u
but how would I get the opposite output?

So after I do |sort -d -k2
it would be as below:

242 bob
243 bob
292 sam
295 sam
340 sean
250 sue
390 tom

But how would I make it so that it is:

242 bob
243 bob
292 sam
295 sam

(Any help would be greatly appreciated)




%, 2004
 
How about this. Say your original file is called [tt]file.dat[/tt]. Try this...
Code:
sort -d -k2 file.dat > file.sorted
uniq -u -s4 file.sorted > file.uniq
comm -3 file.sorted file.uniq
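This basically uses [tt]comm[/tt] to pull the unique ones out, leaving only the duplicates. Note that [tt]-s4[/tt] skips the first four characters of each line, so it assumes the numbers are always three digits. See the man pages for [tt]sort[/tt], [tt]uniq[/tt], and [tt]comm[/tt] for more info.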

And I'm sure there's a better, cleaner, faster way to do it! [smile]

Hope this helps.
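
As a possible shortcut, here is a minimal sketch that assumes GNU [tt]uniq[/tt], whose [tt]-D[/tt] ([tt]--all-repeated[/tt]) option prints every line of each duplicated group. The function name [tt]list_dupes[/tt] is made up for illustration, and [tt]-D[/tt] may not exist in other [tt]uniq[/tt] implementations.

```shell
# Print every line whose second field (the username) occurs more than once.
# Assumption: GNU uniq. -D is a GNU extension and may be missing elsewhere.
list_dupes() {
  # Sort on the username so equal names are adjacent, then have uniq
  # skip the first field (-f1) and print all repeated lines (-D).
  sort -k2 | uniq -f1 -D
}

# Usage with the sample data from the question:
printf '%s\n' '243 bob' '340 sean' '292 sam' '250 sue' '390 tom' '295 sam' '242 bob' | list_dupes
```

Because [tt]-f1[/tt] skips a whole field rather than a fixed number of characters, this version should not depend on how many digits the numbers have.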
 
Thank you very much. I'm new to Linux and I appreciate the help. I ended up doing the below
(thank you for pointing me toward uniq).

In case anyone else would like to know what I did, see below:

/test |sort -d -r -k2 |uniq -f1 -d >> tmpstore.001
/test |sort -d -k2 |uniq -f1 -d >> tmpstore.001
sort -d tmpstore.001
rm tmpstore.001

(Again thank you)
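
One caveat about the commands above: [tt]uniq -d[/tt] prints only one line per duplicated group, so combining a forward and a reverse pass recovers both lines only when a name appears exactly twice. The sketch below uses a hypothetical input with three sam lines (not data from this thread) and made-up function names to show the middle line being dropped.

```shell
# Hypothetical input: "sam" appears three times.
sample() {
  printf '%s\n' '292 sam' '295 sam' '299 sam' '340 sean'
}

# Forward and reverse passes as in the post above, without the temp file.
two_pass() {
  {
    sample | sort -d -k2 | uniq -f1 -d      # first line of each duplicated group
    sample | sort -d -r -k2 | uniq -f1 -d   # last line of each duplicated group
  } | sort -d
}

# Only the first and last sam lines come back; 295 sam is lost.
two_pass
```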


%, 2004
 
OK, I have a problem now. The code was working, but then I stopped receiving output. I tried to figure out at which command the data was being lost, and it is the |uniq -f1 -d portion of the code. What would cause the command to stop working? For testing purposes I made a script named test:

echo 243 bob
echo 340 sean
echo 292 sam
echo 250 sue
echo 390 tom
echo 295 sam
echo 242 bob

The code that worked at one time no longer works with this script. Why is this?

I tried the test below, but it no longer works:
Code:
./test |uniq -f1 -d
Any ideas?

%, 2004
 
It looks like it stopped working because uniq only compares adjacent lines, so its input has to be sorted first. Also, "test" is not a good name for a script because it is also the name of a shell builtin.
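
To illustrate both points, here is a small sketch using the sample data from this thread. The [tt]sample[/tt] function is a stand-in for the test script and is not something the original poster wrote.

```shell
# The thread's sample data, in its original (unsorted) order.
sample() {
  printf '%s\n' '243 bob' '340 sean' '292 sam' '250 sue' '390 tom' '295 sam' '242 bob'
}

# Unsorted: no two neighbouring lines share a username, so uniq prints nothing.
sample | uniq -f1 -d

# Sorted on the username first: duplicates become adjacent and uniq reports
# one line for each duplicated name.
sample | sort -k2 | uniq -f1 -d

# Ask the shell how it resolves the name "test".
type test
```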

A different approach would be to create a function to do the opposite of uniq...
[tt]
$ mv test test.sh
$ function dupe {
> awk $* '$k==y{print (x==y?"":z RS) $0}{x=y;y=$k;z=$0}'
> }
$ ./test.sh | sort -k2 | dupe -v k=2
242 bob
243 bob
292 sam
295 sam[/tt]
 
Ygor, the function worked perfectly. Thank you.

%, 2004
 
You may also consider something like this:
./test.sh | awk '
{++n[$2];b[$2]=b[$2]$0"\n"}
END{for(i in n)if(n[i]>1)printf "%s",b[i]|"sort -k 2 -k 1"}
'

Hope This Helps, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
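
If the data is in a regular file rather than coming from a pipe, a two-pass awk variant is another option that needs no sorting and keeps the lines in their original order. This is only a sketch: the function name [tt]dupes_in_order[/tt] and the temporary file are made up for illustration, and [tt]mktemp[/tt] is assumed to be available.

```shell
# Print lines whose second field occurs more than once, keeping input order.
# The file is read twice, so this needs a regular file, not a pipe.
dupes_in_order() {
  awk '
    NR == FNR { count[$2]++; next }  # first pass: count each username
    count[$2] > 1                    # second pass: print lines with a repeated name
  ' "$1" "$1"
}

# Usage with the sample data from the question, written to a temp file:
tmp=$(mktemp)
printf '%s\n' '243 bob' '340 sean' '292 sam' '250 sue' '390 tom' '295 sam' '242 bob' > "$tmp"
dupes_in_order "$tmp"
rm -f "$tmp"
```

The output can still be piped through sort if grouping by username is wanted.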
 
thank you all for your time -

%, 2004
 