How to keep only non-duplicate data 2

diera · Jul 25, 2011

Hi,

I found a lot of code to remove duplicate data. In my case, I would like to print only the non-duplicate data, that is, the lines that occur exactly once. Any suggestions on how to do it?

Example data:
aaaaaaaaaaa
bbbbbbbbbbb
bbbbbbbbbbb
ccccccccccc
ddddddddddd
ddddddddddd

The desired output:
aaaaaaaaaaa
ccccccccccc

Any help is much appreciated.

Thank you.

diera · Jul 25, 2011

Hi,

I just found the solution. Here is the code by tadmc, to which I've made some minor modifications.

Code:

#!/usr/bin/perl

use strict;
use warnings;

my $file = '/perlscript/tweet/notbroadcast.txt';

open MYFILE, ">noneduplicate.txt";
select MYFILE;

local @ARGV = ($file);
my %lines;
while (<>) {
    $lines{$_}++;
}
print sort grep $lines{$_} == 1, keys %lines;

diera · Jul 25, 2011

Hi all,

I have tried using this code to print only the non-duplicate data.

Code:

#!/usr/bin/perl

use strict;
use warnings;

my $file = '/perlscript/tweet/notbroadcast.txt';

open MYFILE, ">noneduplicate.txt";
select MYFILE;

local @ARGV = ($file);
my %lines;
while (<>) {
    $lines{$_}++;
}
print sort grep $lines{$_} == 1, keys %lines;

Result:
aaaaaaaaaaa
ccccccccccc
dddddddddddddddddddddd

Actually, that's wrong. The expected output is:
aaaaaaaaaaa
ccccccccccc

Where is the error? Thank you for your help.

feherke · Jul 25, 2011

Hi

I just applied a bit of the KISS principle and this works for me:

Code:

perl -ne '$l{$_}++;END{print sort grep $l{$_}==1,keys%l}' /perlscript/tweet/notbroadcast.txt > noneduplicate.txt

Feherke.

http://free.rootshell.be/~feherke/

feherke · Jul 25, 2011

Hi

Got it. The last line is missing the end-of-line mark, so it is counted as a different string from the identical line before it. To handle them all equally, either:

Code:

# remove all \n then put them back when printing
perl -ne 'chomp;$l{$_}++;END{print join("\n",sort grep$l{$_}==1,keys%l),"\n"}' /perlscript/tweet/notbroadcast.txt > noneduplicate.txt

# or just add them when missing
perl -ne '$_.="\n"if substr($_,-1)ne"\n";$l{$_}++;END{print sort grep$l{$_}==1,keys%l}' /perlscript/tweet/notbroadcast.txt > noneduplicate.txt

Feherke.
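
To make the effect of the missing end-of-line mark visible, here is a minimal sketch that runs the same counting logic with and without chomp. The helper name unique_lines and the in-memory filehandles (standing in for notbroadcast.txt) are assumptions for illustration, not part of the code posted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the lines read from a filehandle and return
# those seen exactly once. If $normalize is true, line endings are
# stripped with chomp before counting.
sub unique_lines {
    my ($fh, $normalize) = @_;
    my %count;
    while (my $line = <$fh>) {
        chomp $line if $normalize;
        $count{$line}++;
    }
    my @once = grep { $count{$_} == 1 } keys %count;
    return sort @once;
}

# Sample data from the first post; the last line has no trailing newline.
my $data = join "\n", qw(aaaaaaaaaaa bbbbbbbbbbb bbbbbbbbbbb
                         ccccccccccc ddddddddddd ddddddddddd);

# Without chomp, "ddddddddddd\n" and "ddddddddddd" are two different keys.
open my $raw_fh, '<', \$data or die "Cannot open in-memory file: $!";
my @raw = unique_lines($raw_fh, 0);
close $raw_fh;

# With chomp, both "d" lines map to the same key and are counted twice.
open my $fixed_fh, '<', \$data or die "Cannot open in-memory file: $!";
my @fixed = unique_lines($fixed_fh, 1);
close $fixed_fh;

print scalar(@raw),   " unique keys without chomp\n";   # 4
print scalar(@fixed), " unique keys with chomp\n";      # 2
print "$_\n" for @fixed;                                # aaaaaaaaaaa, ccccccccccc
```

Without chomp, each of the two "d" lines is seen once and both are printed back to back, which matches the doubled "d" line in the reported result.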


prex1 · Jul 26, 2011

A different approach:

Code:

while(<MYFILE>) {
  chomp;
  if(exists $lines{$_}){
    delete $lines{$_};
  }else{
    $lines{$_}=1;
  }
}
print join"\n",sort keys %lines;

Franco

http://www.xcalcs.com

: Online engineering calculations

http://www.megamag.it

: Magnetic brakes for fun rides

http://www.levitans.com

: Air bearing pads

diera · Jul 26, 2011

Thanks, all, for your kind help.

feherke · Jul 26, 2011

Hi

Franco, that is more efficient, but it will work only when each duplicated line occurs an even number of times.

( I mean it will work for "a\nb\nb\nc" and "a\nb\nb\nb\nb\nc" but not for "a\nb\nb\nb\nc" and "a\nb\nb\nb\nb\nb\nc". )

Feherke.
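
The difference can be sketched side by side. The function names toggle_unique and count_unique are hypothetical, and the lists stand in for already chomped input lines; this is only an illustration of the point above, not a replacement for either posted solution.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toggle approach: add a key on first sight, delete it on the next sight.
# A line seen an odd number of times therefore ends up kept.
sub toggle_unique {
    my %lines;
    for my $line (@_) {
        if (exists $lines{$line}) {
            delete $lines{$line};
        }
        else {
            $lines{$line} = 1;
        }
    }
    return sort keys %lines;
}

# Counting approach: keep only the keys seen exactly once.
sub count_unique {
    my %count;
    $count{$_}++ for @_;
    my @once = grep { $count{$_} == 1 } keys %count;
    return sort @once;
}

my @even = qw(a b b c);      # "b" occurs twice
my @odd  = qw(a b b b c);    # "b" occurs three times

print "toggle, even: @{[ toggle_unique(@even) ]}\n";   # a c
print "toggle, odd:  @{[ toggle_unique(@odd) ]}\n";    # a b c
print "count,  even: @{[ count_unique(@even) ]}\n";    # a c
print "count,  odd:  @{[ count_unique(@odd) ]}\n";     # a c
```

With three occurrences, the toggle version deletes "b" on the second sighting and adds it back on the third, so it is wrongly reported as unique.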


prex1 · Jul 26, 2011

Correct, Feherke, it doesn't work. :(

Franco

