Perl script - remove duplicates from text file.

Status
Not open for further replies.

dfezz1

Technical User
Jun 17, 2009
4
US
Hello,

I am trying to modify a small script I found online.
It's a great script that tests for and then removes duplicate lines. The issue is that I have to manually edit the script to point it at the file that needs its dupes removed. So what I want to do is add some keyboard / <STDIN> interaction.

I want to prompt for the file location and maybe even the output location and filename. I have added the print statement (which works), but the rest of the script dies once I enter the file location.

Here is the script:

==========================================
Code:
#!/usr/bin/perl

print "Type in the location of the file you want to remove duplicates from: \n";

$file = <STDIN>;

my %seen = ();
{   
        local @ARGV = ($file);   
        local $^I = '.bak';   
        while(<>){      
                $seen{$_}++;      
                next if $seen{$_} > 1;
                print;   
        }
}

print "finished processing file\n";
=========================================
 
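You need to remove the line terminator from the user's input. <STDIN> returns the line including its trailing newline, so the value in $file is not exactly the file name that was typed.

Change:
$file = <STDIN>;

To:
chomp($file = <STDIN>);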
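
As a rough sketch of the chomp fix in context, the prompt could be wrapped like this. The helper name read_filename, the messages, and the -f existence check are my own additions for illustration, not part of the original script.

Code:
```perl
use strict;
use warnings;

# Read one line from the given filehandle and strip the trailing newline.
# Returns undef (in scalar context) when there is no more input.
# (read_filename is a hypothetical helper name, not from the original script.)
sub read_filename {
    my ($fh) = @_;
    my $line = <$fh>;
    return unless defined $line;
    chomp($line);
    return $line;
}

# Only prompt when STDIN is an interactive terminal.
if (-t STDIN) {
    print "Type in the location of the file: ";
    my $file = read_filename(\*STDIN);
    die "No file name given\n" unless defined $file && length $file;
    die "Cannot find '$file'\n" unless -f $file;   # assumed sanity check
    print "Will process $file\n";
}
```

Checking that the file exists before the in-place edit starts should give a clearer error than letting the loop fail partway through.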
 
You can take the input and/or output file from the command line arguments with @ARGV.

Code:
my $input = shift(@ARGV) or die "Usage: rmdupes <infile> [outfile]";
my $output = shift(@ARGV) || $input;

Then if you run it as "rmdupes myfile.txt out.txt", those two names go into $input and $output. If you run just "rmdupes myfile.txt", both $input and $output will be "myfile.txt".
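
Putting the two @ARGV lines together with the dedupe logic, a minimal sketch might look like the following. It swaps the in-place $^I trick from the original script for plain open/print so that a separate output file is possible. The sub name remove_dupes is made up for this example, and the whole file is read into memory, which is assumed to be acceptable for the files involved.

Code:
```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy $input to $output, keeping only the first occurrence of each line.
# Every line is read into memory before anything is written, so $output
# may safely be the same file as $input.
# (remove_dupes is a hypothetical name used for this sketch.)
sub remove_dupes {
    my ($input, $output) = @_;

    open(my $in, '<', $input) or die "Cannot read $input: $!\n";
    my %seen;
    my @unique = grep { !$seen{$_}++ } <$in>;
    close($in);

    open(my $out, '>', $output) or die "Cannot write $output: $!\n";
    print {$out} @unique;
    close($out) or die "Cannot close $output: $!\n";

    return scalar @unique;   # number of lines kept
}

if (@ARGV) {
    my $input  = shift(@ARGV);
    my $output = shift(@ARGV) || $input;   # default: overwrite the input
    my $kept   = remove_dupes($input, $output);
    print "finished processing file ($kept unique lines kept)\n";
}
else {
    warn "Usage: rmdupes <infile> [outfile]\n";
}
```

Unlike the original, this version does not leave a .bak copy behind, so keep a backup if you overwrite the input file.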

 
Or, for a more robust approach, you could use one of the getopt modules to use command line switches and execute your script like this:

myscript.pl -i <input filename> -o <output filename>
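
For what it's worth, here is a small sketch of the switch-based approach using Getopt::Long, which ships with Perl. The sub name parse_args, the long option names, and the choice to default the output to the input file when -o is left off are assumptions on my part.

Code:
```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse -i / -o switches out of an argument list and return (input, output).
# (parse_args is a hypothetical helper; making -o optional is an assumption.)
sub parse_args {
    my @args  = @_;
    my $usage = "Usage: myscript.pl -i <input filename> [-o <output filename>]\n";
    my ($input, $output);

    GetOptionsFromArray(
        \@args,
        'i|input=s'  => \$input,
        'o|output=s' => \$output,
    ) or die $usage;

    die $usage unless defined $input;
    $output = $input unless defined $output;
    return ($input, $output);
}

if (@ARGV) {
    my ($input, $output) = parse_args(@ARGV);
    print "input: $input, output: $output\n";
}
```

The two returned names can then be handed to whatever routine does the actual duplicate removal.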
 
Thanks, everyone. I will try these ideas out and see which one works best for me.

Thanks again!
 