
SED remove extraneous carriage returns

Status
Not open for further replies.

mdtimo

Programmer
Oct 18, 2001
US
I am trying to remove every carriage return from a file unless it is preceded by [].

My attempt below failed while creating Myfile2, with the message "input line too long". It works on a smaller file.

tr -d '\r\n' < D:\elite\custom\ebill\Micron\Myfile.txt > D:\elite\custom\ebill\Micron\Myfile2.txt

sed -e 's/\[]/\[]~/g' <D:\elite\custom\ebill\Micron\Myfile2.txt > D:\elite\custom\ebill\Micron\Myfile3.txt

tr -s ~ \13\ < D:\elite\custom\ebill\Micron\Myfile3.txt > D:\elite\custom\ebill\Micron\Myfile4.txt
 
What utilities are you using here, Cygwin? I tried with Cygwin tr and it seems to work fine on a 1.3MB text file.



Annihilannic.
 
Try this perhaps:

[tt]perl -ne 's/\n//sg; s/([[][]])/\n$1/sg; print;'[/tt]

I find perl more useful for handling multi-line expressions.

Annihilannic.
 
Where in the perl script do I define the input and output files?

When I try to run that script I get the message "aborted due to compilation errors".

I am not even sure I am running the script correctly. I'd prefer to use sed because I know how to get it to execute.
 
You didn't answer my question about the type and version of the Unix utilities you are using here.

perl behaves very similarly to sed when you use it in -n mode.

Apologies, I accidentally posted the Unix version of my test script; however, it shouldn't have resulted in a compilation error regardless. Here it is corrected for Windows and showing that it *should* work for you:

[tt]C:\TEMP>type t.txt
moja
[]mbili
tatu
nne
[]tanu
sita

C:\TEMP>d:\programs\cygwin\bin\perl -ne 's/\r\n//sg; s/([[][]])/\r\n$1/sg; print;' t.txt > t2.txt

C:\TEMP>type t2.txt
moja
[]mbilitatunne
[]tanusita
C:\TEMP>[/tt]

Annihilannic.
 
I am using MKS toolkit version 8.6.

I misstated what I wanted the script to do.

I have a file that looks like this.

abcde
fghij[]
klmnopqrst
uvw[]
xyz[]

I want it to look like this:

abcdefghij[]
klmnopqrstuvw[]
xyz[]

When I attempt to run your perl script I get the following:

Bareword found where operator expected at -e line 1, n
syntax error at -e line 1, near "s//r/n"
Bareword found where operator expected at -e line 1, n
(Missing operator before n?)
Execution of -e aborted due to compilation errors.

 
It sounds like the backslashes in my solution are somehow being changed into forward slashes. Are you sure you copied it correctly, or is MKS doing some kind of automated translation? If it is, maybe you can disable that 'feature' somehow. Unfortunately I don't have access to a copy of MKS to test.

Just change s/([[][]])/\r\n$1/sg to s/([[][]])/$1\r\n/sg to correct it for the restated problem.

Annihilannic.
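Since the original poster said they would prefer sed, here is a minimal sketch of the restated requirement (join lines until one ends in []) using only tr and sed. It assumes a Unix-style shell (for example the one shipped with MKS or Cygwin) and has not been tested on MKS itself; the function name join_until_marker_sed is made up for illustration.

```shell
# Hypothetical helper: reads stdin, joins lines until one ends in "[]".
join_until_marker_sed() {
  # Drop any CR characters first so only LF line endings remain.
  tr -d '\r' |
    # :a        label for the loop
    # /\[\]$/!{ only act on lines that do NOT end in []
    # $!{       ...and that are not the last line of input
    # N         append the next input line to the pattern space
    # s/\n//    remove the newline that N inserted
    # ba        loop back and test the joined line again
    sed -e ':a' -e '/\[\]$/!{' -e '$!{' -e 'N' -e 's/\n//' -e 'ba' -e '}' -e '}'
}

# Sample data taken from the thread.
printf 'abcde\nfghij[]\nklmnopqrst\nuvw[]\nxyz[]\n' | join_until_marker_sed
```

Because sed works on one record at a time here, it never has to hold the whole file on a single line, which is what the tr -d '\r\n' step in the first attempt forces.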
 
I found a construct that doesn't have the same problem with backslashes and doesn't error out. Below is the script I have.

$infile = "/elite/custom/ebill/micron/MyFilesmallx.txt" ;
$outfile = "/elite/custom/ebill/micron/MyOutPutFile.txt" ;

open(INFILE, "< $infile") ;
open(OUTFILE, "> $outfile") ;

while(<INFILE>) {
my($temp) = $_ ;
$temp =~ s/\n//sg; s/([[][]])/$1\r\n/sg;
print OUTFILE "$temp";
}

close OUTFILE ;



But it isn't doing what I want.

It changes

abcde
fghij[]
klmnopqrst
uvw[]
xyz[]

into
abcdefghij[]klmnopqrstuvw[]xyz[]

when I want

abcdefghij[]
klmnopqrstuvw[]
xyz[]

 
The following works in Unix:
Code:
gawk 'BEGIN{RS=""}{gsub(/\n/,"");gsub(/\[]/,"[]\n");print}' MyFilesmallx.txt

Hope This Helps, PH.
FAQ219-2884
FAQ181-2886
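For comparison, the same join can be sketched one line at a time in plain awk, without setting RS. This is only an illustration under the assumption of a Unix-style shell and a POSIX awk; the function name join_until_marker is hypothetical.

```shell
# Hypothetical helper: joins lines until one ends in "[]".
# Reads the files given as arguments, or stdin if there are none.
join_until_marker() {
  awk '{
    sub(/\r$/, "")                    # strip a trailing CR, if any
    printf "%s", $0                   # emit the line without its newline
    if ($0 ~ /\[\]$/) printf "\n"     # end the record after a [] marker
  }' "$@"
}

# Sample data taken from the thread.
printf 'abcde\nfghij[]\nklmnopqrst\nuvw[]\nxyz[]\n' | join_until_marker
```

Usage with a file would be join_until_marker MyFilesmallx.txt > MyOutPutFile.txt.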
 
PHV's solution works fine with gawk for Windows if you store the script in a file and use gawk -f phv.awk Myfilesmallx.txt, for example.

However try these adjustments to your perl script if you prefer:

Code:
$infile = "MyFilesmallx.txt";
$outfile = "MyOutPutFile.txt";

open(INFILE, "< $infile");
open(OUTFILE, "> $outfile");

while(<INFILE>) {
    my($temp) = $_;
    # need to replace \r\n here, not just \n
    $temp =~ s/\r\n//sg;
    # also need to assign the second replacement to $temp
    $temp =~ s/([[][]])/$1\r\n/sg;
    print OUTFILE "$temp";
}

Annihilannic.
 
Thanks. It worked without the \r. It took a 3500 line text file and removed the extraneous hard returns in seconds.

Thanks.
 
Good news. Perhaps the MKS toolkit version of perl automatically assumes \n is CR/LF.

Annihilannic.
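One way to test that guess is to count the carriage returns that actually end up in a file. The sketch below assumes a Unix-style shell with tr and wc available; count_cr is a made-up helper name, not part of any toolkit.

```shell
# Hypothetical helper: print the number of CR characters in a file.
count_cr() {
  # tr -cd '\r' deletes every character except CR; wc -c counts what is left.
  tr -cd '\r' < "$1" | wc -c
}
```

Running count_cr MyOutPutFile.txt and getting a non-zero count even though the script only wrote \n would suggest that the newline is being translated to CR/LF on output.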
 