Removing Leading and Trailing ' " ' Characters

Status
Not open for further replies.

tbohon

Programmer
Apr 20, 2000
293
US
I need to read through a text file and examine each line. If the leading and/or trailing character is a double-quote I need to truncate those characters. However, there are double-quotes within the line which are valid and can't be changed.

I'm not looking for a solution, only some pointers - I learn better that way.

I have the script so that the current line of the file is in $line, but I'm not sure how to find the (leading or trailing only) double-quotes. My plan, once I figure that out, is to calculate the length of the string and then use the substr function to pick all but those leading/trailing characters.

Thoughts? Hints?

Tnx in advance.

Tom

"My mind is like a steel whatchamacallit ...
 
$line =~ s/^"//;
$line =~ s/"$//g;

I'm sure someone can do better than that but it can give you something to look at. ^ is the start of a line, $ is the end, s/// is substitution.
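A minimal sketch of those two substitutions applied to a line that also has quotes in the middle. The sub name strip_outer_quotes and the sample string are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one leading and one trailing double quote.
sub strip_outer_quotes {
    my ($line) = @_;
    $line =~ s/^"//;   # ^ anchors the match at the start of the string
    $line =~ s/"$//;   # $ anchors the match at the end of the string
    return $line;
}

# Quotes inside the line are left alone.
print strip_outer_quotes('"say "hi" to Tom"'), "\n";   # prints: say "hi" to Tom
```

Because each pattern is anchored, it can only match at one end of the string, so the inner quotes are never candidates for removal.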

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Building on Travis' solution
Perl:
while (<>) { s/^"(.*)"$/$1/; print; }
(untested...)
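One thing worth noticing about the single-regex form: it only matches when both the leading and the trailing quote are present. A small sketch, assuming a hypothetical helper named strip_pair and made-up sample lines:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the single-regex form.
sub strip_pair {
    my ($line) = @_;
    $line =~ s/^"(.*)"$/$1/;   # matches only if BOTH outer quotes are present
    return $line;
}

print strip_pair('"both ends"'), "\n";      # prints: both ends
print strip_pair('"leading only'), "\n";    # prints: "leading only
```

If lines with a quote at only one end also need cleaning, the two separate substitutions handle that case and this form does not.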

Steve

[small]"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::perlDesignPatterns)[/small]
 
As far as the trailing character is concerned: in a text file it will most of the time be the <enter> character, which is \n in *nix files and \r\n in Windows files.
In both cases these characters are invisible.

So, in order to remove the last visible character (which is a " in your case), do this in your loop over the lines:

Code:
chomp $line;
$line =~ s/"$//g;   # as travs69 suggested
or
Code:
$line =~ s/"\r?\n$//g;
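
A rough sketch of the line-ending point, assuming the file may have \r\n endings and is read without any CRLF translation. In that situation chomp (with the default $/) removes only the \n, and the leftover \r would keep "$ from matching. The helper name clean_line is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the line ending (\n or \r\n), then the outer quotes.
sub clean_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;   # remove a Unix or Windows line ending, if any
    $line =~ s/^"//;        # leading quote
    $line =~ s/"$//;        # trailing quote
    return $line;
}

print clean_line(qq{"unix line"\n}), "\n";        # prints: unix line
print clean_line(qq{"windows line"\r\n}), "\n";   # prints: windows line
```

The caller adds "\n" back when printing, since the helper returns the line without its ending.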



``The wise man doesn't give the right answers,
he poses the right questions.''
TIMTOWTDI
 
That is what I needed - well, that plus a night's sleep to ponder the problem ... <g>

Thanks all - it's working great now.

Best,

Tom

"My mind is like a steel whatchamacallit ...
 
No, I don't think you need to use chomp to remove the trailing character that comes before the end-of-line character:

Code:
use strict;
use warnings;
while (my $line=<DATA>){
   $line =~ s/^"//;
   $line =~ s/"$//g;
   print $line;
}

__DATA__
"this is a test"
"this is a test"
"this is a test"
"this is a test"

output:

this is a test
this is a test
this is a test
this is a test

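The '$' anchor matches at the end of the string or just before a trailing end-of-line character, so the quote is found even though the newline is still there. If you use chomp you need to add the end-of-line character back onto the end of the string before printing.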
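
To see the anchor behaviour in isolation, here is a small sketch comparing $ with \z, which matches only at the very end of the string. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $with_newline = qq{"this is a test"\n};

# $ matches at the very end of the string or just before a final newline,
# so the quote is removed and the newline stays.
(my $dollar = $with_newline) =~ s/"$//;

# \z matches only at the very end of the string, so nothing is removed here.
(my $strict_end = $with_newline) =~ s/"\z//;

print $dollar eq qq{"this is a test\n} ? "dollar: quote removed\n" : "dollar: unchanged\n";
print $strict_end eq $with_newline     ? "\\z: unchanged\n"        : "\\z: quote removed\n";
```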

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Just wondering if

Code:
$line =~ s/^"|"$//g;

is the same as

Code:
$line =~ s/^"//;
$line =~ s/"$//g;
 
The second one is faster. Otherwise the two versions will leave the same result in $line (keeping in mind that variables like $& $' and $` (and others) will be affected differently). This is also referenced in one of the FAQs from KevinADC about removing white space.
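
For anyone who wants to measure the speed claim on their own machine, a rough sketch using the core Benchmark module. The sample line and the sub names are assumptions, and the numbers will vary with the Perl version and the data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $sample = '"this is a "quoted" test"';   # made-up sample line

# Single substitution using alternation.
sub one_regex {
    my $line = $sample;
    $line =~ s/^"|"$//g;
    return $line;
}

# Two anchored substitutions.
sub two_regexes {
    my $line = $sample;
    $line =~ s/^"//;
    $line =~ s/"$//;
    return $line;
}

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    alternation => \&one_regex,
    two_steps   => \&two_regexes,
});
```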
 
I guess it should be said that the "g" modifier is not necessary; it might confuse the OP into thinking it has to be there.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Good point about the 'g' modifier. It in fact brings to light an oversight on my part. Consider
# Starting with
$line = '"""<data>"""';

Using this                 Results in
1) $line =~ s/^"|"$//g;    <data>

2) $line =~ s/^"//g;       <data>"""
   $line =~ s/"$//g;       <data>

3) $line =~ s/^"//;        ""<data>"""
   $line =~ s/"$//;        ""<data>""

So technically, the two code references from the previous max1x posting will not always produce the same result.
 
good point

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
PinkeyNBrain, I think you should give your examples a try...
As long as you have ^whatever or whatever$ in a regex, then on a single-line string like this one, no matter which modifier you use (g, s, m), it will always match only the whatever part...

Code:
# Starting with
$line = '"""<data>"""';   # let's say <data> is equal to: foo " foo

Using this              Results in
1) $line =~ s/^"|"$//g; ""foo " foo""   not foo " foo      ""<data>""   not <data>

2) $line =~ s/^"//g;    ""foo " foo"""  not foo " foo"""   ""<data>"""  not <data>"""
   $line =~ s/"$//g;    ""foo " foo""   not foo " foo      ""<data>""   not <data>

3) $line =~ s/^"//;     ""foo " foo"""                     ""<data>"""
   $line =~ s/"$//;     ""foo " foo""                      ""<data>""
unless I misunderstood your example...my apologies if I did so.
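
The table above can be checked with a short script. This is only a sketch; the variable names are made up, and the starting string is the foo " foo example.

```perl
use strict;
use warnings;

my $start = '"""foo " foo"""';   # three quotes at each end, one in the middle

# 1) alternation with /g
(my $v1 = $start) =~ s/^"|"$//g;

# 2) two substitutions, each with /g
(my $v2 = $start) =~ s/^"//g;
$v2 =~ s/"$//g;

# 3) two substitutions without /g
(my $v3 = $start) =~ s/^"//;
$v3 =~ s/"$//;

print "$_\n" for $v1, $v2, $v3;   # each line prints: ""foo " foo""
```

All three variants remove exactly one quote from each end, because an anchored pattern for a single " can match at only one position. Removing every leading and trailing quote would need a quantifier, for example "+ in place of ".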


``The wise man doesn't give the right answers,
he poses the right questions.''
TIMTOWTDI
 
man I can't believe a little g I slipped in there has caused this much discussion.. [afro]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Using this Results in
1) $line =~ s/^"|"$//g; ""foo " foo"" not foo " foo ""<data>"" not <data>
... and in the wee hours of the morning I'm asking myself "Why am I spending so much time debugging this simple code?"

My mistake. For years I've used
Code:
$input =~ s/^\s+|\s+$//g;
to clean up / trim incoming data. Seeing the question as written, I did a mental cut-n-paste and churned out the wrong stuff. Having since read KevinADC's FAQ about splitting the s/// up into two separate substitutions, I've now changed my ways. Granted, when working with user input or small files, saving that handful of clock cycles will go unnoticed, but it is worth doing for the sake of good coding habits. Sooo, this puts me back to my Jul-2 reply, with KevinADC's modification, being about as good as I had to offer.
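
A minimal sketch of the two-step whitespace trim mentioned above. The helper name trim and the sample input are assumptions for the example, not code from the FAQ.

```perl
use strict;
use warnings;

# Hypothetical helper: trim whitespace with two anchored substitutions.
sub trim {
    my ($input) = @_;
    $input =~ s/^\s+//;   # leading whitespace
    $input =~ s/\s+$//;   # trailing whitespace, including a final newline
    return $input;
}

print '[', trim("   some input \t\n"), "]\n";   # prints: [some input]
```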
 