
Removing unwanted blank character from scalar

Status
Not open for further replies.

ljsmith91

Programmer
May 28, 2003
305
US
I have a scalar with an unwanted blank character sitting at the end of it. It is pulled from an input file: as I read the file I grab specific data and place it in a scalar to process, but each time the value comes with this unwanted character. TrojanWarBlade provided this nice regex to identify the unwanted character:

$item =~ s/(\W)/"<".ord(qq:$1:).">"/ge;
print "$item\n";

$item prints out as:

data<160>
data<160>

So the unwanted character is a 160, I take it hex 160 and possibly a &nbsp;?

Anyway, I have tried dozens of ways to eliminate this without success including:

$item =~ s/[\r\n]?$//;
$item =~ s/[\x*\f*\e*\t*]$//g;
$item =~ s/\x160$//g;

Is there a way to use the code TrojanWarBlade provided to capture the unwanted character and remove it, whether it's 160 or something else?

Thanks for any help.





 
Silly me...this will do it:

$item =~ s/\W$//g;

It works anyway.


 
Are you sure it's just not a simple space on the end?

$item =~ s/\s?$//g;



------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
KevinADC,

No, it was definitely not a space. It was TrojanWarBlade's code that identified it as a <160>, and when I looked that up it indicated a &nbsp;, likely from an HTML doc where the input may have originated. Anyway, the substitution I used got rid of the unwanted character. Thanks for lending a hand and an ear. -ljs
 
ord returns the numerical value of the first character in a string:

Code:
$foo = '&nbsp';
$bar = '&';
print ord $foo;
print "\n", ord $bar;

The above prints 38 two times.

If you print the reverse, the character for code 38:

Code:
print chr 38;

you get the & symbol.

If you print chr 160 you get what looks like a blank space. If the characters were a literal '&nbsp;', that would be visible in the HTML file and easy enough to verify by looking at the HTML source code.
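
As a quick check of the ord/chr behaviour described above, here is a minimal sketch. The variable names are made up for illustration. It avoids printing character 160 directly and only compares it with an ordinary space.

```perl
use strict;
use warnings;

# ord() looks only at the first character of its argument, here '&'.
my $amp_code = ord('&nbsp;');

# chr() goes the other way: code 38 is the ampersand.
my $amp_char = chr(38);

# Character 160 and an ordinary space (32) are different characters,
# even though both look blank when displayed.
my $nbsp  = chr(160);
my $space = chr(32);

printf "ord('&nbsp;') = %d\n", $amp_code;
printf "chr(38)       = %s\n", $amp_char;
printf "chr(160) eq chr(32)? %s\n", ($nbsp eq $space ? 'yes' : 'no');
```

If the last line reports "no", a pattern that only looks for a literal space will not find character 160.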

So I really have no idea what 160 is referring to in the output of TrojanWarBlade's code unless it is just a space. Maybe someone else will have a suggestion.



------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Your regexp indicates that it is a space on the end. Try my regexp and see if it does the same thing as yours; if not, then I just do not know.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
KevinADC,

I did give it a try, and the unwanted character remained. The input file is FTP'd in from a number of different sources, from customers who create it from another, unknown source. It looks like a space when viewed as a text file. Is it sound to use the substitution I am using to lose any unwanted or unprintable character at the end of a line? I can't argue with the results, but I was only forced here after all logical means failed me. Thanks for the replies.

-ljs
 
It is not a safe method, because there might be cases where it removes a character you want to keep. It removes any trailing character that is not in this set:

A-Z a-z 0-9 _

So it could remove many things you may want to keep like:

! ? ( ) + = etc etc etc
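
A small sketch of that risk, using made-up sample values: the same s/\W$// substitution that drops the trailing 160 also drops a trailing "?" or ")".

```perl
use strict;
use warnings;

# Hypothetical sample values: one ends in character 160,
# the other two end in punctuation that should be kept.
my @items = ("data" . chr(160), "Ready?", "total=(a+b)");

my @stripped;
for my $item (@items) {
    my $copy = $item;
    $copy =~ s/\W$//;        # removes one trailing non-word character, whatever it is
    push @stripped, $copy;
}

print "$_\n" for @stripped;
```

The first value comes out as wanted, but the other two lose their final character as well.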

I can't imagine it's not just a carriage return on the ends of the lines. Are you on Unix or Linux? Have you tried the dos2unix command?





------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Just a hint:

Code:
$item =~ s/\x160$//g;

could not work, because "\x" takes a hexadecimal value, not a decimal one. Without braces it also reads only two hex digits, so \x160 means character \x16 followed by a literal 0.

160 decimal is A0 hexadecimal, so

Code:
$item =~ s/\xA0+$//;

should discard the characters in question (and no others).
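
To see the difference between the two patterns side by side, here is a minimal sketch. The sample value (with two trailing 160s) and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical value: "data" followed by two characters with code 160.
my $item = "data" . chr(160) . chr(160);

# \x160 is parsed as \x16 followed by a literal "0", so nothing matches.
my $wrong = $item;
$wrong =~ s/\x160$//;

# \xA0 is character 160; the + removes a whole run of them at the end.
my $right = $item;
$right =~ s/\xA0+$//;

printf "after \\x160: length %d\n", length($wrong);
printf "after \\xA0+: length %d, value '%s'\n", length($right), $right;
```

The first substitution leaves the string at length 6, while the second trims it to the four characters of "data". The braced form \x{A0} is an equivalent way to write the same character and avoids any ambiguity about how many digits are read.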

 