Convert consecutive line breaks into a single space

Status
Not open for further replies.

Kirsle

Programmer
Jan 21, 2006
1,179
US
Hey.

I'm still doing some work on my Tk::HyperText module, but there's a little thing I'm having trouble with on it: converting multiple line breaks into a single space.

For example, in this HTML code...
Code:
<b>The quick brown fox
jumps over the lazy dog.</b>

The page should display "The quick brown fox jumps over the lazy dog." in a real web browser, because the line break gets converted into a space.

However, when I try to run a regexp like this:

Code:
$sector =~ s/\x0a+/ /ig;

It converts every LF into a space, so with this HTML code:

Code:
<html>
<head>
<title>HyperText Demonstration</title>
</head>
<body bgcolor="#EEEEFF" link="#0000FF" vlink="#FF00FF" alink="#FF0000" text="#000000">

<h1>Tk::HyperText Demonstration</h1>

The page displays with 5 or 6 spaces on the top line, followed by the <h1> heading. At most, there should be 1 space on the top line.

Here's my relevant bit of code:

Code:
		# If this was preformatted text, preserve the line endings and spacing.
		if ($style{pre} == 1) {
			# Leave it alone.
		}
		else {
			$sector =~ s/\x0d//smg;
			$sector =~ s/\x0a+//smg;
			$sector =~ s/\s+/ /sg;
		}

Help would be appreciated. :)

-------------
Cuvou.com | My personal homepage
Project Fearless | My web blog
 
Greetings Kirsle,

Is there any reason why you can't just let the \s meta character capture all your explicit returns?

Code:
my $str = qq{<b>The quick brown fox\njumps over the lazy dog.</b>};

$str =~ s/\s+/ /g;

print "$str";

# Outputs: <b>The quick brown fox jumps over the lazy dog.</b>
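
Applied to the preformatted-text check from the original post, the same substitution might look like the sketch below. The helper name collapse_whitespace and its $is_pre argument are made up for illustration and are not part of Tk::HyperText.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse runs of whitespace unless the text is preformatted.
sub collapse_whitespace {
    my ($sector, $is_pre) = @_;
    return $sector if $is_pre;    # <pre> text keeps its line endings and spacing
    $sector =~ s/\s+/ /g;         # \s matches CR, LF, tabs and spaces alike
    return $sector;
}

my $html = "<b>The quick brown fox\r\n\r\njumps over the lazy dog.</b>";
print collapse_whitespace($html, 0), "\n";
# Prints: <b>The quick brown fox jumps over the lazy dog.</b>
```

Because \s already covers \x0d and \x0a, the separate substitutions for those characters should no longer be needed.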

- Miller
 
Why ask why? Just use the following code to get rid of the extra line breaks:

chomp($sector);

If you really need to ask why: without seeing the rest of your code, a good guess is that a visitor to your website has to press Enter or Return for the program to accept the input. You therefore expect just one line break, but get two because of that extra keypress.

Have a nice day,
Lilly
 
Actually, you didn't answer my question, latigerlilly. ;-) MillerH gave a better answer than you did.

MillerH: your code worked a little better than mine did. I still ran into a bit of a problem with the spaces at the beginning of the page (which I've concluded is because of the line breaks around <html>, <head>, <title>, etc.), but I just added a bit of code to set $sector="" if it contains only whitespace.
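
Roughly, that whitespace-only check looks like the sketch below. The @sectors list is an assumed stand-in for the text chunks found between the header tags, not the module's real data structure.

```perl
use strict;
use warnings;

# Assumed input: the text chunks found between tags at the top of the page.
my @sectors = ("\n", "\n", "HyperText Demonstration", "\n", "\n\n",
               "Tk::HyperText Demonstration");

my @rendered;
for my $sector (@sectors) {
    $sector =~ s/\s+/ /g;                 # collapse whitespace as before
    $sector = "" if $sector =~ /^\s*$/;   # blank out sectors that are only whitespace
    push @rendered, $sector if length $sector;
}

print join("|", @rendered), "\n";
# Prints: HyperText Demonstration|Tk::HyperText Demonstration
```

Without the second substitution, each of the four newline-only sectors would be rendered as its own space.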

As for newlines in the middle of a paragraph, your code converts them correctly, so that each newline in the paragraph becomes a space. One minor bug remains: the first line of a new paragraph has one space in front of it. It's not too big of a deal, but I'll keep working at it.
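
One possible way to deal with that leading space is sketched below. The helper normalize_sector and the $at_paragraph_start flag are hypothetical; the renderer would have to set the flag itself whenever it has just opened a new block.

```perl
use strict;
use warnings;

# Hypothetical helper: $at_paragraph_start is assumed to be true when the
# renderer has just started a new block (<p>, <h1>, and so on).
sub normalize_sector {
    my ($sector, $at_paragraph_start) = @_;
    $sector =~ s/\s+/ /g;                       # collapse all whitespace runs
    $sector =~ s/^ // if $at_paragraph_start;   # no leading space on a fresh line
    return $sector;
}

print "[", normalize_sector("\nThe quick brown fox\njumps", 1), "]\n";
# Prints: [The quick brown fox jumps]
print "[", normalize_sector("\nover the lazy dog.", 0), "]\n";
# Prints: [ over the lazy dog.]
```

The space is kept in the second call because a sector in the middle of a paragraph still needs it to separate words.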

latigerlilly: I wasn't talking about CGI or anything to do with user input. I was talking about HTML rendering, as I'm programming the Tk::HyperText module.
 
Hi Kirsle,

I think the biggest improvement that you could make to your module would be to revamp the HTML rendering. Instead of rolling your own parser, I would advise you to use HTML::TokeParser or some other related engine.


I've made my own markup language before, and while it can be fun to create your own rendering engine, there are too many special cases for doing everything yourself to be very effective. I'm not sure which engine would work best for what you're trying to do, or what specification you would want to apply. Maybe only support HTML snippets, as in fail on <html>, <body>, and <title> tags? Whatever spec you come up with, it's going to be easier if you build on someone's existing code.
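
As a rough idea of what that could look like, here is a minimal sketch that walks a snippet with HTML::TokeParser (from the HTML-Parser distribution on CPAN, so it has to be installed) and collapses whitespace only in text tokens. The @events list is just for illustration; a real renderer would insert into the Tk widget instead.

```perl
use strict;
use warnings;
use HTML::TokeParser;

my $html = "<h1>Tk::HyperText\nDemonstration</h1>\n\n"
         . "<b>The quick brown fox\njumps over the lazy dog.</b>";

my $parser = HTML::TokeParser->new(\$html)
    or die "Could not create parser";

my @events;    # illustrative only: records what a renderer would receive
while (my $token = $parser->get_token) {
    my $type = $token->[0];
    if ($type eq 'S') {                  # start tag: ["S", $tag, $attr, $attrseq, $text]
        push @events, "start:$token->[1]";
    }
    elsif ($type eq 'E') {               # end tag: ["E", $tag, $text]
        push @events, "end:$token->[1]";
    }
    elsif ($type eq 'T') {               # text: ["T", $text, $is_data]
        (my $text = $token->[1]) =~ s/\s+/ /g;
        next unless $text =~ /\S/;       # skip whitespace-only text between tags
        push @events, "text:$text";
    }
}

print "$_\n" for @events;
```

The tokenizer takes care of finding where tags start and stop, so the whitespace rules only ever have to look at plain text.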

- Miller
 