Pattern Matching Expression (HTML tag)

Status
Not open for further replies.

Extension

Programmer
Nov 3, 2004
311
CA
I need some help to come up with a pattern matching expression. I need to process some HTML content and change one specific HTML tag: <input>

So I open the HTML file and place the content into an array. This is part of the array (HTML content) I need to process.

So I want to change this (example):

Code:
<input type="text" name="state" size="32" maxlength="32" value="" />

into this:
Code:
${state}

Process:
1. Grab the input name ("state").
2. Strip the entire <input> tag (from "<input" to "/>").
3. Echo a string that will have the same name as the input name (input name="state" to $state). P.S. The matching strings are all defined.
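
For reference, the three steps above could be sketched in Perl roughly as follows. This is only a minimal sketch: the helper name inputs_to_placeholders is made up for illustration, and it assumes each <input> tag sits on one line and contains no ">" inside its attribute values.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replace every
# <input ... name="xxx" ... /> tag with the placeholder ${xxx}.
sub inputs_to_placeholders {
    my ($html) = @_;
    # Match "<input", any attributes, name="..." (captured in $1),
    # any remaining attributes, then the closing ">" (which also covers "/>").
    $html =~ s/<input\b[^>]*?\bname=["']([^"']+)["'][^>]*>/'${' . $1 . '}'/gie;
    return $html;
}

print inputs_to_placeholders(
    '<input type="text" name="state" size="32" maxlength="32" value="" />'
), "\n";
```

The /e modifier evaluates the replacement as Perl code, which avoids having to escape the literal "${" in an interpolated string.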

Any help would be really appreciated.
 
Is this what you need?

$NewYork = 'NewYork';
$html_tag= '<input type="text" name="NewYork" size="32" maxlength="32" value="" />';

($state = $html_tag) =~ s/^.*\s*name=['"]([^'"]+).*$/$1/io;

print "state=$$state\n"; # Will print "state=NewYork"
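
A note on the $$state part: it is a symbolic reference, meaning the string held in $state is used as the name of another variable. The sketch below shows the same idea under "use strict". It assumes the target is a package variable declared with "our", since symbolic references cannot reach "my" variables. The variable $value is introduced only for illustration.

```perl
use strict;
use warnings;

# Package variable: symbolic references only see package ("our") variables.
our $NewYork = 'NewYork';
my $html_tag = '<input type="text" name="NewYork" size="32" maxlength="32" value="" />';

# Copy the tag, then reduce the copy to the value of its name attribute.
(my $state = $html_tag) =~ s/^.*\s*name=['"]([^'"]+).*$/$1/i;

my $value;
{
    no strict 'refs';       # allow a string to be used as a variable name
    $value = ${$state};     # $state holds "NewYork", so this reads $NewYork
}

print "state=$value\n";
```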
 
Thank you dkyrtata for the help. Really appreciated.

Your Regex does work but the HTML content I'm trying to process has many <> tags in it.

What I'm trying to do is to "echo" the value of the following variables (could be more variables) in the HTML content.

Here are the variables to insert in the HTML content where the <input> name value matches the variable name.

Code:
$city = "Seattle";
$state = "Washington";
$country = "US";


Original HTML content
Code:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>title</title>
</head>

<body>

<strong>City:</strong>
<input type="text" name="city" size="32" maxlength="48" value="" />

<strong>State:</strong>
<input type="text" name="state" size="32" maxlength="32" value="" />

<strong>Country:</strong>
<input type="text" name="country" size="32" maxlength="64" value="" />

</body>
</html>


Output once processed with Regex
Code:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>title</title>
</head>

<body>

<strong>City:</strong>
$city

<strong>State:</strong>
$state

<strong>Country:</strong>
$country

</body>
</html>


Final output
Code:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>title</title>
</head>

<body>

<strong>City:</strong>
Seattle

<strong>State:</strong>
Washington

<strong>Country:</strong>
US

</body>
</html>

I hope I didn't make this look complicated.

Any help would be appreciated.
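
One possible way to go from the original HTML to the final output in a single pass is sketched below. Note that it swaps the separate $city, $state and $country variables for a hash lookup, which is an assumption on my part and not something stated in the posts. The names %value_for and fill_inputs are hypothetical.

```perl
use strict;
use warnings;

# Assumed lookup table standing in for $city, $state and $country.
my %value_for = (
    city    => 'Seattle',
    state   => 'Washington',
    country => 'US',
);

# Hypothetical helper: replace each <input> tag whose name is a key in
# the table with the matching value; leave unknown names untouched.
sub fill_inputs {
    my ($html, $values) = @_;
    $html =~ s{(<input\b[^>]*?\bname=["']([^"']+)["'][^>]*>)}
              {exists $values->{$2} ? $values->{$2} : $1}gie;
    return $html;
}

my $html = <<'HTML';
<strong>City:</strong>
<input type="text" name="city" size="32" maxlength="48" value="" />

<strong>State:</strong>
<input type="text" name="state" size="32" maxlength="32" value="" />
HTML

print fill_inputs($html, \%value_for);
```

Because the whole tag is captured in $1, an input whose name has no entry in the table is put back unchanged instead of being replaced with an empty string.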
 
Ok, try this:

#!/bin/perl

@html=(
q!<html xmlns="http://www.w3.org/1999/xhtml">!
,q!<head>!
,q! <title>title</title>!
,q!</head>!
,q!!
,q!<body>!
,q!!
,q!<strong>City:</strong>!
,q!<input type="text" name="city" size="32" maxlength="48" value="" />!
,q!!
,q!<strong>State:</strong>!
,q!<input type="text" name="state" size="32" maxlength="32" value="" />!
,q!!
,q!<strong>Country:</strong>!
,q!<input type="text" name="country" size="32" maxlength="64" value="" />!
,q!!
,q!</body>!
,q!</html>!
);

$city = "Seattle";
$state = "Washington";
$country = "US";

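foreach $html_tag (@html) {
    # Copy the line; if it has a name attribute, reduce the copy to that name.
    if (($location_type = $html_tag) =~ s/^.*\s*name=['"]([^'"]+).*$/$1/io) {
        print "$$location_type\n";   # symbolic reference: prints $city, $state or $country
    }
    else {
        print "$html_tag\n";
    }
}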
 

Thanks a lot dkyrtata

Really appreciated. I really have to improve my RegEx skills.

How complicated would it be to make the RegEx accept only <input name="xxxx"> or <select name="xxxx">? That way, it would ignore any other tags with "name" in them, such as <meta name="xxxx">.

Thanks again.

 
If you only want to accept "input name" or "select name" with any number of spaces between words, just do this:

if(($location_type = $html_tag) =~ s/^.*\s*(input|select)\s+name=['"]([^'"]+).*$/$2/io) {

But since I added another pair of brackets, I had to change the $1 reference to $2

Regexes become easier the more you use them. But I find it's a lot easier to create them than to decipher them later.
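
If the renumbering from $1 to $2 is a nuisance, a non-capturing group (?:...) is one way around it. The sketch below is only an illustration and uses a plain match instead of the substitution. The helper name field_name is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name attribute when it directly follows
# "input" or "select", otherwise undef.
sub field_name {
    my ($line) = @_;
    # (?:...) groups the alternatives without capturing,
    # so the name is still in $1.
    return $line =~ /(?:input|select)\s+name=['"]([^'"]+)/i ? $1 : undef;
}

for my $tag ('<select name="state">', '<input name="city" />', '<meta name="keywords" />') {
    my $name = field_name($tag);
    print defined $name ? "$tag -> $name\n" : "$tag -> ignored\n";
}
```

As with the regex above, this only matches when name= comes immediately after the tag name.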


 
Thanks again dkyrtata for the help.

I just realized my last post wasn't really accurate.

When I said I would like to only accept <input> and <select>, I didn't tell you there could be some stuff before and after the name="xxx".
Example:
Code:
<input type="text" name="state" size="32" maxlength="32" value="" /> 
<select name="state" size="32" maxlength="32" value="" />

I believe the new RegEx will only support the following format:
Code:
<input type="text" name="state"  />


Still, I should be able to modify your RegEx to make it work... I really need to learn more about RegEx !

Thanks again for your time and your expertise.
 
Rather than complicate the existing regular expression, you can simply add another one with the 2 conditions you mention. Only when one of those conditions is true do you proceed to the original regex.

So just replace the if condition I originally gave you with this one:

if($html_tag =~ m/<((input\s+type)|(select\s+name))=/io && ($location_type = $html_tag) =~ s/^.*\s*name=['"]([^'"]+).*$/$1/io) {


Does that give you what you need?
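
Here is the same two-step check wrapped in a small function so it can be tried on a few sample lines. It is only a sketch: the function name guarded_name is hypothetical, and the guard is written with a non-capturing group but tests the same two conditions.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name attribute only for lines that pass
# the guard ("<input type=" or "<select name="), otherwise undef.
sub guarded_name {
    my ($html_tag) = @_;
    # Step 1, the guard: skip lines that contain neither pattern.
    return undef unless $html_tag =~ m/<(?:input\s+type|select\s+name)=/i;
    # Step 2, the original regex: reduce a copy of the line to the name value.
    (my $name = $html_tag) =~ s/^.*\s*name=['"]([^'"]+).*$/$1/i or return undef;
    return $name;
}

for my $tag (
    '<input type="text" name="state" size="32" maxlength="32" value="" />',
    '<select name="state" size="32" maxlength="32" value="" />',
    '<meta name="keywords" content="x" />',
) {
    my $name = guarded_name($tag);
    print defined $name ? "matched: $name\n" : "skipped: $tag\n";
}
```

One thing to keep in mind: the guard expects type= right after <input, so a tag written as <input name="state" /> would be skipped.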
 
Thanks again dkyrtata

Everything is working fine. Thank you again for the help. Really appreciated.
 