How do I parse XML tag contents?

DR4296 · Feb 10, 2001

Greetings All !

I've always found pattern matching a bit hard to do / remember.

I'm trying to grab an error message out of an XML page and I'm not sure how to tell Perl to "grab everything between the <ERROR> and </ERROR> tags".

From everything I've found by searching Google, I realize that this is not a good way to do it and that I should probably use the XML::Parser module. But I'm only dealing with one line of code here. I've already got lines in place to pull the "success" data I want from the XML pages (pricing data, actually). But I admit that I cut-and-pasted this from elsewhere and I don't really grasp the syntax.

Can anybody tell me how to "grab everything" between a set of XML tags ?

Thanks !

-= Dave Raasch =-

stillflame · Feb 11, 2001

Like this:[tt]

$data =~ m~<ERROR>(.*?)</ERROR>~s ;

[/tt]
It then assigns the part in parentheses to the variable $1, which you can then play with as you see fit. Note that the 's' at the end lets '.' match newlines, so the error text may span several lines; it is unrelated to the 'm' at the front, which just means match. (A trailing 'm' would only change where ^ and $ match, which does not help here.) If the error text only ever occurs on one single line with the error tags, you can remove the trailing 's'.
Also, this presupposes that the XML file has been read into a single string. For other data structures, the code would be a lot different.

"If you think you're too small to make a difference, try spending a night in a closed tent with a mosquito."
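
A minimal sketch of the capture described above, assuming the whole document is already in one string. The sample XML and the variable names are made up for illustration. Testing the match in an if is worth the extra line, because $1 is not reset when a match fails.

```perl
use strict;
use warnings;

# Hypothetical sample document; in practice $data would hold the XML page.
my $data = "<RESPONSE>\n<ERROR>Invalid part\nnumber</ERROR>\n</RESPONSE>";

my $error;
# /s lets '.' match newlines, so the message may span several lines.
if ($data =~ m~<ERROR>(.*?)</ERROR>~s) {
    $error = $1;    # text captured by the parentheses
}

print defined $error ? "Error: $error\n" : "No error found\n";
```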

dshaw21369 · Aug 21, 2002

This is a great example and I'm able to use it for the first error found in the XML, but I'm having trouble finding all of them when there is more than one. Any suggestions?

justice41 · Aug 21, 2002

A slight modification of stillflame's code

Code:

my @errors = $data =~ m~<ERROR>(.*?)</ERROR>~sg;
@errors = map { tr/\n/ /; $_ } @errors; # get rid of pesky newlines.

The /s lets '.' match newlines, and the /g means to find all matches in the string. m//g used in list context returns a list of captured matches.

jaa
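
To make the list-context behaviour concrete, here is a small sketch with made-up input. It also shows the scalar-context form of m//g, which walks the matches one at a time; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical input with two error elements, one spanning two lines.
my $data = "<ERROR>first</ERROR>\n<OK/>\n<ERROR>second\nline</ERROR>";

# List context: every capture is returned at once.
my @errors = $data =~ m~<ERROR>(.*?)</ERROR>~sg;

# Turn embedded newlines into spaces; tr works on each element in place.
tr/\n/ / for @errors;

# Scalar context: the same pattern finds one match per evaluation.
my $count = 0;
$count++ while $data =~ m~<ERROR>(.*?)</ERROR>~sg;

print "$_\n" for @errors;
print "found $count errors\n";
```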

dshaw21369 · Aug 21, 2002

Just what I was looking for. Thanks so much!

dshaw21369 · Nov 28, 2002

Does anyone know how I could parse this:
<SECT ID="31128"></SECT>

I want to get the ID value, but my script returns blanks when I run the loop below. Thanks.

foreach $snapshot (@XML_In_TASK_Phase1_messages)
{
    @snapshot = $snapshot =~ m~<SECT ID=>(.*?)</SECT>>~sg;
    print Debug " @snapshot \n";
    @CES_snap = (@CES_snap, @snapshot);
}

dshaw21369 · Nov 29, 2002

I tried this, but it still did not work:

@snapshot=$snapshot =~ m~<SECT ID="(.*?)"/>~sg;

dummy33 · Nov 29, 2002

try this:

m~<SECT ID="(.*?)"></SECT>~sg
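
The pattern above matches the tag exactly as written. A slightly more tolerant sketch follows, assuming the value is always double-quoted and ID is the first attribute; the sample messages and the @ids array are hypothetical. Using [^"]* instead of .*? means the capture cannot run past the closing quote.

```perl
use strict;
use warnings;

# Hypothetical messages, standing in for @XML_In_TASK_Phase1_messages.
my @messages = (
    '<SECT ID="31128"></SECT>',
    '<SECT  ID="31129"/><SECT ID="31130"></SECT>',
);

my @ids;
for my $message (@messages) {
    # push supplies list context, so /g returns every captured ID.
    push @ids, $message =~ m~<SECT\s+ID="([^"]*)"~g;
}

print "@ids\n";
```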

brt · Jan 21, 2003

Hello, Jaa and others!

Based on the above thread, I worked out the following:

sub replacepattern {
    foreach $source (@filesfound) {
        undef @newfile;
        open (FILE,"<$source") || die "Can't open $source: $!.";
        @file = <FILE>;
        close FILE;
        foreach $line (@file) {
            foreach $string (@tags) {
                ($start,$end) = split(' ',$string);
                if ($line =~ m~$start(.*?)$end~sg) {
                    print "innerpart: ", $1, "\n";
                    $line =~s/$1/NEW_STRING/g;
                    print "whole line: ", $line, "\n";
                }
            }
            push @newfile,$line;
        }
        open (FILE,">$source") || die "Can't open $source: $!.";
        print FILE @newfile;
        close FILE;
    }
}

Although I tried to work in the suggestions from Jaa and the others, the above works only for ONE match within the SAME line. That is:
<starttag> string to be replaced<endtag>
...works ok.

However, this one:
<starttag> string1 replaced<endtag><starttag> string2 replaced<endtag>
...and this one:
<starttag> st
ring to
be repla
ced
<endtag>
...do not work. Only one match per line is replaced.

Any ideas how to fix this? Many thanks in advance for your ideas!

-brt
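
One possible approach to both failing cases, offered as a sketch under assumptions rather than a tested fix for the script above: treat the file's whole contents as one string instead of going line by line, and let a single s///sg replace every occurrence. The tag names, the sample text and NEW_STRING are placeholders. \Q...\E quotes the tags so that any regex metacharacters in them are taken literally.

```perl
use strict;
use warnings;

# Placeholder tag pair, in the "start end" form used by @tags above.
my @tags = ('<starttag> <endtag>');

# Stand-in for a file read in one go, e.g. with: local $/; my $text = <FILE>;
my $text = "<starttag>one<endtag><starttag>two<endtag>\n"
         . "<starttag>spans\ntwo lines<endtag>\n";

for my $string (@tags) {
    my ($start, $end) = split ' ', $string;
    # /g replaces every pair; /s lets '.' cross line breaks.
    $text =~ s~\Q$start\E.*?\Q$end\E~${start}NEW_STRING$end~sg;
}

print $text;
```

Because the substitution is built from the tags rather than from $1, the captured text itself never has to be valid as a pattern.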
