Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations biv343 on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

DB rows wth html -> XML -> HTML

Status
Not open for further replies.

goBoating

Programmer
Feb 8, 2000
1,606
US
Using Perl

DB query->XML->HTML

We're building a prototype of a system to pull product
information from a rdbms, build an XML document from that
data, and finally use a series of XSLTs to transform the
XML doc into a series of web pages.

The code below ends before we get to the XSLT trick (which works).

- The problem -
Some of the data coming from the database contains some html
pieces/parts. When I use the XML::LibXML::Text->new to
build an object of the text/html, the 'Text->new' method converts all of the HTML angle brackets into
Code:
'ampersand lt;'
and
Code:
'ampersand gt;'
.

Consequently, we loose the ability to discriminate between
previously existing '<' chars and those which are the result of pulling the text/html through the 'Text->new' method.

Looking for ideas/strategies to pull the text/html and add it to the XML doc, later to be processed by the XSLTs.

I have tried pulling the data through the the XML::LibXML->parse_html_string but get a core dump when I try to appendChild to the partially built document.

#!/usr/bin/perl
use strict;
use DBI;
use XML::LibXML;
use XML::LibXSLT;

my $dbh;
&db_connect; # connect to db (MS SQL Server)

my $parser = XML::LibXML->new();
my $xslt = XML::LibXSLT->new();

# start a new XML document
my $xml_doc = XML::LibXML::Document->new();
$xml_doc->setEncoding('UTF-8');

# main enclosing element
my $prod_report = XML::LibXML::Element->new('prod_report');
$xml_doc->setDocumentElement($prod_report);

# get data from database - an array or db rows
my @prod_data = &db_query('SELECT ProductId, Name, description, URL
FROM Products
order by ProductId');

my $products = XML::LibXML::Element->new('products');
$prod_report->appendChild($products);

foreach my $ref (@prod_data)
{
my $product = XML::LibXML::Element->new('prodid');
$product->setAttribute('id',$$ref[0]);

my $name = XML::LibXML::Element->new('name');
my $name_text = XML::LibXML::Text->new($$ref[1]);
$name->appendChild($name_text);

my $desc = XML::LibXML::Element->new('desc');

# Need to handle incoming text containing some HTML chunks.
my $desc_text = XML::LibXML::Text->new($$ref[2]);
$desc->appendChild($desc_text);

$product->appendChild($name);
$product->appendChild($desc);

$products->appendChild($product);
}

my $output_string = $xml_doc->toString;
print ($output_string);


#--------------------------------------------------------------------
# db_connect
#--------------------------------------------------------------------
sub db_connect
{
$dbh = DBI->connect(&quot;dbi:Sybase:server=servername&quot;, 'user', 'pass', {PrintError => 0});
unless ($dbh) { die &quot;Unable for connect to server $DBI::errstr&quot;; }
$dbh->do('use db_name');
}

#--------------------------------------------------------------------
# db_query
#--------------------------------------------------------------------
sub db_query
{
my $sql = shift;
my @data;
my $sth = $dbh->prepare(&quot;$sql&quot;);
if($sth->execute)
{ while(my @row = $sth->fetchrow) { push @data, [ @row ]; } }
else { print &quot;ERROR - $DBI::errstr\n&quot;; }
$sth->finish;
return(@data);
} 'hope this helps

If you are new to Tek-Tips, please use descriptive titles, check the FAQs, and beware the evil typo.
 
A perl solution to the above problem....
Wrap the incoming text with html chunks in a
pair of 'div' tags to fake the 'parse_html_string'
method into thinking it is operating on html and
not text.

Code:
#!/usr/bin/perl
use strict;
use CGI;
use CGI::Carp 'fatalsToBrowser';
use CGI::Carp 'warningsToBrowser';
use DBI;
use XML::LibXML;
use XML::LibXSLT;

# connect to sql server 
# create a database handle
my $dbh;
&db_connect;

# set scoping for global vars.
my $xml_doc;
my $parser;
my $xslt;
my $prod_report;

# start the CGI/HTML stuff.
my $cgi = new CGI;
print $cgi->header; #  , $cgi->start_html;
# print $cgi->Dump; # debugger


# catch list of requested pages to build.
my @bins    = $cgi->param('bins');
my @themes  = $cgi->param('themes');
my $prods   = $cgi->param('prods');

# get content from database for requested pages and build xml chunks.
# foreach requested new html page, run 'build_html'.

#--------------------------
# DO BINS
#--------------------------
if ($bins[0])
    {
    $xml_doc = XML::LibXML::Document->new();
    $parser = XML::LibXML->new();
    $xslt   = XML::LibXSLT->new();
    $prod_report = XML::LibXML::Element->new('prod_report');
    $xml_doc->setDocumentElement($prod_report);

    my %bin_hash;
    # sql has been sanitized to guard our systems.
    my @bin_data = &db_query('SELECT field1, field2, field3, etc
           from table1, table2, table3
           where table1.field1 = table2.field1
	       and table2.field2 = table3.field4
           order by id, name');

    my $bins = XML::LibXML::Element->new('bins');
    $prod_report->appendChild($bins);

    my $bin_node;
    my $prev_bin = '';
    foreach my $ref (@bin_data)
    	{
	    # print &quot;<p>$$ref[0] <br /> $$ref[1] <br /> $$ref[2]</p>\n&quot;;
	    my $bin = $$ref[0];
	    if ($prev_bin ne $bin)
	    	{
    		$bin_node = XML::LibXML::Element->new('bin');
	    	$bin_node->setAttribute('id',$bin);
	    	my $bin_desc = XML::LibXML::Element->new('bin_desc');
            $$ref[1] = &fix_chars($$ref[1]);
		    my $desc_text = XML::LibXML::Text->new($$ref[1]);
	    	$bin_desc->appendChild($desc_text);
	    	$bin_node->appendChild($bin_desc);
	    	$bins->appendChild($bin_node);
	    	}
        $$ref[3] = &fix_chars($$ref[3]);
        $$ref[4] = &fix_chars($$ref[4]);
        my $prod_ref = &build_prods($$ref[3], $$ref[4], $$ref[5]);
        $bin_node->appendChild($prod_ref);
	    $prev_bin = $bin;
	    }

    foreach my $bin (@bins)
        {
	    # print &quot;<p>Requesting bin: $bin </p>\n&quot;;
        &build_html('bin','bin.xsl',$bin);
        }
    }

#--------------------------------------------------------------------
#                           
#--------------------------------------------------------------------
if ($themes[0]) { 
# contents of sub cut for brevity
# do similar for themes as was done for bins
# build xml document and then pass to build_html to output pages.
}
#--------------------------------------------------------------------
#                           
#--------------------------------------------------------------------
if ($prods) { 
# contents of sub cut for brevity
# do same for prods as was done for bins
# build xml document and then pass to build_html to output pages.
}
#--------------------------------------------------------------------
#                           build_html
#--------------------------------------------------------------------
sub build_html
{
#   [bin|theme|program], [style sheet] , item id for [bin|theme|program]
my ($type,$style,$item) = @_;
print &quot;<p>Building: $type <br />$style<br />$item</p>\n&quot;;

# apply xslt to xml object;
my %xslt_input;
my $style_doc  = $parser->parse_file(&quot;./xslt/$style&quot;);
my $stylesheet = $xslt->parse_stylesheet($style_doc);
if ($item) { $xslt_input{'itemId'} = $item; }
my $results = $stylesheet->transform($xml_doc, %xslt_input);
my $output =  $stylesheet->output_string($results);

my $output_file = ($type eq 'program') ? $type.'.html' : $type.'_'.$item.'.html';
print $output;
open(HTML,&quot;>./html/$output_file&quot;) or die &quot;Failed to open XML for pretty out, $!\n&quot;;
print HTML &quot;$output&quot;;
close HTML;
}

#--------------------------------------------------------------------
#                           fix_chars
#--------------------------------------------------------------------
sub fix_chars
{
my $buffer = shift;
$buffer = XML::LibXML::encodeToUTF8('ISO-8859-1',&quot;$buffer&quot;);
return($buffer);
}

#--------------------------------------------------------------------
#                           db_connect
#--------------------------------------------------------------------
sub db_connect
{
$dbh = DBI->connect(&quot;dbi:Sybase:server=db_server&quot;, 'user', 'pass', {PrintError => 0});
unless ($dbh) { die &quot;Unable for connect to server $DBI::errstr&quot;; }
$dbh->do('use db_name');
}

#--------------------------------------------------------------------
#                           db_query
#--------------------------------------------------------------------
sub db_query
{
my $sql = shift;
my @data;
my $sth = $dbh->prepare(&quot;$sql&quot;);
if($sth->execute)
	{ while(my @row = $sth->fetchrow) { push @data, [ @row ]; } }
else { print &quot;ERROR - $DBI::errstr\n&quot;; }
$sth->finish;
return(@data);
}

#--------------------------------------------------------------------
#                           build_prods
#--------------------------------------------------------------------
sub build_prods
{
# name, desc, url
my @array  = @_;
my $product = XML::LibXML::Element->new('prod');

my $name = XML::LibXML::Element->new('name');
my $name_text = XML::LibXML::Text->new($array[0]);
$name->appendChild($name_text);
$product->appendChild($name);


my $desc = XML::LibXML::Element->new('desc');
[red]
Code:
# the product description may have some embedded
# chunks of HTML. Using the 'Text' method (see previous post)
# converted all angle brackets into > or <
# thus corrupting the HTML.  
# The 'parse_html_string' method (below) expects html
# and failed on plain text.  So, we simply wrap the 
# text in a pair of <div> tags and thus faking
# the 'parse_html_string' method into working.
my $html = '<div>'.$array[1].'</div>';
my $tmp_doc = $parser->parse_html_string($html);
my $tmp_root = $tmp_doc->getDocumentElement;
my $div_node = $ { $tmp_root->getElementsByTagName('div') }[0];
$desc->appendChild($div_node);
[/red]
Code:
$product->appendChild($desc);

my $url = XML::LibXML::Element->new('url');
my $url_text = XML::LibXML::Text->new($array[2]);
$url->appendChild($url_text);
$product->appendChild($url);

return($product);
}

#--------------------------------------------------------------------
#                           xml_dump
#--------------------------------------------------------------------
sub xml_dump
{
open(DUMP,&quot;>xml_dump.xml&quot;) or die &quot;Failed to open dump, $!\n&quot;;
my $string = $xml_doc->toString();
print DUMP &quot;$string\n&quot;;
close DUMP;
return();
}
'hope this helps

If you are new to Tek-Tips, please use descriptive titles, check the FAQs, and beware the evil typo.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top