"AJAX" for HTML, not XML

tystent · Nov 28, 2005

I've read through the forum threads about AJAX, explored Captain-AT's how-to's, and played with various examples from this site and others, but I'm unable to make a certain step in my application. A third party at my domain provides a database lookup webpage: the URL takes a single ? parameter, which they look up in their database, and they format the database response into a simple webpage. But I want to extract the actual data items from that page (each one is in its own <table><tr><td> element) in JS and format them my own way within my own page. I can get the HTML text in http_request.responseText (not .responseXML, since it's not XML). And I know how to parse through an XML document's nodes to get the values I want. But I can't figure out how to convert that HTML string into [something like] an XML document that I can parse. Any suggestions?

And yes, I have asked the other programmer to make the original database response available to me as XML, but that might take a while...

Sheco · Nov 28, 2005

You mean like the DOM parse of HTML? [3eyes]

tystent · Nov 28, 2005

Right, DOM parse of HTML. Just an example would probably help.

cLFlaVA · Nov 28, 2005

well, this completely depends on what you're going to be accessing. you could get an array of table elements like so:

Code:

var tableArray = document.getElementsByTagName("TABLE");

and you could then loop through the rows and data cells to extract the data you need. If you want any more specific help, I'm afraid you'll need to be more specific.
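
A minimal sketch of that loop, assuming each value sits in an ordinary table cell. The function name collectCellText is made up for illustration, and textContent is assumed to be available (older Internet Explorer versions expose innerText instead):

```javascript
// Hypothetical helper: gather the trimmed text of every table cell under `root`.
// `root` is anything that offers getElementsByTagName, e.g. `document`.
function collectCellText(root) {
  var values = [];
  var tables = root.getElementsByTagName("table");
  for (var t = 0; t < tables.length; t++) {
    var rows = tables[t].rows;
    for (var r = 0; r < rows.length; r++) {
      var cells = rows[r].cells;
      for (var c = 0; c < cells.length; c++) {
        // textContent is assumed; fall back to innerText where it is missing
        var text = cells[c].textContent || cells[c].innerText || "";
        values.push(text.replace(/^\s+|\s+$/g, ""));
      }
    }
  }
  return values;
}
```

Called as collectCellText(document), it only sees the page the script is running in, not a page fetched as a string.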

*cLFlaVA
----------------------------
[tt]I already made like infinity of those at scout camp...[/tt]
[URL unfurl="true"]http://www.coryarthus.com/[/url]
[banghead]

damber · Nov 29, 2005

And here's a tutorial/reference link to get you started:

http://www.w3schools.com/htmldom/default.asp

A smile is worth a thousand kind words. So smile, it's easy!

manarth · Nov 29, 2005

tystent said:
I can get the HTML text in http_request.responseText (not .responseXML, since it's not XML).

The [tt]responseText[/tt] property returns the server response as a string, which the DOM methods can't work on directly.
You'll need to use JavaScript's string handling functions to extract the particular information you need.

The W3Schools reference on String objects may be of some use, or you could google for 'JavaScript String' - there's lots of useful information.

JavaScript also provides regex functions, if you're familiar with regular expressions.
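
As a rough sketch of the regex route, assuming the values sit in simple, non-nested <td> elements (extractCells and the sample markup are made up for illustration):

```javascript
// Hypothetical helper: return the text inside each <td>...</td> of an HTML string.
function extractCells(html) {
  var cellPattern = /<td[^>]*>([\s\S]*?)<\/td>/gi;
  var cells = [];
  var match;
  while ((match = cellPattern.exec(html)) !== null) {
    var text = match[1]
      .replace(/<[^>]+>/g, "")    // drop any tags nested inside the cell
      .replace(/^\s+|\s+$/g, ""); // trim surrounding whitespace
    cells.push(text);
  }
  return cells;
}

// Made-up sample in the shape described in the first post.
var sample = "<table><tr><td>Smith</td></tr></table>" +
             "<table><tr><td><b>Jones</b> </td></tr></table>";
var names = extractCells(sample);
```

With the real response this would be called as extractCells(http_request.responseText). It will misbehave on nested tables, so treat it as a starting point only.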

---
Marcus
better questions get better answers - faq581-3339
accessible web design - zioncore.com

theniteowl · Nov 29, 2005

tystent, I am doing something similar but for different reasons.
I grab the innerHTML content of a div on the page and parse through it for form fields, modify those fields to suit my own needs and write it back into the string. This may be along the lines of what you want to do.

Code:

function formatform()
{
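  var mydiv = document.getElementById('myform').innerHTML;
  var sStart='';
  var sMiddle='';
  var sEnd='';
  var sNewString='';
  // objForm is not defined in this snippet; assumed here to be the first form inside the 'myform' div
  var objForm = document.getElementById('myform').getElementsByTagName('form')[0];
  var elArr = objForm.elements;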
  for(var i=0; i<elArr.length; i++)
  {
    var ischecked=(elArr[i].checked)?' CHECKED':'';
    var fldname=(elArr[i].name)?' name="new'+elArr[i].name+'"':'';
    sMiddle=(elArr[i].name)?' name='+elArr[i].name:'';
    var fldvalue=(elArr[i].value)?' value="'+elArr[i].value+'"':'';
    var rawvalue=(elArr[i].value)?elArr[i].value:'';
    var fldlength=(elArr[i].value)?' size="'+elArr[i].value.length+'"':'';
    var fldsize=(elArr[i].size)?' size="'+elArr[i].size+'"':'';
    var fldmaxlength=(elArr[i].maxLength)?' maxlength="'+elArr[i].maxLength+'"':'';
    var fldrows=(elArr[i].rows)?' rows="'+elArr[i].rows+'"':'';
    var fldcols=(elArr[i].cols)?' cols="'+elArr[i].cols+'"':'';
    var fldclass=(elArr[i].className)?' class="'+elArr[i].className+'"':'';
    switch (elArr[i].type) {
      case 'radio': sStart = '<INPUT'; sEnd = '>'; sNewString = '<INPUT type="radio" disabled' + ischecked + fldname + fldclass + fldvalue + '>'; break;
      case 'checkbox': sStart = '<INPUT'; sEnd = '>'; sNewString = '<INPUT type="checkbox" disabled' + ischecked + fldname + fldclass + fldvalue + '>'; break;
      case 'select-one': sStart = '<SELECT'; sEnd = '/SELECT>'; sNewString = '<INPUT type="text" readOnly' + fldname + fldvalue + fldclass + fldlength + '>'; break;
      case 'text': sStart = '<INPUT'; sEnd = '>'; sNewString = '<INPUT type="text" readOnly' + fldname + fldvalue + fldsize + fldmaxlength + fldclass + '>'; break;
      case 'textarea': sStart = '<TEXTAREA'; sEnd = '/TEXTAREA>'; sNewString = '<TEXTAREA readOnly' + fldname + fldrows + fldcols + fldclass + '>' + rawvalue + '</TEXTAREA>'; break;
      case 'button': sStart = '<INPUT'; sEnd = '>'; sNewString = '<INPUT type="button" style="visibility:hidden;"' + fldname + fldclass + '>'; break;
      case 'submit': sStart = '<INPUT'; sEnd = '>'; sNewString = '<INPUT type="submit" style="visibility:hidden;"' + fldname + fldclass + '>'; break;
      case 'reset': sStart = '<INPUT'; sEnd = '>'; sNewString = '<INPUT type="reset" style="visibility:hidden;"' + fldname + fldclass + '>'; break;
      default: sNewString = '';
    }
    if (sNewString != '' && sMiddle != '')
    {
      var oRE = new RegExp(sStart + "[^>]*?" + sMiddle + ".*?" + sEnd, "i");
      mydiv = mydiv.replace(oRE, sNewString);
    }
  }
  document.getElementById('strOut').value = mydiv;
  return true;
}

In the code above I am pulling the innerHTML of the div called myform and storing it in a variable called mydiv.
I loop through the DOM looking up every form element one by one, grabbing the type of the element so that I know what beginning and end text to search for and grabbing the name so that I know what text would appear somewhere inside the element. Using these three parameters I use a regular expression to locate that specific substring in the mydiv variable and replace it with my modified code.

You may be able to use a similar approach to extract the data that you need from the remote page and return it as a string. You would need to identify something unique in each field to give the regular expression something to anchor to, then grab all the text between the start and end strings that surround that unique text.
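
A small sketch of anchoring on unique text, assuming (purely for illustration) that each value follows a cell holding a known label. The names escapeRegExp and valueAfterLabel and the sample row are hypothetical:

```javascript
// Escape characters that have special meaning in a regular expression,
// so the label is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return the contents of the cell that follows
// a cell containing `label`, or null when no such pair is found.
function valueAfterLabel(html, label) {
  var pattern = new RegExp(
    "<td[^>]*>\\s*" + escapeRegExp(label) + "\\s*</td>\\s*<td[^>]*>([\\s\\S]*?)</td>",
    "i"
  );
  var match = pattern.exec(html);
  return match ? match[1].replace(/^\s+|\s+$/g, "") : null;
}

// Made-up sample row.
var row = "<tr><td>Name:</td><td> Alice Smith </td></tr>";
var found = valueAfterLabel(row, "Name:");
```

Escaping the label matters as soon as it contains characters such as parentheses or a dollar sign, which would otherwise be read as part of the pattern.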

Let us know how it turns out. A modification of my current project will need to do something very similar: I want to read a remote HTML file and display a modified version of its form data, where the existing fields can be selected by clicking, so that a string of related parameters can be assigned and output as a separate bit of code. So I will be displaying the HTML version of the form, modified by inserting tags around the fields to make them clickable so that they return the field names and properties.

Paranoid? ME?? WHO WANTS TO KNOW????

tystent · Nov 29, 2005

Well, isn't that a shame. I would have thought that I could create a new object of the same type as the current "document" (what type is that, anyway?) so that I could go at it the same way I can work on the current document. In other words, parse the DOM of a different page than the one I'm currently in. The page I'm trying to parse isn't under my control; doesn't have my JS in it.

Guess it's the regex regimen for me, then....
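
For what it's worth, current browsers do offer a way to build a separate document from a string: DOMParser with the "text/html" type. A minimal sketch, where parseHtmlString and its optional second parameter are made up for illustration:

```javascript
// Parse an HTML string into a separate Document without touching the current page.
// ParserCtor defaults to the browser's DOMParser; it is a parameter only so the
// function can be exercised outside a browser.
function parseHtmlString(html, ParserCtor) {
  var Ctor = ParserCtor || DOMParser;
  return new Ctor().parseFromString(html, "text/html");
}

// Browser usage (http_request is the XMLHttpRequest from the first post):
//   var doc = parseHtmlString(http_request.responseText);
//   var cells = doc.getElementsByTagName("td");
```

The HTML parser is lenient, so non-validating markup should still produce a document that can be walked with the usual DOM methods.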

theniteowl · Nov 29, 2005

That does complicate things.
It depends a lot on what type of manipulation you have to perform on the string.
Is the format of the data on the page predictable? Is there always the same amount of data or always the same field names? If so it will be easy to parse through the data to find what you need given that you already know what key text to look for so that you can extract it.
If the page provides data dynamically, it should not be a problem as long as the field names are in a predictable sequence.


tystent · Nov 30, 2005

I did experiment a bit with creating a new document from the HTML text returned by the query (obj = document.open(...), obj.write(the http response text)) and then using the DOM to find things in it, but it kept bogging down on the totally non-validating HTML that they are feeding me.

Eventually I just switched to string-searching since their response is quite predictable (just a list of names with a lot of HTML decoration, really). Works just fine at home, on a recorded response file.

But when I moved it all out to the real site and used it on the live link, I got "permission denied", I guess because the link I'm GETting from is on a different server from where my page is (same domain). Yet I am allowed to just place a hyperlink on my page to their page and it of course works fine. What permissions would need to change on their server to allow this to work? Or does that question make sense?

manarth · Nov 30, 2005

sounds like you're running into barriers designed to prevent cross site scripting.

my only suggestion is to run a server-side script, instead of client side.
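
a rough sketch of what that could look like, assuming the server-side script is written in JavaScript for Node 18 or later (which has a built-in fetch). The upstream address, the function names and the port are all placeholders, not anything from the real site:

```javascript
// Hypothetical address of the third-party lookup page; replace with the real one.
var LOOKUP_BASE = "http://lookup.example.com/search";

// Build the upstream URL by carrying over the query string of the incoming request.
function buildLookupUrl(requestUrl) {
  var search = new URL(requestUrl, "http://localhost").search; // "?..." or ""
  return LOOKUP_BASE + search;
}

// Request handler for Node's http module: fetch the lookup page on the server
// and relay its HTML to the browser from our own origin.
function handleLookup(req, res) {
  fetch(buildLookupUrl(req.url))
    .then(function (upstream) { return upstream.text(); })
    .then(function (html) {
      res.writeHead(200, { "Content-Type": "text/html" });
      res.end(html);
    })
    .catch(function () {
      res.writeHead(502, { "Content-Type": "text/plain" });
      res.end("Lookup failed");
    });
}

// To serve it: require("http").createServer(handleLookup).listen(8080);
```

the page script then requests the relay on its own server instead of the third-party server, so the browser no longer sees a request to a different host.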


theniteowl · Dec 1, 2005

I do not know what limitations might be on XMLHTTP requests to remote servers.
The first question is, if it is a limitation imposed to prevent cross site scripting then is it incorporated at the browser end or at the server? I do not know much on the subject but if it is browser side it may be a trust setting they can apply for your server since you are on the same domain.

I am very interested in the outcome but the question has gone beyond what I have experience with. Maybe someone else here has ideas?

