simple dom problem

Status
Not open for further replies.

someoneneedshelps

Technical User
Mar 3, 2016
75
GB
I'm searching some HTML to pull the £4.50 out of this:

Code:
<span id="fshippingCost" class="notranslate sh-cst "><span>£4.50</span>

Using this code:
Code:
	foreach($html->find('span[id=fshippingCost]') as $shippingcost)
	{
		$arrayresults[] = $shippingcost->plaintext;
		$shippingcoststr = $shippingcost->plaintext;
	}

The strange thing is that the result comes back as £17.33. Why?
 
Using what DOM parser??

Chris.

Indifference will be the downfall of mankind, but who cares?
Time flies like an arrow, however, fruit flies like a banana.
Webmaster Forum
 
Is it possible there's another span with that ID somewhere else in the HTML that has that value?

Are you doing anything else with the array before outputting to screen?

How many elements end up in the $arrayresults array?
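
The three questions above can be checked in one go. What follows is only a minimal sketch: it assumes PHP's built-in DOMDocument and DOMXPath rather than the parser used in the first post, and a made-up page in which the same id appears twice. The helper name textOfSpansWithId is hypothetical. It also shows why the loop in the first post would report the last match if there is more than one: $shippingcoststr is overwritten on every iteration.

```php
<?php
// Hypothetical page: the same id appears twice, which is invalid HTML
// but does happen on real pages.
$page = '<div>'
      . '<span id="fshippingCost"><span>£4.50</span></span>'
      . '<span id="fshippingCost"><span>£17.33</span></span>'
      . '</div>';

/**
 * Return the trimmed text of every <span> carrying the given id.
 */
function textOfSpansWithId(string $html, string $id): array
{
    $doc = new DOMDocument();
    // The XML prolog hints UTF-8 so that the pound sign survives parsing;
    // @ silences warnings about imperfect markup (such as duplicate ids).
    @$doc->loadHTML('<?xml encoding="UTF-8">' . $html);

    $xpath = new DOMXPath($doc);
    $texts = [];
    foreach ($xpath->query("//span[@id='" . $id . "']") as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

$matches = textOfSpansWithId($page, 'fshippingCost');
echo count($matches) . " match(es)\n";
print_r($matches);

// Same pattern as the loop in the first post: the variable keeps the LAST match.
$last = null;
foreach ($matches as $text) {
    $last = $text;
}
echo "Last value kept: " . $last . "\n";
```

If the count is greater than one, printing every element of the array (not only the last variable) should show where the unexpected value comes from.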



----------------------------------
Phil AKA Vacunita
----------------------------------
Ignorance is not necessarily Bliss, case in point:
Unknown has caused an Unknown Error on Unknown and must be shutdown to prevent damage to Unknown.

Web & Tech
 
I have just changed it to this, which still comes up with £17.33. That value doesn't exist in the HTML; see above.
Code:
foreach($html->find('span[class=sh-cst]') as $key => $info) 
{
	echo ($key + 1).'. '.$info->plaintext."<br />\n";
}
 
If I were to make an efficient attempt to extract the £4.50, what would be the simplest method? I have other values to track as well, so if I could crack this one that would be marvellous.
 
I've just done this to make sure I'm not going mad, and I get the correct result, so what is the problem with the code above?

Code:
	$html1 = str_get_html('<span id="fshippingCost" class="notranslate sh-cst "><span>£4.50</span>');
	$html1->find('span[class=sh-cst]');
	echo "<br>meeeee".$html1->plaintext."<br>";
 
OK, what do I search for when locating span elements? Should I just use all elements and pull out the id of the one I want? That's not very good if modifications are made.
 
Should I just use all elements and pull out the id of the one I want?
Yes

That's not very good if modifications are made.
And??

That's just the way it is, your task in that case is to detect something has changed and handle it in some way.


Chris.

 
I'm still getting the wrong value out and I'm baffled. I used 'span span span' to narrow the list down and it's still the wrong value.
 
Have you tried removing the '£' and casting the resulting string to a floating point value before trying any mathematical operations?

Have you done any "debugging 101" steps, such as writing the 'scraped' string value to the browser to verify that you are getting the results you expect from the DOM extraction?
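
The first suggestion could look something like this. It is a minimal sketch, assuming the scraped text is a sterling amount such as "£4.50", possibly with the pound sign still encoded as an HTML entity or with stray whitespace around it. priceToFloat is a hypothetical helper name, not part of any parser.

```php
<?php
/**
 * Convert a scraped price string such as "£4.50" to a float.
 * Returns null when no number can be found.
 */
function priceToFloat(string $text): ?float
{
    // Turn entities such as &pound; back into characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Drop thousands separators so "1,234.50" reads as 1234.50.
    $text = str_replace(',', '', $text);
    // Take the first run of digits with an optional decimal part.
    if (preg_match('/\d+(?:\.\d+)?/', $text, $m) !== 1) {
        return null;
    }
    return (float) $m[0];
}

// Print the raw string first (the "debugging 101" step), then the number.
$scraped = ' &pound;4.50 ';
var_dump($scraped);
var_dump(priceToFloat($scraped));
```

Dumping the raw string before converting it makes it easier to tell whether the wrong value comes from the extraction or from later processing.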



Chris.

 
I tried your first suggestion and it turns out the value really is 17.33. TODAY it's 17.34?

When you say scraping, do you mean looking for a word? If so, I did this and it comes back negative. Here's the code; does it need any adjustment?

Code:
if(scraping_for_text($url,'Buy it now'))
   $binstr = "Buy it now";
else
   $binstr = "Not Found";

function scraping_for_text($iUrl,$iText)
{
	echo "iUrl=".$iUrl."<br />";
	echo "iText=".$iText."<br />";

    // create HTML DOM
    $html = file_get_html($iUrl);

    // get text elements
    $aObj = $html->find($iText);
    if (count($aObj) > 0)
    {
	   echo "<BR>Num = " . count($aObj) . "<br>";
       $condition = True;
	   echo "<h4>Found ".$iText."</h4>";
    }
    else
    {
	   echo "<BR>Num = " . count($aObj) . "<br>";
       $condition = false; 
	   echo "<h4>No ".$iText." found"."</h4>";
    }    

    // clean up memory
    $html->clear();
    unset($html);

    return $condition;
}
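
One thing to check in the function above: find() in Simple HTML DOM Parser takes a CSS-style selector, so a phrase such as 'Buy it now' is read as a selector rather than as text to look for, which may explain the negative result. Below is a minimal sketch of a plain text check instead, using only built-in PHP string functions on markup that has already been fetched. pageContainsText and the sample markup are hypothetical.

```php
<?php
/**
 * Report whether a phrase appears in the visible text of an HTML string.
 * The match is case-insensitive and ignores differences in ASCII whitespace.
 */
function pageContainsText(string $html, string $phrase): bool
{
    // Remove script and style bodies, then the remaining tags.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace so "Buy  it now" still matches.
    $text   = preg_replace('/\s+/', ' ', $text);
    $phrase = preg_replace('/\s+/', ' ', trim($phrase));

    return stripos($text, $phrase) !== false;
}

// Made-up markup: the phrase is in a link, and a different word sits in a script.
$sample = '<div><a class="btn">Buy  it now</a><script>var x = "Sold";</script></div>';
var_dump(pageContainsText($sample, 'Buy it now'));
var_dump(pageContainsText($sample, 'Sold'));
```

With the parser from this thread, the same check could presumably be run against the plaintext of the loaded document instead of the raw markup.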
 
I've tried everything and I've got a stupid bug I cannot solve. How does £4.50 become, as of now, £17.34?
 
"Scraping" is a term used to describe the action of requesting a URL and then extracting the text from the response rather than rendering and displaying it in a browser.

What is the URL of the page you are extracting from?

And when you said it is "simple dom html parser", should that be Simple HTML DOM Parser, meaning this component? Or this? Or something else?

Chris.
 
I'm extracting from many pages of a similar format, i.e. <span id=thesameineverypage> etc., and yes, it's the component above that I'm using.
 
I'm extracting from many pages of a similar format,

Just a representative one will do. Without that how exactly are we supposed to figure out all the possible problems?

Chris.

 
This has never failed:
Code:
	foreach($html->find('span[itemprop=price]') as $price)
	{
		$arrayresults[] = $price->plaintext;
		$pricestr = $price->plaintext;
	}

The earlier shipping-cost code is failing, and it's the same idea as this one: pull out the span element that has a given attribute, here itemprop = price.
 

I have no idea. Both are there; one works, one doesn't.

Okay, I'll point out what it probably demonstrates then.


It demonstrates that there are differences in the source of the data.

If it works with 'A' but not with 'B' then 'B' is different to 'A'
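
One way to look for that difference is to inspect the markup the script actually receives, which may not be the same as what a browser shows. The following is a minimal sketch, assuming the fetched page is already held in a string. contextAround is a hypothetical helper and the sample markup is made up.

```php
<?php
/**
 * Return every occurrence of $needle in $html together with up to $radius
 * bytes of surrounding markup, so each hit can be inspected by eye.
 */
function contextAround(string $html, string $needle, int $radius = 40): array
{
    if ($needle === '') {
        return []; // an empty needle would match everywhere
    }
    $hits   = [];
    $offset = 0;
    while (($pos = strpos($html, $needle, $offset)) !== false) {
        $start  = max(0, $pos - $radius);
        $length = ($pos - $start) + strlen($needle) + $radius;
        $hits[] = substr($html, $start, $length);
        $offset = $pos + strlen($needle);
    }
    return $hits;
}

// Made-up markup standing in for the fetched page.
$fetched = '<span id="fshippingCost" class="sh-cst"><span>12.00</span></span>'
         . '<p>unrelated</p>'
         . '<span id="fshippingCost" class="sh-cst"><span>17.33</span></span>';

foreach (contextAround($fetched, 'fshippingCost') as $i => $snippet) {
    // htmlspecialchars() keeps the tags visible if this is viewed in a browser.
    echo ($i + 1) . ': ' . htmlspecialchars($snippet, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . "\n";
}
```

Comparing these snippets with the browser's "view source" for the same URL should show whether 'A' and 'B' really have the same structure.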

Chris.

 