preg_match_all

Status
Not open for further replies.

elck

Programmer
Apr 19, 2004
176
NL
Hi there!

I am trying to read all the <td>..</td> blocks from a webpage (to convert it to a db-driven site)

I don't really understand preg, but I thought this would come in very handy.
At first it seemed to work, but now I get strange results.
What am I doing wrong here?

Code:
$voidstring=preg_match_all('/<td([^`]*?)<\/td/i',$buffer,$tds);
foreach ($tds as $a) { ...etc
I have a feeling that I am making things very complicated
 
Hi,
Your buffer could be multiline (containing newline characters).
You have to use the `m' switch,
so try:
Code:
preg_match_all('/<td[^>]*?>(.*?)<\/td/mi',$buffer,$tds);

Also, check that your <TD> tags are always closed with </TD>.

PM

 
A few notes:
-) I would use both the 'i' and the 's' pattern modifiers so the . (dot) char includes newlines. There is the possibility that a table cell contains newline characters before the closing tag.
-) What is it that you want to extract? The content of the <td>?
-) You need to take into account that the opening <td> tag might have attributes like class or align.
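
To illustrate the first note, here is a minimal sketch comparing the 'm' and 's' modifiers. The sample $buffer is made up for the demonstration and is not taken from the original page; the only point is that 'm' changes how ^ and $ behave, while 's' is the modifier that lets the dot match a newline.

```php
<?php
// Hypothetical sample input: the second cell spans two lines.
$buffer = "<tr><td class=\"a\">one</td><td>two\nlines</td></tr>";

// 'm' only affects the ^ and $ anchors, so the dot still stops at "\n"
// and the second cell is not matched.
$countM = preg_match_all('/<td[^>]*>(.*?)<\/td>/mi', $buffer, $withM);

// 's' (dot-all) lets the dot cross line breaks, so both cells match.
$countS = preg_match_all('/<td[^>]*>(.*?)<\/td>/si', $buffer, $withS);

echo "matches with m: $countM, matches with s: $countS\n";
```

With this input, the 'm' version misses the cell that contains the line break, which may be one source of the "strange results".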

Let's see if we can come up with a better expression:
Code:
$pattern = "/<td[^>]*>(.*)<\/td>/Usi";
I used the U pattern modifier to make the whole pattern ungreedy. The [^>]* part catches all possible attributes within the <td> tag.
If you want a more detailed explanation, let me know.
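
For reference, a short sketch of how the pattern could be used and how the result array is laid out. The $buffer content below is a made-up stand-in for the real page, and $cells is just an illustrative variable name. Note that with the default flags preg_match_all() fills $tds[0] with the full matches and $tds[1] with the captured cell contents, so looping over $tds itself walks through those two groups rather than through the individual cells.

```php
<?php
// Hypothetical page fragment standing in for the real $buffer.
$buffer = '<table><tr><td class="name">Alice</td><td align="right">42</td></tr></table>';

$pattern = "/<td[^>]*>(.*)<\/td>/Usi";
preg_match_all($pattern, $buffer, $tds);

// $tds[0] = full matches including the tags,
// $tds[1] = text captured by the first parenthesised group.
$cells = array();
foreach ($tds[1] as $content) {
    $cells[] = trim($content);
}

print_r($cells);
```

If one entry per cell is more convenient, passing PREG_SET_ORDER as the fourth argument groups the results by match instead of by subpattern.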
 
Thanks guys, your patterns work.
I tried them in a separate test file.

My main file does not work yet, so I am messing up somewhere else.

At least now I know what part is fine ;)

See ya!
 