Hi everybody,
I have to check whether a query result contains an entire word or only a part of one, in order to decide whether to consider it valid. The core of the algorithm is the following command:
preg_match_all("/([a-z]|[A-Z]|)".$topic."([a-z]|[A-Z]|)/i", $result[$j], $content1);
(This is a PHP function, but as far as I know you Perl guys are the RegExp Wizards)
As you can see, it checks whether there are alphabetic characters immediately around my topic. It does its job pretty well, except when $topic contains an accented character. E.g. $topic = "probléme" won't match a string that contains probléme, because of that 'é'. However, with $topic = "probleme" it does match probleme in a string containing it.
It really puzzles me. Does anybody know an explanation for this behavior?
Thank you in advance for any help.
Max.