I'm having a problem with a regular expression. Not a good week: it
seems like I keep hitting a wall of problems. I know this has surely been
asked a thousand times, but I looked around and didn't find anything. If you
find something, please point me to it.
So here is what I want to do: I need to parse an XML document, but before
parsing it I need to get rid of the bad HTML that I don't want. The
document also requires some markup that I do need, so I don't want
to get rid of all the HTML.
I already wrote a little bit of code that picks out
my good elements and checks for bad stuff. The only problem is that
"text<text-1" is good content, but I need to change the < to &lt; or it will
break my XML parser.
Here is what I tried:
$simple = <<<XMLDATA
<?xml version='1.0'?>
<!DOCTYPE chapter SYSTEM "/just/a/test.dtd" [
<!ENTITY plainEntity "FOO entity">
<!ENTITY systemEntity SYSTEM "xmltest2.xml">
]>
<item>
text
<bad stuff>
text<text-1
text
<image title="Ceci est mon titre2" description="Ceci est ma
description"
link="http://www.windplanet.com/"
url="http://www.windplanet.com/images/news/988991159.gif"
align="left" width="235" height="131" size="13310"/>
text
text
<image title="Ceci est mon titre" description="Ceci est ma description"
link="http://www.windplanet.com/"
url="http://www.windplanet.com/images/news/988991159.gif" align="left"
width="235" height="131" size="13310"/>
</item>
XMLDATA;
//$simple = str_replace("\n\n"," <br/> <br/> ",$simple);
/* find all the < except those followed by one of these ... */
$data = $simple;
print $data;
if(preg_match_all("/\<(?:(?:\!|\/|\?|)(?:<!xml|<!DOCTYPE|<!ENTITY|<!image|<!item|))/",$data,$cbadhtml)){
    foreach ($cbadhtml as $key => $myarray) {
        foreach ($myarray as $key2 => $myarray2) {
            print "<p><font color='red'>You can't use HTML here so " .
                htmlentities($myarray2) . " is not allowed</font></p>\n";
        }
    }
    // what html? we exit
    //exit;
}
It finds all the < but doesn't skip the ones that I accept, so how can I
find only the bad < and transform them to &lt;?
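
One possible direction, as a minimal sketch rather than a complete solution: use a negative lookahead so that a < is matched only when it is not followed by an accepted construct, then replace those matches with &lt;. The helper name escapeStrayLessThan and its $allowedTags parameter are hypothetical, made up for this illustration; the accepted names (item, image, ?xml, !DOCTYPE, !ENTITY) come from the sample document above.

```php
<?php
// Hypothetical helper (not part of the original code): escape every "<"
// that does not open one of the accepted constructs.
function escapeStrayLessThan(string $xml, array $allowedTags): string
{
    // Quote each tag name so it is safe to embed in the pattern.
    $names = implode('|', array_map(function ($tag) {
        return preg_quote($tag, '~');
    }, $allowedTags));

    // Match "<" only when it is NOT followed by:
    //   - an optional "/" and an accepted tag name that ends at
    //     whitespace, "/" or ">" (so "<items>" is not taken for "<item>"),
    //   - "?xml", "!DOCTYPE" or "!ENTITY" followed by whitespace.
    $pattern = '~<(?!/?(?:' . $names . ')(?=[\s/>])|\?xml\s|!DOCTYPE\s|!ENTITY\s)~';

    return preg_replace($pattern, '&lt;', $xml);
}

// Small sample modelled on the document above.
$sample = "<item>\ntext<text-1\n<bad stuff>\n<image title=\"t\"/>\n</item>";
echo escapeStrayLessThan($sample, ['item', 'image']);
```

With this sample, the < in "text<text-1" and in "<bad stuff>" becomes &lt;, while the item and image tags are left alone. The remaining > is not touched, since XML allows a bare > in text content.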
Thank you and have a nice day.
--
Francis Fillion, BAA SI
Broadcasting live from his linux box.
And the maintainer of http://www.windplanet.com
--
PHP General Mailing List (http://www.php.net/)