Hi!

>    if ($allowed_html) {
>       // cycle through the whitelisted sequences
>       foreach($allowed_html as $sequence) {

What is supposed to be in $allowed_html? If those are simple fixed
strings and such, why can't you just do preg_split with
PREG_SPLIT_DELIM_CAPTURE and encode every other element of the result, or
PREG_SPLIT_OFFSET_CAPTURE if you need something more interesting?
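
To make the preg_split suggestion concrete, here is a minimal sketch. It
assumes $allowed_html holds literal strings only (the sample whitelist
below is made up), and encode_except_allowed() is a hypothetical helper
name, not an existing PHP function.

```php
<?php
// Sketch only: assumes every entry of $allowed_html is a fixed literal string.
function encode_except_allowed($text, array $allowed_html)
{
    if (!$allowed_html) {
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }
    // Try longer sequences first so a short entry cannot shadow a longer one.
    usort($allowed_html, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    // Quote each literal and join them into a single capturing alternation.
    $quoted = array_map(function ($s) {
        return preg_quote($s, '/');
    }, $allowed_html);
    $pattern = '/(' . implode('|', $quoted) . ')/';

    // With one capture group and DELIM_CAPTURE, the result alternates:
    // even indexes are ordinary text, odd indexes are whitelisted sequences.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
        }
    }
    return implode('', $parts);
}

// Hypothetical whitelist, for illustration only.
echo encode_except_allowed('<b>bold</b> <script>x</script>', ['<b>', '</b>']), "\n";
// prints: <b>bold</b> &lt;script&gt;x&lt;/script&gt;
```

This only works while the whitelist entries are exact strings; as soon as
attributes or arbitrary whitespace inside tags are allowed, the pattern
stops being trustworthy.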

I would seriously advise though against trying to do HTML parsing with
regexps unless they are very simple, since browsers will accept a lot of
broken HTML and will happily run scripts in it, etc.

> Bridging the gap between strip_tags and htmlspecialchars seems like a
> reasonable consideration for PHP's core. While I do use HTMLPurifier

I think that with the level of complexity needed to cover anything but
the most primitive cases, you need a full-blown HTML/XML parser there.
We do have those, so why not use one of them instead of reinventing
it, if that's what you need?
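
As a rough illustration of the parser route, the following sketch uses
DOMDocument from the bundled DOM extension. filter_html() and
render_node() are hypothetical helper names, and the policy shown (drop
script/style, strip all attributes, unwrap every other non-whitelisted
tag) is just one assumed choice, not a complete sanitizer.

```php
<?php
// Sketch only: a whitelist filter built on the parser's tree, not on regexps.
function render_node(DOMNode $node, array $allowed_tags)
{
    if ($node instanceof DOMText) {
        // Text (including CDATA sections) is always re-encoded on output.
        return htmlspecialchars($node->nodeValue, ENT_QUOTES, 'UTF-8');
    }
    if (!($node instanceof DOMElement)) {
        return ''; // drop comments, processing instructions, etc.
    }
    $name = strtolower($node->nodeName);
    if ($name === 'script' || $name === 'style') {
        return ''; // drop these together with their contents
    }
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= render_node($child, $allowed_tags);
    }
    if (in_array($name, $allowed_tags, true)) {
        // Allowed tags are re-emitted without any attributes.
        // Void elements such as <br> would need separate handling.
        return "<$name>$inner</$name>";
    }
    return $inner; // unwrap everything else, keeping the encoded text
}

function filter_html($html, array $allowed_tags)
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate broken markup
    // The meta prefix is a common trick to make libxml read the input as UTF-8.
    $doc->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return render_node($doc->documentElement, $allowed_tags);
}

echo filter_html('<b onclick="x()">hi</b><script>alert(1)</script>', ['b', 'i']), "\n";
```

The point of the design is that the output is rebuilt from the parsed
tree, so whatever the parser made of the broken input is what gets
encoded, rather than what a pattern guessed it was.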
-- 
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
