It's important to escape output according to context. PHP provides functions such as htmlspecialchars() to escape output when the context is HTML. However, one often wants to allow some subset of HTML through without escaping (e.g., <br />, <b></b>, etc.).
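To make the problem concrete, here is a small illustration (the sample input string is made up for this example) of how htmlspecialchars() escapes every tag, including harmless ones one might prefer to keep:

```php
<?php
// Sample input (an assumption for illustration): harmless formatting tags
// mixed with markup that must be neutralized.
$input = 'Hello <b>world</b><br /><script>alert(1)</script>';

// htmlspecialchars() makes no distinction between tags: all are escaped.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// Hello &lt;b&gt;world&lt;/b&gt;&lt;br /&gt;&lt;script&gt;alert(1)&lt;/script&gt;
```

The <script> tag is safely neutralized, but so are <b> and <br />.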
Functions such as strip_tags() do allow whitelisting, but their use poses security risks because attributes on whitelisted tags survive. For example, strip_tags('<b onclick="alert(\'Oh no!\')">click me</b>', '<b>') returns the <b> tag with its onclick handler intact.

One can develop a more robust mechanism in userland that first escapes input using htmlspecialchars() and then unescapes whitelisted sequences. Because HTML tags vary with their attributes (e.g., optional classes, img src attributes, etc.), allowing a whitelist entry to be specified as a regex could also offer significant benefits (e.g., any entry starting and ending with '/' would be handled as a regex).

However, the common nature of this need, coupled with the performance benefits of implementing this internally, prompts my interest in two options:

- Add a fifth parameter to htmlspecialchars() that takes an array of whitelisted sequences. Even though this makes for a terribly long function call, one could easily wrap the call in a facade function.
- Add a new function called str_escape(), but this introduces potential BC issues.

There are of course other options (e.g., integrating this as an additional filter).

I've built an extension that, while focused on an old web framework of mine, contains a function that serves as a proof of concept for the functionality outlined above (see nephtali_str_escape_html):

https://github.com/AdamJonR/nephtali-php-ext/blob/master/nephtali.c

I've tossed out the idea on this list before, but it was only tangentially related to the discussion at the time. At this point, I'd really like to focus on this idea directly to see what approach might seem wisest (including doing nothing, if the frequency of use does not justify bringing the functionality into the core).

Thoughts?

Adam

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
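For reference, a minimal userland sketch of the escape-then-unescape idea described above. This is not the extension's implementation: the function name escape_html_with_whitelist() and its signature are hypothetical, and it assumes that regex entries are written against the already-escaped text (so &lt; and &quot; rather than < and ").

```php
<?php
/**
 * Hypothetical helper, for illustration only.
 * Escapes all of $input, then restores the sequences listed in $whitelist.
 * Entries that start and end with '/' are treated as regexes and are
 * matched against the ESCAPED text (an assumption of this sketch).
 */
function escape_html_with_whitelist(string $input, array $whitelist): string
{
    // Step 1: escape everything.
    $escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

    // Step 2: restore only what the whitelist permits.
    foreach ($whitelist as $entry) {
        $isRegex = strlen($entry) > 2
            && $entry[0] === '/'
            && substr($entry, -1) === '/';

        if ($isRegex) {
            // Unescape each portion of the escaped text that the regex matches.
            // If the regex fails (null result), keep the text as it was.
            $escaped = preg_replace_callback(
                $entry,
                function (array $m): string {
                    return htmlspecialchars_decode($m[0], ENT_QUOTES);
                },
                $escaped
            ) ?? $escaped;
        } else {
            // Literal entry: swap its escaped form back to the raw form.
            $needle  = htmlspecialchars($entry, ENT_QUOTES, 'UTF-8');
            $escaped = str_replace($needle, $entry, $escaped);
        }
    }

    return $escaped;
}

// The attribute-laden <b> stays escaped; the plain tags are restored.
echo escape_html_with_whitelist(
    '<b onclick="alert(\'Oh no!\')">click me</b><br /><b>bold</b>',
    ['<b>', '</b>', '<br />']
), PHP_EOL;

// A regex entry that permits <span> with a single simple class attribute.
echo escape_html_with_whitelist(
    '<span class="note">hi</span>',
    ['/&lt;span class=&quot;[a-z-]+&quot;&gt;/', '</span>']
), PHP_EOL;
```

Note that in the first call the closing </b> after "click me" is restored even though its opening tag stays escaped. This sketch makes no attempt to balance tags, which a real implementation might want to address.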