On Tue, Sep 18, 2012 at 7:30 AM, Pádraic Brady <padraic.br...@gmail.com> wrote:
> Hi all,
>
> I've written an RFC for PHP over at: https://wiki.php.net/rfc/escaper.
> The RFC is a proposal to implement a standardised means of escaping
> data which is being output into XML/HTML.
>
> https://wiki.php.net/rfc/escaper

Some Quick Thoughts

Multiparadigm PHP

I hope any implementation would embrace procedural coding paradigms AND OOP paradigms. I tend to code in a Functional Programming (FP) style, and I don't need or want objects to be the only interface.

Extension First

It seems wise to get this working and tested as an extension first, just as Rasmus and others suggested.

Ability To Pass Some HTML Through Without Escaping (Whitelisting)

Functions should allow whitelisting of elements when desired. For example, HTML escaping may be desired for all elements in a paragraph except for spans, br's, etc. I've built a quick extension that I use in my web framework that does this:

https://github.com/AdamJonR/nephtali-php-ext

string nephtali_str_escape_html(string str [, array whitelist [, string charset]])

The escaping works as outlined below:

1) Escape all HTML special characters in str.
2) Loop through the whitelist items.
3a) If the item begins and ends with '/', consider it a regex and replace the matches in the string with the original (htmlspecialchars-decoded) text. (This works because <, >, ", ', and & are not metacharacters in regexes.)
3b) Otherwise, handle the item as a standard string and replace the matches with the unescaped whitelist item text.

The idea is that, to be safe, everything should first be escaped. Then, only the items that match the whitelist (e.g., array('<p>','</p>','etc.')) are unescaped. The regex option is handy because you often have situations where the internal contents of the tag vary (e.g., id, class, href, etc.), and this allows you to pass these through unescaped.

Of note, I've not officially released the extension, as I'm still testing and developing it, but it serves as an example for ideas.

PHP Escaping-Specific Tags Could Be Considered

I wonder if PHP tags for escaping could be considered, as it seems that there's still a plurality of developers who use PHP itself as the templating language.

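As an aside on the whitelisting steps outlined above (escape everything, then restore only the whitelisted items), here is a minimal userland sketch in plain PHP. It is not the code from the nephtali extension: the function name escape_html_with_whitelist, the UTF-8 default charset, and the use of ENT_QUOTES are assumptions made for illustration, and it assumes PHP 7 or later. The regex branch escapes the pattern itself so that it can match against the already-escaped string, which is only one possible reading of step 3a.

```php
<?php
// Hypothetical helper for illustration; not part of PHP or of the nephtali extension.
function escape_html_with_whitelist(string $str, array $whitelist = [], string $charset = 'UTF-8'): string
{
    // Step 1: escape every HTML special character.
    $escaped = htmlspecialchars($str, ENT_QUOTES, $charset);

    // Step 2: loop through the whitelist items.
    foreach ($whitelist as $item) {
        // Rewrite the item into the form it takes inside the escaped string.
        $escapedItem = htmlspecialchars($item, ENT_QUOTES, $charset);

        if (strlen($item) > 2 && $item[0] === '/' && substr($item, -1) === '/') {
            // Step 3a: regex item. Restore each match to its decoded text.
            $result = preg_replace_callback(
                $escapedItem,
                function (array $m) {
                    return htmlspecialchars_decode($m[0], ENT_QUOTES);
                },
                $escaped
            );
            // preg_replace_callback() returns null on a bad pattern;
            // in that case the text simply stays escaped.
            if ($result !== null) {
                $escaped = $result;
            }
        } else {
            // Step 3b: plain string item. Swap the escaped form back to the original.
            $escaped = str_replace($escapedItem, $item, $escaped);
        }
    }

    return $escaped;
}

// prints: <p>Hi &lt;b&gt;there&lt;/b&gt;</p>
echo escape_html_with_whitelist('<p>Hi <b>there</b></p>', ['<p>', '</p>']), "\n";
```

One caveat with this reading of step 3a: escaping the pattern changes its meaning when the regex uses ", ', <, >, or & inside its own syntax (for example a character class such as [^"] or a lookbehind), so patterns in this sketch need to avoid those constructs. Regex items should also be kept tight, since anything they match is passed through unescaped.
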
For example, such tags might look like this:

// automatically echo'd and escaped for special html chars
<?php:html $obj->val ?>

// automatically echo'd and escaped for special html chars whilst letting through p's
<?php:html $obj->val, array('<p>','</p>') ?>

// automatically echo'd and escaped for special html chars whilst letting through p's and using a different encoding
<?php:html $obj->val, array('<p>','</p>'), $encoding = 'something' ?>

// automatically echo'd and escaped for special html chars, no whitelisting allowed
<?php:attr $obj->val ?>

// automatically echo'd and escaped for special url chars, no whitelisting allowed
<?php:url $obj->val ?>

Thanks,

Adam

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php