On Tue, Sep 18, 2012 at 7:30 AM, Pádraic Brady <padraic.br...@gmail.com> wrote:
> Hi all,
>
> I've written an RFC for PHP over at: https://wiki.php.net/rfc/escaper.
> The RFC is a proposal to implement a standardised means of escaping
> data which is being output into XML/HTML.
>
> https://wiki.php.net/rfc/escaper

Some Quick Thoughts


Multiparadigm PHP

I hope any implementation would embrace procedural coding paradigms
AND OOP paradigms. I tend to code using a Functional Programming (FP)
style, and I don't need/want objects to be the only interface.


Extension First

It seems wise to get this working and tested as an extension first,
just as Rasmus and others suggested.



Ability To Pass Some HTML Through Without Escaping (Whitelisting)

Functions should allow whitelisting of elements when desired. For
example, HTML escaping may be desired for everything in a paragraph
except spans, br's, etc.

I've built a quick extension that I use in my web framework that does this:
https://github.com/AdamJonR/nephtali-php-ext

string nephtali_str_escape_html(string str [, array whitelist [,
string charset]])

The escaping works as outlined below:

1) Escape all HTML special characters in str.
2) Loop through the whitelist items:
   a) If the item begins and ends with '/', treat it as a regex and
      replace each match in the string with the original
      (htmlspecialchars-decoded) text. This works because <, >, ", ',
      and & are not metacharacters in regexes.
   b) Otherwise, treat the item as a plain string and replace each
      match with the unescaped whitelist item text.

The idea is that, to be safe, everything should be escaped first.
Then only the items that match the whitelist are unescaped (e.g.,
array('<p>','</p>','etc.')). The regex option is handy because the
attributes inside a tag often vary (id, class, href, etc.), and a
regex lets such tags pass through unescaped.
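
As a rough illustration of the escape-first, unescape-only-the-whitelist
approach, here is a minimal userland sketch in PHP. It is not the
extension's source. The name sketch_escape_html is hypothetical, and the
sketch assumes each whitelist item is itself escaped before being matched
against the escaped text, which is only one reading of the steps above.

```php
<?php
// Hypothetical userland sketch; NOT the actual extension code.
function sketch_escape_html(string $str, array $whitelist = [], string $charset = 'UTF-8'): string
{
    // Escape every HTML special character first.
    $escaped = htmlspecialchars($str, ENT_QUOTES, $charset);

    foreach ($whitelist as $item) {
        // Assumption: escape the item the same way so it can be found
        // in the already-escaped text.
        $escapedItem = htmlspecialchars($item, ENT_QUOTES, $charset);

        $isRegex = strlen($item) >= 2 && $item[0] === '/' && substr($item, -1) === '/';
        if ($isRegex) {
            // Regex item: decode each match back to its original text.
            $result = preg_replace_callback(
                $escapedItem,
                function (array $m): string {
                    return htmlspecialchars_decode($m[0], ENT_QUOTES);
                },
                $escaped
            );
            // preg_replace_callback() returns null on error; keep the
            // fully escaped text in that case.
            $escaped = $result ?? $escaped;
        } else {
            // Plain string item: swap the escaped form for the original.
            $escaped = str_replace($escapedItem, $item, $escaped);
        }
    }

    return $escaped;
}

echo sketch_escape_html(
    '<p>Hi <span class="note">there</span> <script>x</script></p>',
    ['<p>', '</p>', '</span>', '/<span class="[a-z]+">/']
);
// prints: <p>Hi <span class="note">there</span> &lt;script&gt;x&lt;/script&gt;</p>
```

One caveat with this reading: a character class that contains a special
character changes meaning once the pattern is escaped. For example,
[^"] becomes [^&quot;], which excludes the letters q, u, o and t, so
patterns like that would need different handling.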

Of note, I've not officially released the extension, as I'm still
testing/developing it, but it serves as an example for ideas.


PHP Escaping-Specific Tags Could Be Considered

I wonder if PHP tags for escaping could be considered, as it seems
that there's still a plurality of developers that use PHP itself as
the templating language. For example:

// automatically echo'd and escaped for special html chars
 <?php:html $obj->val ?>

// automatically echo'd and escaped for special html chars whilst
// letting through p's
 <?php:html $obj->val, array('<p>','</p>') ?>

// automatically echo'd and escaped for special html chars whilst
// letting through p's and using different encoding
 <?php:html $obj->val, array('<p>','</p>'), $encoding = 'something' ?>

// automatically echo'd and escaped for special html chars, no
// whitelisting allowed
 <?php:attr $obj->val ?>

// automatically echo'd and escaped for special url chars, no
// whitelisting allowed
 <?php:url $obj->val ?>
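
For comparison, here is a small sketch of how the html and url cases
could be approximated in existing PHP with plain helper functions. The
helper names e_html and e_url are hypothetical, and mapping the proposed
tags onto htmlspecialchars() and rawurlencode() is my assumption, not
something the RFC specifies.

```php
<?php
// Hypothetical helpers; the names are made up for illustration only.

// Rough stand-in for the proposed html tag: escape HTML special chars.
function e_html(string $value, string $encoding = 'UTF-8'): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, $encoding);
}

// Rough stand-in for the proposed url tag: percent-encode a URL component.
function e_url(string $value): string
{
    return rawurlencode($value);
}

echo e_html('<b>"x" & \'y\'</b>'), "\n";
// prints: &lt;b&gt;&quot;x&quot; &amp; &#039;y&#039;&lt;/b&gt;

echo e_url('a b/c?d=e&f'), "\n";
// prints: a%20b%2Fc%3Fd%3De%26f
```

In a template these would be written as <?= e_html($obj->val) ?>, which
is the verbosity the dedicated tags are meant to remove.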

Thanks,

Adam

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
