Wouldn't this __auto_escape setting effectively break libraries that depend on it being on or off?
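To make the concern concrete, here is a minimal sketch. Stock PHP has no __auto_escape setting, so its effect is only simulated here with htmlspecialchars(); render_badge() and simulated_auto_escape_echo() are hypothetical names, not part of the proposal or of any real library.

```php
<?php
// Hypothetical library helper: it escapes its own input and returns finished HTML.
function render_badge(string $label): string {
    ob_start();
    ?><span class="badge"><?= htmlspecialchars($label, ENT_QUOTES, 'UTF-8') ?></span><?php
    return ob_get_clean();
}

// Stand-in for what the proposed setting would do to an echoed string.
// Assumption: the real patch hooks echo/print/<?= inside the engine;
// this function only imitates the visible result.
function simulated_auto_escape_echo(string $s): string {
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

$html = render_badge('Tom & Jerry');

// Setting off: the library's markup reaches the browser unchanged.
echo $html, "\n";

// Setting on (simulated): the markup itself is escaped, so the page shows
// literal tags and the already-escaped text is escaped a second time.
echo simulated_auto_escape_echo($html), "\n";
```

A library written for one mode produces wrong output in the other, and it has no way to know which mode the application has chosen.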
People often write code to generate HTML like this:

    ob_start(); ?>
    <div>some HTML <?= escape_html($other_text) ?></div>
    <div>more HTML <?= $other_html ?></div>
    <!-- etc -->
    <?php
    $html = ob_get_clean();

If that code is in a library, it can't be used with this setting enabled: as I read the proposal, $other_html would be escaped, and the output of escape_html() would be escaped a second time. That could become a real pain point for the whole PHP ecosystem.

On Mon, Mar 21, 2016 at 3:53 PM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:

> Hi Daniel,
>
> On Mon, Mar 21, 2016 at 7:11 AM, Daniel Beardsley <dan...@ifixit.com> wrote:
> > I'd like to submit an RFC (with a pull request) for adding auto-escaping to
> > the php language.
> >
> > We at iFixit.com have used PHP for nearly a decade to run our website.
> > Several years ago, we abandoned the Smarty templating engine and used php
> > files directly as templates. This worked, but was a bit unsafe and made it
> > too easy to leave user submitted content unescaped. Several years ago we
> > switched to using a modified version of PHP that included auto-escaping and
> > it has been working great. In the process of preparing to use php 7, I've
> > re-implemented the changes against the master branch.
> >
> > I'd like to gauge interest in this before I formally submit an RFC. Here's
> > a somewhat better description that I've attached to a pull request on our
> > internal fork of php.
> >
> > Pull request on internal fork: https://github.com/iFixit/php-src/pull/14
> >
> > Background
> > ==========
> > PHP doesn't have any mechanism to inject logic between templating
> > and final output. There is no way to filter or alter the content
> > that comes from code in templates like: <?= $someVar ?>
> >
> > To use php as a robust templating language, we must inject *some*
> > logic between templates and their output. We have chosen to make
> > all <?=, echo, and print statements subject to an optional
> > trip through the internal function php_escape_html_entities.
> >
> > The functionality can be toggled with `ini_set('__auto_escape')`
> > and configured with `__auto_escape_flags` and
> > `__auto_escape_exempt_class` (see commit
> > https://github.com/iFixit/php-src/commit/2dae5d16436ce37856f6e00ca2a1b3009bb1f7ed
> > for info about the class name based auto-escaping exemption).
> >
> > Methodology
> > ===========
> > T_ECHO (echo, <?=) and T_PRINT (print) now both emit a
> > ZEND_AST_ECHO_ESCAPE node in the syntax tree.
> >
> > That's compiled to a function which emits a ZEND_ECHO_ESCAPE op code.
> >
> > The op code interpretation is a dupe of ZEND_ECHO except with some
> > if() statements that switch the underlying function from `zend_write`
> > to `zend_write_escape` based on the ini settings.
> >
> > zend_write_escape is a new function pointer that points to
> > php_escape_write.
> >
> > php_escape_write is a new function that passes its string argument
> > through php_escape_html_entities() (with __auto_escape_flags) before
> > calling the underlying php_output_write.
> >
> > Use
> > ===
> > This functionality allows us to safely use php straight as a
> > templating language with no template compilation step (as many
> > other templating libraries have).
> >
> > See the included tests for more usage information.
> >
> > Exempt Class
> > ============
> > It is useful to allow some utility functions and helpers to produce
> > html and have it passed straight through in the template (without
> > being double-encoded). We accomplish this by *tagging* strings
> > as being HTML.
> >
> >     class HtmlString implements JsonSerializable {
> >        protected $html = '';
> >
> >        public function __construct($html) {
> >           $this->html = $html;
> >        }
> >
> >        public function __toString() {
> >           return (string)$this->html;
> >        }
> >
> >        public function jsonSerialize() {
> >           return $this->html;
> >        }
> >     }
> >
> > The auto-escaping system can be configured with:
> >     __auto_escape_exempt_class="HtmlString"
> >
> > Which allows instances of `HtmlString` to pass straight through a
> > template without being modified (skipping the html_entities call).
> > Helper functions can now return html safely and consumers don't have
> > to care if it is HTML or not because the auto-escaping system knows
> > what to do.
> >
> > Thanks for your consideration!
> > Daniel Beardsley
>
> The issue is that escaping is always done for a specific context.
>
> I understand your proposal is focused on HTML escaping. However,
> setting names like
>   __auto_escape_exempt_class
> are not a good choice. It has to be
>   __auto_html_escape_exempt_class
> at least, because it is for HTML escaping.
>
> In addition, HTML consists of multiple contexts:
>
>  - HTML context, which requires HTML escaping
>  - URI context, which requires URI escaping
>  - JavaScript context (embedded JavaScript strings, for example), which
>    requires JavaScript string escaping, etc.
>    e.g. http://blog.ohgaki.net/javascript-string-escape (Sorry, it's
>    my blog and it is written in Japanese. You may try a translation
>    service, or you should at least be able to understand the PHP code.)
>  - CSS context, which requires CSS escaping
>    e.g. https://developer.mozilla.org/ja/docs/Web/API/CSS/escape
>  - And so on
>
> Dealing with the HTML context only would be problematic, even if it
> works for many cases.
>
> Escaping must be done according to context, and multiple contexts may
> apply at once. Escaping for the HTML context only would not work well.
> Applying the proper escapes to variables in HTML is a very complex task.
>
> Regards,
>
> --
> Yasuo Ohgaki
> yohg...@ohgaki.net
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php