Wouldn't this __auto_escape setting effectively break libraries that depend
on it being on or off?

People often write code to generate HTML like this:

<?php
ob_start();
?>

<div>some HTML <?= escape_html($other_text) ?></div>

<div>more HTML <?= $other_html ?></div>

<!-- etc -->

<?php
$html = ob_get_clean();


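If that code is in a library, it can't be used with this setting enabled:
the first <div> would have its text escaped twice, and the markup in the
second would be escaped when it shouldn't be. That could become a real pain
point for the whole PHP ecosystem.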
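
To make that concrete, here is a rough sketch that simulates the setting in
plain PHP. simulated_auto_echo() is a made-up stand-in for what echo would do
with __auto_escape enabled (the real patch works at the opcode level), and I'm
assuming the flags behave like a normal htmlspecialchars() call:

```php
<?php
// Hypothetical stand-in for echo with __auto_escape enabled:
// returns the string that would actually be written to the output.
function simulated_auto_echo(string $value): string {
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$other_text = 'Tom & Jerry';
$other_html = '<b>bold</b>';

// Library code written for auto-escape OFF escapes by hand...
$manual = htmlspecialchars($other_text, ENT_QUOTES, 'UTF-8');

// ...so with auto-escape ON the same text is escaped a second time,
$double = simulated_auto_echo($manual);

// and markup the library meant to emit as-is is escaped too.
$broken = simulated_auto_echo($other_html);
```

Here $manual is "Tom &amp;amp; Jerry", $double is "Tom &amp;amp;amp; Jerry", and
$broken is "&amp;lt;b&amp;gt;bold&amp;lt;/b&amp;gt;" instead of the intended markup.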


On Mon, Mar 21, 2016 at 3:53 PM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:

> Hi Daniel,
>
> On Mon, Mar 21, 2016 at 7:11 AM, Daniel Beardsley <dan...@ifixit.com>
> wrote:
> > I'd like to submit an RFC (with a pull request) for adding auto-escaping to
> > the php language.
> >
> > We at iFixit.com have used PHP for nearly a decade to run our website.
> > Several years ago, we abandoned the Smarty templating engine and used php
> > files directly as templates. This worked, but was a bit unsafe and made it
> > too easy to leave user submitted content unescaped. Several years ago we
> > switched to using a modified version of PHP that included auto-escaping and
> > it has been working great. In the process of preparing to use php 7, I've
> > re-implemented the changes against the master branch.
> >
> > I'd like to gauge interest in this before I formally submit an RFC. Here's
> > a somewhat better description that I've attached to a pull request on our
> > internal fork of php.
> >
> > Pull request on internal fork: https://github.com/iFixit/php-src/pull/14
> >
> > Background
> > ==========
> > PHP doesn't have any mechanism to inject logic between templating
> > and final output. There is no way to filter or alter the content
> > that comes from code in templates like: <?= $someVar ?>
> >
> > To use php as a robust templating language, we must inject *some*
> > logic between templates and their output. We have chosen to make
> > all <?=, echo, and print statements subject to an optional
> > trip through the internal function php_escape_html_entities.
> >
> > The functionality can be toggled with `ini_set('__auto_escape')`
> > and configured with `__auto_escape_flags` and
> > `__auto_escape_exempt_class` (see commit
> > https://github.com/iFixit/php-src/commit/2dae5d16436ce37856f6e00ca2a1b3009bb1f7ed
> > for info about the class name based auto-escaping exemption).
> >
> > Methodology
> > ===========
> > T_ECHO (echo, <?=) and T_PRINT (print) now both emit a
> > ZEND_AST_ECHO_ESCAPE node in the syntax tree.
> >
> > That's compiled to a function which emits a ZEND_ECHO_ESCAPE op code.
> >
> > The op code interpretation is a dupe of ZEND_ECHO except with some
> > if() statements that switch the underlying function from `zend_write`
> > to `zend_write_escape` based on the ini settings.
> >
> > zend_write_escape is a new function pointer that points to
> > php_escape_write.
> >
> > php_escape_write is a new function that passes its string argument
> > through php_escape_html_entities() (with __auto_escape_flags) before
> > calling the underlying php_output_write.
> >
> > Use
> > ===
> > This functionality allows us to safely use php straight as a
> > templating language with no template compilation step (as many
> > other templating libraries have).
> >
> > See the included tests for more usage information.
> >
> > Exempt Class
> > ============
> > It is useful to allow some utility functions and helpers to produce
> > html and have it passed straight through in the template (without
> > being double-encoded). We accomplish this by *tagging* strings
> > as being HTML.
> >
> >     class HtmlString implements JsonSerializable {
> >        protected $html = '';
> >
> >        public function __construct($html) {
> >           $this->html = $html;
> >        }
> >
> >        public function __toString() {
> >           return (string)$this->html;
> >        }
> >
> >        public function jsonSerialize() {
> >           return $this->html;
> >        }
> >     }
> >
> > The auto-escaping system can be configured with an exempt class:
> > __auto_escape_exempt_class="HtmlString"
> >
> > Which allows instances of `HtmlString` to pass straight through a
> > template without being modified (skipping the html_entities call).
> > Helper functions can now return html safely and consumers don't have
> > to care if it is HTML or not because the auto-escaping system knows
> > what to do.
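
A rough userland sketch of how I read the exemption described above, assuming
the escape step simply checks the value's class before escaping.
auto_echo_with_exemption() is a hypothetical name standing in for the patched
echo, not the actual implementation, and the class is a trimmed copy of the
quoted HtmlString:

```php
<?php
// Trimmed copy of the HtmlString marker class from the proposal.
class HtmlString {
    protected $html = '';

    public function __construct($html) {
        $this->html = $html;
    }

    public function __toString() {
        return (string)$this->html;
    }
}

// Hypothetical stand-in for echo with __auto_escape on and
// __auto_escape_exempt_class="HtmlString"; returns what would be written.
function auto_echo_with_exemption($value, string $exemptClass = 'HtmlString'): string {
    if ($value instanceof $exemptClass) {
        // Tagged as HTML: pass through untouched.
        return (string)$value;
    }
    // Everything else is treated as plain text and escaped.
    return htmlspecialchars((string)$value, ENT_QUOTES, 'UTF-8');
}

$plain  = auto_echo_with_exemption('<b>hi</b>');
$tagged = auto_echo_with_exemption(new HtmlString('<b>hi</b>'));
```

The plain string comes out escaped, while the HtmlString instance comes out
unchanged. Note that this only helps code that already wraps its HTML in the
exempt class.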
> >
> > Thanks for your consideration!
> > Daniel Beardsley
>
> The issue is that escaping is always done for a specific context.
>
> I understand your proposal is focused on HTML escaping. However,
> setting names like
> __auto_escape_exempt_class
> are not a good choice. It has to be
> __auto_html_escape_exempt_class
> at least, because it is for HTML escaping.
>
> In addition, HTML consists of multiple contexts:
>
>  - HTML context, which requires HTML escaping
>  - URI context, which requires URI escaping
>  - JavaScript context (embedded JavaScript strings, for example), which
>    requires JavaScript string escaping, etc.
>    e.g. http://blog.ohgaki.net/javascript-string-escape (Sorry, it's
>    my blog and written in Japanese. You may try a translation service,
>    or you should at least be able to understand the PHP code.)
>  - CSS context, which requires CSS escaping.
>    e.g. https://developer.mozilla.org/ja/docs/Web/API/CSS/escape
>  - And so on
>
> Dealing with the HTML context only would be problematic, even if it works
> for many cases.
>
> Escaping must be done depending on context, and multiple contexts may apply
> at once. HTML-context-only escaping would not work well. Applying proper
> escapes to variables in HTML is a very complex task.
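
For a concrete picture of the context problem, here is a small sketch using
only stock PHP functions. The variable names are illustrative; the point is
that the same value needs a different encoder depending on where it lands, and
a single HTML-only auto-escape can't choose between them:

```php
<?php
$value = "O'Reilly & Sons/?x=1";

// HTML text or attribute context.
$html = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

// URI component context, e.g. a query parameter value.
$uri = rawurlencode($value);

// JavaScript string literal context inside a <script> block.
$js = json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

// Nested contexts: a JavaScript string inside an HTML attribute needs
// JavaScript encoding first, then HTML escaping on top.
$attr = htmlspecialchars(json_encode($value), ENT_QUOTES, 'UTF-8');
```

All four results differ, so escaping for the HTML context alone would be wrong
for at least three of them.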
>
> Regards,
>
> --
> Yasuo Ohgaki
> yohg...@ohgaki.net
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>
