Hi Daniel,

On Mon, Mar 21, 2016 at 7:11 AM, Daniel Beardsley <dan...@ifixit.com> wrote:
> I'd like to submit an RFC (with a pull request) for adding auto-escaping to
> the php language.
>
> We at iFixit.com have used PHP for nearly a decade to run our website.
> Several years ago, we abandoned the Smarty templating engine and used php
> files directly as templates. This worked, but was a bit unsafe and made it
> too easy to leave user submitted content unescaped. Several years ago we
> switched to using a modified version of PHP that included auto-escaping and
> it has been working great. In the process of preparing to use php 7, I've
> re-implemented the changes against the master branch.
>
> I'd like to gauge interest in this before I formally submit an RFC. Here's
> a somewhat better description that I've attached to a pull request on our
> internal fork of php.
>
> Pull request on internal fork: https://github.com/iFixit/php-src/pull/14
>
> Background
> ==========
> PHP doesn't have any mechanism to inject logic between templating
> and final output. There is no way to filter or alter the content
> that comes from code in templates like: <?= $someVar ?>
>
> To use php as a robust templating language, we must inject *some*
> logic between templates and their output. We have chosen to make
> all <?=, echo, and print statements subject to an optional
> trip through the internal function php_escape_html_entities.
>
> The functionality can be toggled with `ini_set('__auto_escape')`
> and configured with `__auto_escape_flags` and
> `__auto_escape_exempt_class` (see commit
> https://github.com/iFixit/php-src/commit/2dae5d16436ce37856f6e00ca2a1b3009bb1f7ed
> for info about the class name based auto-escaping exemption).
>
> Methodology
> ===========
> T_ECHO (echo, <?=) and T_PRINT (print) now both emit a
> ZEND_AST_ECHO_ESCAPE node in the syntax tree.
>
> That's compiled to a function which emits a ZEND_ECHO_ESCAPE op code.
>
> The op code interpretation is a dupe of ZEND_ECHO except with some
> if() statements that switch the underlying function from `zend_write`
> to `zend_write_escape` based on the ini settings.
>
> zend_write_escape is a new function pointer that points to
> php_escape_write.
>
> php_escape_write is a new function that passes its string argument
> through php_escape_html_entities() (with __auto_escape_flags) before
> calling the underlying php_output_write.
>
> Use
> ===
> This functionality allows us to safely use php straight as a
> templating language with no template compilation step (as many
> other templating libraries have).
>
> See the included tests for more usage information.
>
> Exempt Class
> ============
> It is useful to allow some utility functions and helpers to produce
> html and have it passed straight through in the template (without
> being double-encoded). We accomplish this by *tagging* strings
> as being HTML.
>
>     class HtmlString implements JsonSerializable {
>        protected $html = '';
>
>        public function __construct($html) {
>           $this->html = $html;
>        }
>
>        public function __toString() {
>           return (string)$this->html;
>        }
>
>        public function jsonSerialize() {
>           return $this->html;
>        }
>     }
>
> The auto-escaping system can be configured with:
> __auto_escape_exempt_class="HtmlString"
>
> Which allows instances of `HtmlString` to pass straight through a
> template without being modified (skipping the html_entities call).
> Helper functions can now return html safely and consumers don't have
> to care if it is HTML or not because the auto-escaping system knows
> what to do.
>
> Thanks for your consideration!
> Daniel Beardsley

The issue is that escaping must always be done for a specific context.

I understand that your proposal is focused on HTML escaping. However,
a setting name like
__auto_escape_exempt_class
is not a good choice. It should at least be
__auto_html_escape_exempt_class
because it applies to HTML escaping only.

In addition, HTML consists of multiple contexts:

 - HTML context that requires HTML escape
 - URI context that requires URI escape
 - JavaScript context (embedded JavaScript strings, for example) that
   requires JavaScript string escape, etc.
   e.g. http://blog.ohgaki.net/javascript-string-escape (Sorry, it's
   my blog and it is written in Japanese. You may try a translation
   service, or you should at least be able to understand the PHP
   code.)
 - CSS context that requires CSS escape.
   e.g. https://developer.mozilla.org/ja/docs/Web/API/CSS/escape
 - And so on
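
As a rough sketch of what the list above means in PHP, each context
maps to a different standard function. The helper names
(escape_html, escape_uri_component, escape_js_string) are made up for
this example and are not part of the proposal, and JSON_THROW_ON_ERROR
assumes PHP 7.3 or later. PHP has no built-in CSS escape function, so
that context is left out here.

```php
<?php
// Hypothetical helper names; each one wraps a standard PHP function.

// HTML context: element text and quoted attribute values.
function escape_html(string $s): string {
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// URI context: a single URI component (path segment or query value).
function escape_uri_component(string $s): string {
    return rawurlencode($s);
}

// JavaScript context: a string literal placed inside a <script> block.
// The JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX sequences.
function escape_js_string(string $s): string {
    return json_encode(
        $s,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

$input = '</script><a href="x">Tom & Jerry\'s</a>';

// The same input needs a different encoding in each context.
echo escape_html($input), "\n";
echo escape_uri_component($input), "\n";
echo escape_js_string($input), "\n";
```

A single global auto-escape can only pick one of these, which is why
the output context matters.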

Dealing with the HTML context only would be problematic, even if it works in many cases.

Escaping must be done according to the context, and multiple contexts
may apply at once. Escaping for the HTML context only would not work
well. Applying the proper escapes to variables in HTML is a very
complex task.
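
To illustrate, here is a minimal sketch of two cases where HTML
escaping alone is not enough. The h() helper, the greet() JavaScript
function and the sample values are hypothetical and only serve the
example.

```php
<?php
// Hypothetical shorthand for HTML escaping.
function h(string $s): string {
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// 1. URI context: HTML escaping leaves a javascript: URL untouched,
//    so the scheme has to be validated separately.
$url = 'javascript:alert(1)';
echo '<a href="', h($url), '">profile</a>', "\n";

// 2. Nested contexts: a JavaScript string inside an HTML attribute.
$name = "x');alert(1);//";

// HTML escaping only: the browser decodes &#039; back to ' before it
// runs the handler, so the string literal is still broken out of.
echo '<button onclick="greet(\'', h($name), '\')">Hi</button>', "\n";

// JavaScript string escape first, then HTML escape for the attribute.
$js = json_encode($name, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
echo '<button onclick="greet(', h($js), ')">Hi</button>', "\n";
```

In the second case the order matters: the innermost context
(JavaScript string) is escaped first and the outermost one (HTML
attribute) last.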

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
