I'd like to submit an RFC (with a pull request) for adding auto-escaping to
the PHP language.

We at iFixit.com have used PHP for nearly a decade to run our website.
Several years ago, we abandoned the Smarty templating engine and started
using PHP files directly as templates. This worked, but it was unsafe: it
was too easy to leave user-submitted content unescaped. We later switched
to a modified version of PHP that includes auto-escaping, and it has been
working great. In the process of preparing to use PHP 7, I've
re-implemented the changes against the master branch.

I'd like to gauge interest in this before I formally submit an RFC. Here's
a fuller description, taken from a pull request on our internal fork of
PHP.

Pull request on internal fork: https://github.com/iFixit/php-src/pull/14

Background
==========
PHP doesn't have any mechanism to inject logic between templating
and final output. There is no way to filter or alter the content
that comes from code in templates like: <?= $someVar ?>

To use PHP as a robust templating language, we must inject *some*
logic between templates and their output. We have chosen to make
all <?=, echo, and print statements subject to an optional
trip through the internal function php_escape_html_entities.
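
For context, here is a minimal sketch of what template authors have to do
by hand on stock PHP today. The variable contents are made up for
illustration, and `htmlspecialchars()` stands in for whatever escaping a
given codebase uses.

```php
<?php
// Untrusted input, e.g. from a form submission (illustrative value).
$someVar = '<script>alert("xss")</script>';

// Stock PHP writes the value verbatim unless every output site escapes it.
$unsafe = $someVar;
$safe   = htmlspecialchars($someVar, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

Forgetting the call at a single output site is enough to ship unescaped
content, which is the failure mode auto-escaping is meant to remove.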

The functionality can be toggled with the `__auto_escape` ini setting
(for example via `ini_set()`) and configured with `__auto_escape_flags`
and `__auto_escape_exempt_class` (see commit
https://github.com/iFixit/php-src/commit/2dae5d16436ce37856f6e00ca2a1b3009bb1f7ed
for info about the class-name-based auto-escaping exemption).
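
As a rough sketch, enabling the feature at runtime might look like the
following. The setting names come from the description above; the values
are assumptions (in particular that `__auto_escape_flags` takes an `ENT_*`
bitmask and that all three settings can be changed at runtime).

```php
<?php
// Only meaningful on the patched build. Stock PHP does not know these
// settings, so ini_set() simply returns false for them.
ini_set('__auto_escape', '1');

// Assumed: the flags are an ENT_* bitmask like the one htmlspecialchars() takes.
ini_set('__auto_escape_flags', (string) (ENT_QUOTES | ENT_SUBSTITUTE));

// Instances of this class are written without escaping.
ini_set('__auto_escape_exempt_class', 'HtmlString');
```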

Methodology
===========
T_ECHO (echo, <?=) and T_PRINT (print) now both emit a
ZEND_AST_ECHO_ESCAPE node in the syntax tree.

That node is compiled into a ZEND_ECHO_ESCAPE opcode.

The opcode handler is a copy of ZEND_ECHO, except with some
if() statements that switch the underlying function from `zend_write`
to `zend_write_escape` based on the ini settings.

zend_write_escape is a new function pointer that points to
php_escape_write.

php_escape_write is a new function that passes its string argument
through php_escape_html_entities() (with __auto_escape_flags) before
calling the underlying php_output_write.
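
The write path above can be approximated in userland. This is only a
sketch of the control flow, not the C implementation: `escape_write()` and
its parameters are hypothetical names, and `htmlspecialchars()` stands in
for php_escape_html_entities().

```php
<?php
// Hypothetical userland stand-in for the patched output path.
// $autoEscape plays the role of the __auto_escape ini setting and
// $flags the role of __auto_escape_flags.
function escape_write(string $str, bool $autoEscape, int $flags = ENT_QUOTES): string
{
    // Mirrors the opcode's branch: escaped write when enabled,
    // plain write otherwise.
    return $autoEscape ? htmlspecialchars($str, $flags, 'UTF-8') : $str;
}

echo escape_write('<b>Tom & Jerry</b>', true), "\n";
// prints: &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
echo escape_write('<b>Tom & Jerry</b>', false), "\n";
// prints: <b>Tom & Jerry</b>
```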

Use
===
This functionality allows us to safely use PHP directly as a
templating language with no template compilation step (which many
other templating libraries require).

See the included tests for more usage information.

Exempt Class
============
It is useful to allow some utility functions and helpers to produce
HTML and have it passed straight through in the template (without
being double-encoded). We accomplish this by *tagging* strings
as being HTML.

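    // Marks a string as already-safe HTML so auto-escaping leaves it alone.
    class HtmlString implements JsonSerializable {
       protected $html = '';

       public function __construct($html) {
          $this->html = $html;
       }

       // Used when the object is echoed in a template.
       public function __toString() {
          return (string)$this->html;
       }

       // Lets json_encode() emit the raw HTML string.
       public function jsonSerialize() {
          return $this->html;
       }
    }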

The auto-escaping system can be configured with:
__auto_escape_exempt_class="HtmlString"

This allows instances of `HtmlString` to pass straight through a
template without being modified (skipping the HTML-entity escaping).
Helper functions can now return HTML safely, and consumers don't have
to care whether a value is HTML or not because the auto-escaping system
knows what to do.
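
Here is a userland sketch of how the exemption behaves from a helper's
point of view. `render()` and `bold()` are hypothetical names used only for
illustration; `render()` imitates what the patched `echo` would do, and the
class is a compact copy of `HtmlString` so the example runs on its own.

```php
<?php
// Compact copy of HtmlString so this sketch is self-contained.
class HtmlString {
    private $html;
    public function __construct($html) { $this->html = $html; }
    public function __toString() { return (string) $this->html; }
}

// Hypothetical stand-in for the patched echo: exempt instances pass
// through untouched, everything else is escaped.
function render($value, string $exemptClass = 'HtmlString'): string
{
    if ($value instanceof $exemptClass) {
        return (string) $value;
    }
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: escapes its input, then tags the result as HTML.
function bold(string $text): HtmlString
{
    return new HtmlString('<b>' . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</b>');
}

echo render(bold('a < b')), "\n"; // prints: <b>a &lt; b</b>
echo render('a < b'), "\n";       // prints: a &lt; b
```

The helper stays responsible for escaping whatever it interpolates; the
tag only tells the output layer not to escape the result a second time.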

Thanks for your consideration!
Daniel Beardsley
