> PHP today is a programming language, and applications and libraries can be and are written in that programming language.
PHP has <?= ?> and <?php ?> tags, and everything outside these tags is treated as HTML. We need either to remove these tags and use PHP as a programming language only, or to improve how these tags are used, because <?= ?> by itself, without additional handling, causes XSS vulnerabilities.

> Trying to build default functionality that would compete with a modern templating engine like Twig would be a lot of effort, and to what end? A kind of language nationalism, that "PHP does it all"?

This operator (or tag) is intended for applications which are already written without a template engine, but which are still being developed and need new code. Also, there are frameworks and CMSs which do not have a built-in template engine, and people start new projects with them. This operator can also be useful for junior programmers, who know PHP but don't know any template engine yet.

> register_escape_handler('foo', [$this, 'escape']);
> <?*foo= $something ?>
> Where's the problem?
> If you mean you want to be able to pass an actual callable as the context

There are no problems with the code; I was answering "IDE will have problem by identify where you have defined it". I did not mean a callable as a context.

> <?= will still "work good but be unsafe"

But people will be allowed not to use it at all. They could even add a rule about it to their code style guides.

> it doesn't really matter if you say the incantation to output a variable is "<?= e($" followed by the variable name and ")?>", or "<?* $" followed by the variable name and "?>".

It does matter. A developer can try removing the unnecessary 'e' and see that the code still works. With the old operator he can write unsafe code without any additional action. With the new operator he has to explicitly set a 'raw' context or something similar. This is the reason why template engines escape HTML by default.

> One is 3 characters shorter, but that is the sole difference in terms of effort.

No.
The difference is that you cannot write unsafe code by removing 3 characters. The length of the code or of a function name is not the reason for this RFC; I have said this many times.

> Huh? Is the word "I" copied in this e-mail, because the English language requires me to write it more than once? And if "e(" is "copied code", how is the "*" in "<?*" not also "copied code"?

<?* ?> is one action in source code, while <?= e() ?> is two actions. It is the same as if you had to call the constructor manually every time: new MyClass->__construct(). Is that better code? Maybe let's remove the automatic constructor call?)

> Twig allows you to register a named "strategy" to a single callable, exactly as I am suggesting: http://twig.sensiolabs.org/doc/filters/escape.html#custom-escapers
> This is much more useful than a single callback that has to handle all possible strategies.

As I understand it, the problem is that this is a registry with global state, as Rasmus said. In Twig this is not a global registry; it is stored in an object of the 'Core' class. And yes, there is a single callback, twig_escape_filter(), which handles all possible strategies. The first variant of this RFC was a registry. But people don't actually need a registry, especially one with built-in escapers; they ask for an easy way to call an escaper (htmlspecialchars() in the feature requests). Also, the set of possible strategies depends on the task. Even for htmlspecialchars() different sets of flags could be used. Let the user choose how to escape HTML. This needs to be done once during application development.

> But this could still be done without allowing arbitrary expressions, or embedding syntax inside the strategy argument:
> <?*$strategy*html= $text ?>

Sorry, I don't understand. Why is $strategy not an 'arbitrary expression'? And why do we need such complex parsing logic, which will do the same as html($text, $strategy)?
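
To make the single-callback idea concrete, here is a minimal userland sketch of one handler that receives a value and a context name, roughly what a call like html($text, $strategy) would amount to. The function name escape_handler() and the exact flags are assumptions chosen for illustration; they are not taken from the RFC.

```php
<?php
// Hypothetical single escape handler: one callback dispatching on the context name.
function escape_handler(string $value, string $context = 'html'): string
{
    switch ($context) {
        case 'html':
            // The flags here are one possible choice; an application may need others.
            return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        case 'js':
            // Encode as a JS string literal, hex-escaping characters unsafe inside <script>.
            return json_encode(
                $value,
                JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
            );
        case 'raw':
            // Explicit opt-out: the value is returned unchanged.
            return $value;
        default:
            throw new InvalidArgumentException("Unknown escape context: $context");
    }
}
```

With such a handler, unescaped output has to be requested explicitly with the 'raw' context, and an unknown context fails loudly instead of silently printing the raw value.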
> If they're doing something complex, they can implement their own way of doing it - probably by writing a templating engine, or using one of the many that already exist.

This can be done with the new operator described in the RFC. It does not require many changes in the PHP source code or in application source code. Why should its functionality be deliberately restricted?

> So it is now mandatory to have some bootstrap file somewhere that defines and registers the escape function? How is that different from writing, right now, at the top of your bootstrap file:
> function e($str, $context = 'html') { ... }

It is different because this function must be called manually everywhere, and whenever it is missed, there is a possible XSS vulnerability. The new operator is a simple way to call user-defined escapers automatically.

>> Complicated syntax like <?*html*js= $str ?>.
> I have no idea why that is "complicated syntax", but your proposal isn't:
> <?: $str | 'html | js' ?>
> Or even:
> <?: $str | ['html', 'js'] ?>

It is "complicated syntax" because it requires many changes in the parser, more than the operator described in the RFC. More changes mean more complexity. And I am not suggesting multiple arguments.

> In your proposal, part of the syntax won't even be standard between different people's code

The aim is not to invent a new global standard. There is no standard for naming escaper functions either; they differ between different people's code.

> Is it just that you don't like the escape strategy coming first?

I was talking about flexibility, not about placement.

>> I.e. we anyway need to pass context as a second argument, so why not allow user to do it.
> Because we're trying to make it easier for the user, not harder.

Why is a restriction easier? You decide to forbid passing a context as a variable, and user

> Why make them handle the nesting, sanity-checking, and control flow of multiple filters, rather than building them into the syntax from the start?
Because which flags should be passed to htmlspecialchars() depends entirely on the application. So the user would first have to unregister the built-in handler and then register his own.
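
As a small illustration of why the flags are an application-level decision, the sketch below escapes the same string with two different flag sets. The variable names and the sample string are made up for this example.

```php
<?php
// Hypothetical sample input containing markup and both quote styles.
$text = 'Tom\'s <b>"bold"</b> claim';

// One application escapes both single and double quotes,
// which matters when the value ends up inside an HTML attribute.
$withQuotes = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Another application leaves quotes untouched,
// which is enough when the value is only ever printed as element text.
$withoutQuotes = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');

echo $withQuotes, "\n", $withoutQuotes, "\n";
```

Both results are valid HTML escaping, but they are not interchangeable, so a single built-in default would not suit every application.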