Hi Eddie,

Eddie Kohler wrote:
The U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR
characters are allowed unescaped in JSON strings, but *not* allowed unescaped
in JavaScript string literals. This is widely considered a minor wart in the JSON specification.
<https://medium.com/joys-of-javascript/json-js-42a28471221d>

As a result, the JSON_UNESCAPED_UNICODE flag is dangerous to use when
generating HTML. For example, this will generate a JavaScript error ("Unexpected
token ILLEGAL") in the user's browser:

```
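// Decode the HTML entity into a raw UTF-8 U+2028 LINE SEPARATOR.
$x = mb_convert_encoding('&#x2028;', 'UTF-8', 'HTML-ENTITIES');
// The raw character ends up unescaped inside the script's string literal.
echo '<script>x = ', json_encode($x, JSON_UNESCAPED_UNICODE), ';</script>';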
```

The proposal is for `json_encode(..., JSON_UNESCAPED_UNICODE)` to
escape the U+2028 and U+2029 characters as \u2028 and \u2029. A new flag,
JSON_UNESCAPED_LINE_TERMINATORS, preserves the former behavior.
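
Until such a change is available, the proposed behavior can be approximated in
userland. The following is only a minimal sketch, not the proposed
implementation: the helper name `json_encode_js_safe` is hypothetical, and the
`"\u{...}"` string syntax assumes PHP 7.0 or later. It post-processes the output
of `json_encode` and replaces any raw U+2028 or U+2029 with its escape sequence:

```php
<?php
// Hypothetical helper (not part of PHP): encode with JSON_UNESCAPED_UNICODE,
// then escape the two line terminators that break JavaScript string literals.
function json_encode_js_safe($value, int $flags = 0)
{
    $json = json_encode($value, $flags | JSON_UNESCAPED_UNICODE);
    if ($json === false) {
        return false; // encoding failed; see json_last_error()
    }
    // Raw U+2028/U+2029 can only occur inside JSON strings, so a plain
    // replacement on the UTF-8 output does not alter the JSON structure.
    return str_replace(
        ["\u{2028}", "\u{2029}"],
        ['\u2028', '\u2029'],
        $json
    );
}

$x = "caf\u{e9}\u{2028}";
echo json_encode_js_safe($x), "\n"; // prints "café\u2028"
```

Other non-ASCII characters such as "é" stay unescaped; only the two line
terminators are rewritten.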

It's important to note that this change *only* affects the non-default
JSON_UNESCAPED_UNICODE flag.

This sounds reasonable. One question, though: does this mean that without that flag, U+2028 and U+2029 are always escaped?
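
For reference, the default behavior can be checked with a short script. This is
a sketch assuming PHP 7.0 or later for the `"\u{...}"` string syntax. With
default flags, `json_encode` emits non-ASCII characters as `\uXXXX` escapes,
so the raw characters should not appear in the output:

```php
<?php
// Default flags: non-ASCII characters are emitted as \uXXXX escapes.
$ls = "\u{2028}"; // U+2028 LINE SEPARATOR
$ps = "\u{2029}"; // U+2029 PARAGRAPH SEPARATOR

$default = json_encode($ls . $ps);
echo $default, "\n"; // prints "\u2028\u2029"

// The raw UTF-8 sequences are absent from the default output.
var_dump(strpos($default, $ls) === false); // bool(true)
var_dump(strpos($default, $ps) === false); // bool(true)
```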

Thanks.
--
Andrea Faulds
https://ajf.me/

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
