On Mon, 24 Nov 2014, Sara Golemon wrote:

> On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds <a...@ajf.me> wrote:
> > Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
> >
> I'm okay with producing UTF-8 even though our strings are technically
> binary. As you state, UTF-8 is the de-facto encoding, and recognizing
> this is pretty reasonable.
>
> You may want to make it a requirement that strings containing \u
> escapes are denoted as: u"blah blah"
> We set aside this format back in the PHP6 days (note that b"blah" is
> equivalent to "blah" for binary strings).
>
> On the BMP versus SMP issue of \uXXXX styles, we addressed this in
> PHP6 by making \u denote 4 hexit BMP codepoints, while \U denoted six
> hexit codepoints. e.g. "\u1234" === "\U001234"
> I'd rather follow this style than making \u special and different from
> hex and octal notations by using braces.

I agree with this fully. No need to reinvent a wheel (that we left behind
on the road)...

cheers,
Derick
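
For illustration, the PHP6-era scheme quoted above can be sketched roughly as follows. This is a minimal Python sketch, not PHP and not the actual PHP6 implementation; the helper name expand_unicode_escapes is hypothetical. It assumes \u takes exactly four hex digits, \U takes exactly six, and the result is emitted as UTF-8 bytes.

```python
import re

# Hypothetical pattern for the two escape forms described in the quote:
#   \uXXXX    four hex digits (BMP code points)
#   \UXXXXXX  six hex digits (any code point up to U+10FFFF)
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{6})")


def expand_unicode_escapes(raw: str) -> bytes:
    """Expand \\uXXXX and \\UXXXXXX escapes and return UTF-8 bytes."""

    def replace(match: re.Match) -> str:
        # Exactly one of the two groups matched; take its hex digits.
        code_point = int(match.group(1) or match.group(2), 16)
        if code_point > 0x10FFFF:
            raise ValueError(f"code point out of range: {code_point:#x}")
        if 0xD800 <= code_point <= 0xDFFF:
            # Surrogates cannot be encoded as UTF-8 (assumed to be an error).
            raise ValueError(f"surrogate code point: {code_point:#x}")
        return chr(code_point)

    return _ESCAPE.sub(replace, raw).encode("utf-8")


# The equivalence from the quoted example: both spellings name U+1234.
assert expand_unicode_escapes(r"\u1234") == expand_unicode_escapes(r"\U001234")
```

Fixed-width escapes like these need no closing delimiter, which keeps them in line with the existing hex and octal notations; the cost is a second escape letter for code points outside the BMP.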

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php