> On 24 Nov 2014, at 22:30, Adam Harvey <ahar...@php.net> wrote:
> 
> On 24 November 2014 at 14:21, Sara Golemon <poll...@php.net> wrote:
>> On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds <a...@ajf.me> wrote:
>>> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>>> 
>> I'm okay with producing UTF-8 even though our strings are technically
>> binary.  As you state, UTF-8 is the de-facto encoding, and recognizing
>> this is pretty reasonable.
> 
> I'm also OK with this, although I do wonder if we should be respecting
> the user's default_charset setting instead. (Since default_charset
> defaults to "UTF-8", in practice this isn't a significant difference
> for the average user.)

Ooh, that would be a possibility. That, or using whatever encoding the source 
file is declared to be in with declare(), so it matches the encoding of the 
other characters in the string.

This’d add significant complexity to it, though (would we have to require ICU 
or something? D:), plus the vast majority of Unicode characters will only be 
supported by Unicode encodings… and of those, only UTF-8 is really in much use 
here anyway.
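
To make the point about non-Unicode encodings concrete, here is a minimal 
sketch. It is written in Python purely for illustration; it is not PHP and not 
part of the RFC. The helper name encodable is hypothetical, and ISO-8859-1 is 
only an assumed stand-in for a non-Unicode default_charset.

```python
# Illustration only: check which encodings can represent a code point.
def encodable(code_point: int, encoding: str) -> bool:
    """Return True if the code point can be encoded in the given encoding."""
    try:
        chr(code_point).encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# U+00F1 (the n-with-tilde in "mañana") exists in ISO-8859-1;
# U+202E (RIGHT-TO-LEFT OVERRIDE) does not.
for cp in (0x00F1, 0x202E):
    support = {enc: encodable(cp, enc) for enc in ("utf-8", "iso-8859-1")}
    print(f"U+{cp:04X}", support)

# UTF-8 can always represent the escape; for U+202E it is three bytes.
print(chr(0x202E).encode("utf-8"))  # b'\xe2\x80\xae'
```

Under these assumptions, an escape such as U+202E would have no valid output 
at all in a legacy single-byte charset, which is the case that a 
charset-respecting design would need to define.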

>> You may want to make it a requirement that strings containing \u
>> escapes are denoted as:   u"blah blah"    We set aside this format
>> back in the PHP6 days (note that b"blah" is equivalent to "blah" for
>> binary strings).
> 
> It seems to me that the point of \u and \U escapes is to embed Unicode
> in potentially non-Unicode strings, so using u"" doesn't feel right.

I don’t really see where you’re coming from; it makes just as much sense 
within Unicode strings. There are plenty of cases (like the U+202E or mañana 
examples in the RFC) where you’d want a Unicode escape in a Unicode string.

--
Andrea Faulds
http://ajf.me/





--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
