On Tue, Nov 25, 2014 at 1:00 PM, Andrea Faulds <a...@ajf.me> wrote:

> > On 25 Nov 2014, at 08:33, Dmitry Stogov <dmi...@zend.com> wrote:
> >
> > May be I misunderstood something, but why to introduce unicode escapes
> > if PHP engine doesn't support Unicode.
>
> We don't have Unicode strings which are made of codepoints rather than
> bytes, sure. But we do usually treat these strings as UTF-8. The idea of
> doing this in a language without Unicode strings isn't new, C/C++ have the
> u8"" syntax for making UTF-8 strings.

u8"string" says that the whole string is UTF-8 encoded.
Your Unicode escape proposal only guarantees UTF-8 for the escaped codepoint; the encoding of the rest of the string is still undefined.

> > Always converting such escapes into UTF-8 encoding, doesn't make any
> > sense for people who use other encodings for output, databases, etc.
>
> If you're using other encodings, why do you want to use a Unicode
> codepoints? Most Unicode codepoints will not supported by another character
> set.

Agreed, these Unicode escapes are not going to be used for anything except UTF-8 encoded strings.
I'm not completely against it. It's just an incomplete solution.

echo "\u{1F602}"; // won't output 😂 if the output encoding is not UTF-8
echo "Привет \u{1F602}"; // won't output anything useful if script encoding is not UTF-8

The second problem is present even for European countries that use the Windows-1250 codepage.

echo "mañana \u{1F602}"; // won't output anything useful if script encoding is not UTF-8

Thanks. Dmitry.

> --
> Andrea Faulds
> http://ajf.me/
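
To make the mixed-encoding concern concrete, here is a minimal sketch of what happens when a `\u{...}` escape lands in a string whose other bytes are not UTF-8. It assumes PHP 7.0 or later (where the escape syntax exists). The `isUtf8()` helper is hypothetical, written only for this illustration, and the Windows-1251 bytes for "Привет" are spelled out by hand so the example does not depend on how the file itself is saved.

```php
<?php
// Sketch only: assumes PHP 7.0+ so that "\u{...}" is recognised.
// isUtf8() is a hypothetical helper for this example, not a PHP built-in.
function isUtf8(string $s): bool
{
    // With the /u modifier, PCRE refuses subjects that are not valid UTF-8,
    // so preg_match() returns false instead of 1.
    return preg_match('//u', $s) === 1;
}

// The escape always expands to the UTF-8 byte sequence of U+1F602.
$emoji = "\u{1F602}";

// "Привет" as raw Windows-1251 bytes, i.e. what the literal would contain
// if the script file were saved in that encoding.
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";

// One string, two encodings: Windows-1251 text followed by UTF-8 bytes.
$mixed = $cp1251 . " " . $emoji;

echo bin2hex($emoji), "\n";   // hex dump of the bytes the escape produced
var_dump(isUtf8($emoji));     // the escape on its own
var_dump(isUtf8($mixed));     // the escape appended to Windows-1251 text
```

The escape on its own is well-formed UTF-8, but the concatenated string is not, and a Windows-1251 consumer would render the emoji bytes as unrelated characters. That is the "incomplete solution" point: the escape fixes the encoding of one codepoint, not of the string around it.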