On Tue, Nov 25, 2014 at 1:00 PM, Andrea Faulds <a...@ajf.me> wrote:

>
> > On 25 Nov 2014, at 08:33, Dmitry Stogov <dmi...@zend.com> wrote:
> >
> > Maybe I misunderstood something, but why introduce Unicode escapes
> > if the PHP engine doesn't support Unicode?
>
> We don't have Unicode strings which are made of codepoints rather than
> bytes, sure. But we do usually treat these strings as UTF-8. The idea of
> doing this in a language without Unicode strings isn't new: C/C++ have the
> u8"" syntax for making UTF-8 strings.
>

u8"string" declares that the whole string is UTF-8 encoded.
Your Unicode escape proposal only assumes UTF-8 for the escaped codepoint;
the encoding of the rest of the string is still undefined.
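
To illustrate the point, here is a minimal sketch, assuming an interpreter
that implements the proposed \u{...} escape and emits UTF-8 for it. The
escape fixes the bytes of that one codepoint only; the bytes around it are
whatever the script file happens to contain.

```php
<?php
// Assumes the proposed \u{...} escape is available and emits UTF-8.
$emoji = "\u{1F602}";

// U+1F602 encoded as UTF-8 is the four bytes F0 9F 98 82.
echo bin2hex($emoji), "\n";

// strlen() counts bytes, not codepoints, so this prints 4.
echo strlen($emoji), "\n";
```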


>
> > Always converting such escapes into UTF-8 doesn't make any
> > sense for people who use other encodings for output, databases, etc.
>
> If you're using other encodings, why do you want to use Unicode
> codepoints? Most Unicode codepoints will not be supported by another
> character set.
>

Agreed, these Unicode escapes are not going to be used for anything except
UTF-8 encoded strings.
I'm not completely against it. It's just an incomplete solution.

echo "\u{1F602}"; // won't output 😂 if the output encoding is not UTF-8

echo "Привет \u{1F602}"; // won't output anything useful if the script
                         // encoding is not UTF-8

The second problem is present even for European countries that use the
Windows-1250 code page.

echo "mañana \u{1F602}"; // won't output anything useful if the script
                         // encoding is not UTF-8
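
The mixing problem can be sketched as follows, again assuming the proposed
escape emits UTF-8. The byte 0xF1 (ñ in Windows-1252, for example) is written
as \xF1 to stand in for a script file saved in a single-byte legacy encoding.
The helper is_valid_utf8() is a hypothetical name used only for this
illustration.

```php
<?php
// Hypothetical helper: true if $s is well-formed UTF-8.
// preg_match() with the /u modifier returns false on invalid UTF-8 input.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// "\xF1" simulates ñ as stored by a single-byte legacy encoded script file.
$legacy = "ma\xF1ana " . "\u{1F602}";

// The same text with ñ stored as UTF-8 (bytes C3 B1).
$utf8 = "ma\xC3\xB1ana " . "\u{1F602}";

var_dump(is_valid_utf8($legacy)); // mixed encodings: not valid UTF-8
var_dump(is_valid_utf8($utf8));   // consistent UTF-8
```

In the first string the escape contributes valid UTF-8 bytes, but the result
as a whole is valid in neither encoding.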

Thanks. Dmitry.


>
> --
> Andrea Faulds
> http://ajf.me/
>
