Hi!
> I'm not completely against it. It's just an incomplete solution.
>
> echo "\u{1F602}"; // won't output 😂 if the output encoding is not UTF-8
You can always use iconv/recode to convert it to whatever encoding you need
(provided it supports the full Unicode range). I see this as a readability
feature
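As a rough illustration of that conversion step, here is a minimal sketch. It assumes PHP 7.0 or later (where the \u{...} syntax from this RFC is available) and the iconv extension; the variable names are made up for the example.

```php
<?php
// The escape always yields UTF-8 bytes, whatever encoding the script file uses.
$emoji = "\u{1F602}";

// Convert to another encoding that covers the full Unicode range.
$utf16 = iconv('UTF-8', 'UTF-16BE', $emoji);

echo bin2hex($emoji), "\n"; // f09f9882
echo bin2hex($utf16), "\n"; // d83dde02
```

A target encoding that cannot represent the character (ISO-8859-1, say) makes iconv() fail unless //TRANSLIT or //IGNORE is appended, which is the caveat in the parenthesis above.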
On Tue, Nov 25, 2014 at 3:20 AM, Alain Williams wrote:
> If we decide to support non-utf-8 encoding at compile time then we could
> extend
> the syntax a bit to allow the encoding to be specified, eg:
>
> \U{utf-8: arabic letter alef}
>
> \U{iso-8859-6: arabic letter alef}
>
God, that's s
On Tue, Nov 25, 2014 at 2:18 PM, Andrea Faulds wrote:
>
> > On 25 Nov 2014, at 10:41, Dmitry Stogov wrote:
> >
> > u8"string" tells that the whole string is UTF-8 encoded.
> > Your Unicode escape proposal assumes just a UTF-8 code point, but the
> > whole string's encoding is still undefined.
>
> True
> On 25 Nov 2014, at 11:48, Derick Rethans wrote:
>
> I think "incomplete" nails it on the head. Without "proper" Unicode
> support in the parser, compiler and string function semantics, having
> these escape codes doesn't really do a lot for us.
How so? Why are they less useful because we do
Hi all,
On Tue, Nov 25, 2014 at 8:09 PM, Andrea Faulds wrote:
> non-BMP code points are more important than ever.
Yes, they are! We (Japanese) already have a number of them.
\u{code point} has a huge advantage: we do not have to care whether the code
point is in the BMP or not.
I.e., we can do
echo "\u{code point}
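For instance, a sketch assuming PHP 7.0+ with the syntax from the RFC. The two characters are examples chosen here, one inside the BMP and one outside it.

```php
<?php
$bmp    = "\u{3042}";   // U+3042 HIRAGANA LETTER A, inside the BMP
$nonBmp = "\u{20BB7}";  // U+20BB7, a CJK ideograph outside the BMP

// Same syntax for both; only the number of UTF-8 bytes differs.
echo strlen($bmp), "\n";    // 3
echo strlen($nonBmp), "\n"; // 4
```

No surrogate pairs or separate \u and \U forms are needed; the braces delimit the code point whatever its length.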
On Tue, 25 Nov 2014, Dmitry Stogov wrote:
> On Tue, Nov 25, 2014 at 1:00 PM, Andrea Faulds wrote:
>
> >
> > > On 25 Nov 2014, at 08:33, Dmitry Stogov wrote:
> > >
> > > Maybe I misunderstood something, but why introduce Unicode escapes
> > > if the PHP engine doesn't support Unicode?
> >
> > We d
On Tue, Nov 25, 2014 at 11:25:17AM +, Andrea Faulds wrote:
> Well, we *do* already have a compile-time system for declaring encoding, the
> declare() construct.
I missed that. Reading the documentation, I confess that I do not really
understand what the effect of declare(encoding=xxx) is.
Ivan Enderlin @ Hoa wrote:
> Le 24/11/2014 23:09, Andrea Faulds a écrit :
>> Good evening,
>>
>> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>>
>> It has a rationale section explaining why certain decisions were made,
>> that I’d recommend you read in full.
> Excellent RFC, thank you
> On 25 Nov 2014, at 11:20, Alain Williams wrote:
>
> I think that we need to clarify what we are talking about.
>
> What Andrea has proposed is a way of writing string constants. The characters
> in these strings will still be 8 bits wide; this means that there needs to be
> some way of en
On Tue, Nov 25, 2014 at 02:41:48PM +0400, Dmitry Stogov wrote:
> I'm not completely against it. It's just an incomplete solution.
>
> echo "\u{1F602}"; // won't output 😂 if the output encoding is not UTF-8
>
> echo "Привет \u{1F602}"; // won't output anything useful if script
> encoding is not U
> On 25 Nov 2014, at 10:41, Dmitry Stogov wrote:
>
> u8"string" tells that the whole string is UTF-8 encoded.
> Your Unicode escape proposal assumes just a UTF-8 code point, but the whole
> string's encoding is still undefined.
True. There’s an assumption there that you’re using a UTF-8-compatible
> On 25 Nov 2014, at 10:32, Derick Rethans wrote:
>
> On Mon, 24 Nov 2014, Sara Golemon wrote:
>
>> On the BMP versus SMP issue of \u styles, we addressed this in
>> PHP6 by making \u denote four-hexit BMP codepoints, while \U denoted six-hexit
>> codepoints, e.g. "\u1234" === "\U001234"
On Tue, Nov 25, 2014 at 1:00 PM, Andrea Faulds wrote:
>
> > On 25 Nov 2014, at 08:33, Dmitry Stogov wrote:
> >
> > Maybe I misunderstood something, but why introduce Unicode escapes
> > if the PHP engine doesn't support Unicode?
>
> We don't have Unicode strings which are made of codepoints rather
On Mon, 24 Nov 2014, Sara Golemon wrote:
> On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
> > Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
> >
> I'm okay with producing UTF-8 even though our strings are technically
> binary. As you state, UTF-8 is the de-facto encoding, and r
> On 25 Nov 2014, at 08:33, Markus Fischer wrote:
>
>> On 24.11.14 23:09, Andrea Faulds wrote:
>> Good evening,
>>
>> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>
> I think the choice of \u{xx} is interesting, i.e. using '{' and '}'.
>
> Afaik, one of the current best practices
> On 25 Nov 2014, at 08:33, Dmitry Stogov wrote:
>
> Maybe I misunderstood something, but why introduce Unicode escapes if the PHP
> engine doesn't support Unicode?
We don't have Unicode strings which are made of codepoints rather than bytes,
sure. But we do usually treat these strings as UTF
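The bytes-versus-code-points distinction can be seen directly. This is a sketch only, assuming PHP 7.0+ and the mbstring extension; the variable names are illustrative.

```php
<?php
$s = "\u{1F602}";

// Core string functions count bytes, because PHP strings are byte arrays.
$bytes = strlen($s);

// mbstring can interpret the same bytes as UTF-8 and count code points.
$codePoints = mb_strlen($s, 'UTF-8');

echo $bytes, "\n";      // 4
echo $codePoints, "\n"; // 1
```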
On 24.11.14 23:09, Andrea Faulds wrote:
> Good evening,
>
> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
I think the choice of \u{xx} is interesting, i.e. using '{' and '}'.
Afaik, one of the current best practices is to use json_decode(), like so:
$ cat test.php
http://www.php.net
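A minimal sketch of what the json_decode() idiom typically looks like, compared with the proposed escape. This is not the original test.php; it assumes PHP 7.0+ with the json extension, and the variable names are invented here.

```php
<?php
// Workaround without the RFC: let the JSON parser expand the escape.
// Single quotes keep the backslashes literal, and a non-BMP character
// has to be written as a UTF-16 surrogate pair.
$viaJson = json_decode('"\ud83d\ude02"');

// With the proposed syntax the code point is written directly.
$viaEscape = "\u{1F602}";

var_dump($viaJson === $viaEscape); // bool(true)
```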
Maybe I misunderstood something, but why introduce Unicode escapes if the
PHP engine doesn't support Unicode?
Always converting such escapes into UTF-8 doesn't make any sense
for people who use other encodings for output, databases, etc.
Thanks. Dmitry.
On Tue, Nov 25, 2014 at 1:09
Le 24/11/2014 23:09, Andrea Faulds a écrit :
Good evening,
Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
It has a rationale section explaining why certain decisions were made, that I’d
recommend you read in full.
Excellent RFC, thank you for this proposal.
I would suggest this tal
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>
I've linked a provisional HHVM implementation from that page.
Planning to match whatever PHP7 does, of course, but for the moment
I've added named entity support since it's being dis
On Mon, Nov 24, 2014 at 11:36:28PM +, Andrea Faulds wrote:
>
> > On 24 Nov 2014, at 23:29, Alain Williams wrote:
> > echo "\U{arabic letter alef}\n";
>
> Ooh, that’s an interesting idea. I believe Perl actually has this already,
> although it uses the \N syntax:
>
> http://perldoc.perl.or
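Independently of any escape syntax, a lookup by character name can be sketched at run time with the intl extension's IntlChar class. This assumes PHP 7.0+ with ext/intl loaded; it is not part of the RFC, and the variable names are illustrative.

```php
<?php
// Look up the code point by its Unicode name (the exact uppercase name is used here).
$codePoint = IntlChar::charFromName('ARABIC LETTER ALEF');

// Turn the code point into a UTF-8 string.
$alef = IntlChar::chr($codePoint);

printf("U+%04X\n", $codePoint); // U+0627
echo bin2hex($alef), "\n";      // d8a7
```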
> On 24 Nov 2014, at 23:29, Alain Williams wrote:
>
> There is a big difference between \u or \U and \x or \o, and that is the number of
> characters that follow the escape. \x has 2, \o has 3; both are short and easy
> to count by eye. \U012345 is quite long and it is not so visually
> o
On Mon, Nov 24, 2014 at 02:21:37PM -0800, Sara Golemon wrote:
> On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
> > Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
> >
> I'm okay with producing UTF-8 even though our strings are technically
> binary. As you state, UTF-8 is the de-f
> On 24 Nov 2014, at 23:19, Sara Golemon wrote:
>
>> We would have to require ICU, but that might be worthwhile for PHP 7
>> anyway. Having at least one i18n API that's guaranteed to be available
>> would be nice.
>>
> It's 2014. I think requiring ICU is reasonable at this point.
I also think
> We would have to require ICU, but that might be worthwhile for PHP 7
> anyway. Having at least one i18n API that's guaranteed to be available
> would be nice.
>
It's 2014. I think requiring ICU is reasonable at this point.
Orthogonal to this RFC, but I'd be in favor of deprecating all the
non-I
On 24 November 2014 at 14:35, Andrea Faulds wrote:
>
>> On 24 Nov 2014, at 22:30, Adam Harvey wrote:
>> I'm also OK with this, although I do wonder if we should be respecting
>> the user's default_charset setting instead. (Since default_charset
>> defaults to "UTF-8", in practice this isn't a sig
> On 24 Nov 2014, at 22:30, Adam Harvey wrote:
>
> On 24 November 2014 at 14:21, Sara Golemon wrote:
>> On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
>>> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>>>
>> I'm okay with producing UTF-8 even though our strings are technica
On 24 November 2014 at 14:21, Sara Golemon wrote:
> On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
>> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>>
> I'm okay with producing UTF-8 even though our strings are technically
> binary. As you state, UTF-8 is the de-facto encoding
> On 24 Nov 2014, at 22:21, Sara Golemon wrote:
>
> On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
>> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>>
> I'm okay with producing UTF-8 even though our strings are technically
> binary. As you state, UTF-8 is the de-facto encod
On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>
I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is the de-facto encoding, and recognizing
this is pretty reasonable.
You may want
> On 24 Nov 2014, at 22:09, Andrea Faulds wrote:
>
> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
My apologies to you all, a small correction: The title of that email should’ve
been “[RFC] Unicode Codepoint Escape Syntax” to match the title of the RFC, I
missed out the “Codepoint