On 09/12/14 02:44, Andrea Faulds wrote:
>> Maybe there should be more elaboration on why PHP itself should go with
>> the \u{xxxx} ECMAScript representation, thus introducing a syntax disparity
>> with our most major string handling extension.
>
> Well, PCRE does what it does probably because of its name: *Perl-Compatible*
> Regular Expressions. Perl has the \x syntax. But PCRE’s syntax comes from
> what suits Perl, not PHP, so I don’t see why we should necessarily match its
> behaviour. If we add \x{xxxxx} syntax to PHP’s string literals, then we’ll
> break existing code which uses double quoted strings for regular expressions.
>
> I think \x{xxxx} is misleading anyway - \xXX is always single-byte/character,
> yet Unicode code points can’t be represented in PHP strings as single bytes
> when encoded in UTF-8 (unless they’re below U+0080, of course). If I saw
> "\x{abcd}" I’d expect it to be the same as "\xab\xcd". Plus, while Perl has
> \x{xxxx} syntax, Ruby and ECMAScript 6 have the \u{xxxx} syntax, so \u{xxxx}
> is already more popular. The ‘u’ in \u{xxxx} also makes it more obviously
> “Unicode”.

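
To make the byte-count point in the quoted paragraph concrete, here is a
minimal sketch in Python rather than PHP. The helper name utf8_bytes is made
up for illustration and is not part of any proposal. It compares the UTF-8
encoding of the code point U+ABCD with the two raw bytes 0xAB 0xCD that a
byte-oriented reading of "\x{abcd}" might suggest.

```python
def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single Unicode code point."""
    return chr(code_point).encode("utf-8")


# The code point U+ABCD, encoded as UTF-8.
encoded = utf8_bytes(0xABCD)

# The two raw bytes a reader might expect from a byte-style escape.
raw_pair = bytes([0xAB, 0xCD])

print(encoded.hex(" "))      # ea af 8d
print(raw_pair.hex(" "))     # ab cd
print(encoded == raw_pair)   # False

# An ASCII code point such as U+0041 still fits in one byte.
print(len(utf8_bytes(0x41)))  # 1
```

The two byte sequences differ in both length and content, which is the
ambiguity being argued over: the same hex digits can mean "this code point"
or "these bytes".
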
If ICU is to be adopted as the base for Unicode support, then surely
everything else should follow those rules? \uhhhh and \Uhhhhhhhh are defined
there along with \x{hhhhhh}, so does it make sense to add something which is
not part of ICU?

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk