Re: [PHP-DEV] What is the use of "unicode.semantics" in PHP 6?

Richard Lynch Wed, 11 Jul 2007 18:13:52 -0700

On Mon, July 9, 2007 3:07 am, Tomas Kuliavas wrote:
>>>>> Unicode code points can be defined with \u, but PHP6 breaks
>>>>> existing octal and hex escape sequences.
>>>
>>> I don't understand what this means...
>>
>> I think I know...
>>
>> I have code like this, somewhere:
>>
>> if (preg_match("|[\xF0-\xFF]|", $data)){
>>   $data = un_microsuck($data);
>> }
>>
>> un_microsuck() basically detects and converts any of the goof-ball
>> extended ASCII from MS products (Word, Outlook, etc) to an HTML
>> equivalent character.
>>
>> But now \xF0 isn't going to be ASCII 128 anymore, is it?
>
> \xF0 never was ASCII. ASCII (ISO-646) is 7bit character set. \xF0 is
> decimal 240. It is 8bit.


Don't tell me.

Tell Microsoft.

Cuz I sure as heck get a LOT of input data above \x7f and I have to do
something reasonable with it...

And I did say "extended ASCII" in the other paragraph, after all...
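
For anyone following along, here is a rough sketch of the kind of check and conversion being discussed. It is not the un_microsuck() mentioned above (that code is not shown in this thread). The helper names has_high_bytes() and cp1252_to_html() are made up for illustration, and the sketch assumes a current PHP with the mbstring extension and input that really is Windows-1252.

```php
<?php
// Hypothetical helpers for illustration only; not the un_microsuck()
// from this thread, whose implementation is not shown.

// True if $data contains any byte above 0x7F, i.e. outside 7-bit ASCII.
// Single quotes hand \x80 and \xFF to PCRE untouched; without the /u
// modifier, PCRE compares the subject byte by byte.
function has_high_bytes(string $data): bool
{
    return preg_match('/[\x80-\xFF]/', $data) === 1;
}

// Assumption: the input is Windows-1252. Convert it to UTF-8 first,
// then replace characters that have a named HTML entity.
function cp1252_to_html(string $data): string
{
    $utf8 = mb_convert_encoding($data, 'UTF-8', 'Windows-1252');
    return htmlentities($utf8, ENT_QUOTES, 'UTF-8');
}

// Curly double quotes as Word would emit them in Windows-1252.
$data = "\x93Hi\x94";
if (has_high_bytes($data)) {
    $data = cp1252_to_html($data);
}
echo $data, "\n";
```

In an ordinary double-quoted PHP string without Unicode semantics, "\xF0" and "\360" are the same single byte (decimal 240), so either spelling works in a byte-wise check like the one above.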

>> Or maybe \xF0 will "work" but the octal \360 won't?
>
> Are you sure that you can't do that by setting
> unicode.something_encoding
> to iso-8859-1 or windows-1252?

I dunno.

Doesn't really matter if I can't set those in .htaccess, that's for sure.

[joke type="semi"]
All this work going into Unicode, and nobody is pushing to replace
(CR|CRLF|LF) with a new Unicode all-platform newline character?
[/joke]

-- 
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
