Hi Gustavo,

Thanks for the reply.

As long as bison does not understand multibyte characters, the parser will
not work well with them.
Your reply is exactly what I expected.

Thank you for the clarification.

--
Yasuo Ohgaki
yohg...@ohgaki.net



On Thu, Nov 3, 2011 at 8:07 PM, Gustavo Lopes <glo...@nebm.ist.utl.pt> wrote:
> On Thu, 03 Nov 2011 10:31:47 -0000, Yasuo Ohgaki <yohg...@ohgaki.net>
> wrote:
>
>> One last quick question.
>> Zend/tests/multibyte/multibyte_encoding_001.phpt sets
>> mbstring.internal_encoding=SJIS.
>>
>> Is PHP 5.4+ supposed to work with SJIS (or another similar encoding) as
>> internal_encoding?
>>
>
> No. What matters is that the parser generated by bison is able to recognize
> the tokens. On an ASCII (as opposed to EBCDIC) machine, this means the
> encoding must be ASCII compatible.
>
> This is the table for SJIS:
>  http://icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL
>
> It would appear to be ASCII compatible, since \x20-\x7E represent
> U+0020-U+007E, but if you take a closer look you'll see that these bytes can
> also appear as part of larger sequences.
>
> For instance, in this script:
>
> <?php
> function a漾() {}
>
> the character 漾 is represented as \xE0\x40, where \x40 represents @ in
> ASCII, so this would give an error, the same way that this script:
>
> <?php
> function aà@() {}
>
> would give an error. In fact, if I save the first script as UTF-8 and then
> run PHP:
>
> $ ./php -d zend.multibyte=1 -d zend.script_encoding=UTF-8 -d
> mbstring.internal_encoding=SJIS sjis.php
> php: Zend/zend_language_scanner.l:126: encoding_filter_script_to_internal:
> Assertion `internal_encoding &&
> zend_multibyte_check_lexer_compatibility(internal_encoding)' failed.
> Aborted
>
> it gives an assertion error.
>
> --
> Gustavo Lopes
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>
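
The byte-level claim in the quoted reply can be checked outside PHP. The
following is only a minimal sketch, assuming that Python's built-in
"shift_jis" codec agrees with the ICU ibm-943 table linked above for this
character. The helper name multibyte_chars_with_ascii_bytes is hypothetical
and is not part of PHP or the Zend engine.

```python
# Sketch: show that Shift_JIS can place an ASCII-range byte inside a
# multibyte sequence, which is what confuses a byte-oriented lexer.
# Assumption: Python's "shift_jis" codec stands in for the ICU ibm-943 table.

CHAR = "\u6f3e"  # the character 漾 from the quoted example


def multibyte_chars_with_ascii_bytes(text, encoding):
    """Return (char, encoded_bytes) for each character that encodes to more
    than one byte and whose encoded form contains a byte below 0x80."""
    hits = []
    for ch in text:
        raw = ch.encode(encoding)
        if len(raw) > 1 and any(b < 0x80 for b in raw):
            hits.append((ch, raw))
    return hits


sjis_bytes = CHAR.encode("shift_jis")
print(sjis_bytes)                    # b'\xe0@' (second byte is 0x40, ASCII '@')
print(sjis_bytes.decode("latin-1"))  # the same two bytes read as Latin-1: à@

# In Shift_JIS the identifier "a漾" contains a trail byte in the ASCII range.
print(multibyte_chars_with_ascii_bytes("a" + CHAR, "shift_jis"))

# In UTF-8 every byte of a multibyte sequence is 0x80 or above.
print(multibyte_chars_with_ascii_bytes("a" + CHAR, "utf-8"))  # []
```

The UTF-8 result is empty because UTF-8 never reuses bytes below 0x80 inside
a multibyte sequence, which is why it is safe for a lexer that matches ASCII
tokens byte by byte, while Shift_JIS is not.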
