Hello Rui,

Sunday, June 29, 2008, 3:36:58 PM, you wrote:


> Removing multibyte encoding support from PHP 5.3 will cause
> severe incompatibility problems with older PHP 5.x releases.

> As Stefan noted, the Shift_JIS character encoding, which is widely used in
> Japan, is not a flex-safe encoding because it allows 0x5c (backslash) as the
> second byte of a multibyte character. The BIG5 character encoding used for
> Chinese is also not flex-safe.
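
To make the 0x5c point concrete, here is a minimal sketch in Python (used only for illustration; it is not scanner code and not part of the patch). It assumes Python's stock shift_jis, big5 and utf-8 codecs. The helper names and the two sample characters are illustrative choices, not anything taken from the thread.

```python
BACKSLASH = 0x5C  # ASCII backslash


def trail_bytes(text, encoding):
    """Return the last byte of every character that encodes to 2+ bytes."""
    trails = []
    for ch in text:
        raw = ch.encode(encoding)
        if len(raw) >= 2:
            trails.append(raw[-1])
    return trails


def has_backslash_trail(text, encoding):
    """True if any multibyte character ends in the byte 0x5C."""
    return BACKSLASH in trail_bytes(text, encoding)


JAPANESE_SAMPLE = "\u8868"  # U+8868, a common kanji
CHINESE_SAMPLE = "\u529f"   # U+529F, a common hanzi

sjis_hit = has_backslash_trail(JAPANESE_SAMPLE, "shift_jis")
big5_hit = has_backslash_trail(CHINESE_SAMPLE, "big5")
# UTF-8 never reuses ASCII byte values inside a multibyte sequence.
utf8_hit = has_backslash_trail(JAPANESE_SAMPLE + CHINESE_SAMPLE, "utf-8")

print(sjis_hit, big5_hit, utf8_hit)
```

A scanner that works byte by byte, without knowing the script encoding, could take such a trail byte for an escape character inside a quoted string, which is the kind of breakage the zend multibyte support is meant to avoid.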

> Today, I committed a patch for zend multibyte support into PHP_5_3.
> It is still in experimental status because I am not an expert in re2c/flex.

> A couple of test scripts already exist in
> Zend/tests/multibyte/*.phpt, but, of course, we need more test scripts
> for zend multibyte.
> (we need to have TestFesta in Japan :)   )

> The script encoding can be specified in several different ways.

>    (1) mbstring.script_encoding in php.ini
>    (2) declare(encoding="Shift_JIS") in each PHP script
>       -> multibyte_encoding_001.phpt
>    (3) BOM in Unicode script
>       -> multibyte_encoding_00[23].phpt
>    (4) auto detection based on mbstring.language, mbstring.detect_order

> Test scripts already exist for (2) and (3), but there are none for
> (1) and (4).
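
As a rough illustration of what option (3) involves, the sketch below checks the leading bytes of a script for a Unicode BOM. It is written in Python for brevity and is not the logic used in the patch; the function name detect_bom and the returned encoding labels are hypothetical.

```python
import codecs

# Check the 4-byte BOMs first: the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]


def detect_bom(data):
    """Return (encoding label, BOM length), or (None, 0) if no BOM is found."""
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label, len(bom)
    return None, 0


script = codecs.BOM_UTF8 + b"<?php echo 1;"
label, bom_len = detect_bom(script)
print(label, bom_len)
```

A script without a BOM yields (None, 0) here, in which case one of the other three mechanisms would have to supply the encoding.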

> I have already confirmed that my patch for PHP 5.3 works for (1) and (2)
> with the Shift_JIS encoding, but I have not yet confirmed it for the
> Unicode BOM or other encodings.

Could you put your confirmation of (1) into a test? And is any detection
functionality missing that you would need in order to write those tests?

> We need more test scripts to maintain reliability and
> to minimize security risks.

> Rui

> On Tue, 24 Jun 2008 16:21:33 +0200
> Stefan Esser <[EMAIL PROTECTED]> wrote:

>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>> 
>> 
>> >> This is used when reading scripts that are in encodings like Shift-JIS 
>> >> which is very common in Japan. In any case, I have tried to get 
>> >> involvement from some people I know over there without much success.
>> > 
>> > I've asked around a bit as well with our customers/partners, and all 
>> > they seem to answer is "we simply use UTF-8".
>> 
>> It is very unlikely that anyone on internals uses Shift-JIS (EUC-xx).
>> Mainly because (nearly) no one here is Japanese (Korean, Chinese).
>> 
>> However google for phpinfo() and you will see that zend_multibyte is
>> compiled into several PHP servers. You can also google for Shift-JIS and
>>   co...
>> 
>> The problem here is that newer Asian systems will use UTF-8 (except
>> those nations using characters not possible in UTF-8) and therefore the
>> customers of the PHP developers (on this list) will not need that
>> support. However there are many legacy systems out there that depend on
>> this feature. They most probably don't know about this discussion or
>> internals at all, so they cannot speak up.
>> 
>> If PHP 5.3 drops this feature it might close some multibyte security
>> problems. However this also means that all those
>> Japanese/Chinese/Korean/Taiwanese/... multibyte scripts will not run
>> anymore. This forces systems to stay on PHP 5.2, which will most probably
>> not get security updates once PHP 5.3 is out of the door.
>> 
>> Stefan Esser
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.8 (Darwin)
>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>> 
>> iEYEARECAAYFAkhhAu0ACgkQSuF5XhWr2njCswCcDCyWnFi4jInpX+BPhmSp6ec7
>> pAEAoKfDzhhpFKifgwlsn99WMwkve5bp
>> =2qIJ
>> -----END PGP SIGNATURE-----
>> 
>> -- 
>> PHP Internals - PHP Runtime Development Mailing List
>> To unsubscribe, visit: http://www.php.net/unsub.php

> -- 
> Rui Hirokawa <[EMAIL PROTECTED]>

Best regards,
 Marcus


