Hello Rui,

Sunday, June 29, 2008, 3:36:58 PM, you wrote:

> Removing multibyte encoding support from PHP 5.3 will cause
> a severe incompatibility problem with older PHP 5.x.
> As Stefan noted, the Shift_JIS character encoding, which is widely used in
> Japan, is not a flex-safe encoding because it includes 0x5c (backslash) as
> the second byte of a multibyte character. The BIG5 character encoding used
> for Chinese is also not a flex-safe encoding.
> Today, I committed a patch for zend multibyte support into PHP_5_3.
> It is still in experimental status because I am not an expert in re2c/flex.
> A couple of test scripts already exist in
> Zend/tests/multibyte/*.phpt, but, of course, we need more test scripts
> for zend multibyte.
> (we need to have TestFesta in Japan :) )
> The script encoding can be specified in a couple of different ways.
> (1) mbstring.script_encoding in php.ini
> (2) declare(encoding="Shift_JIS") on each PHP script
>     -> multibyte_encoding_001.phpt
> (3) BOM in Unicode script
>     -> multibyte_encoding_00[23].phpt
> (4) auto detection based on mbstring.language, mbstring.detect_order
> Test scripts already exist for (2),(3), but there is nothing for
> (1),(4).
> I have already confirmed that my patch for PHP 5.3 works for (1),(2)
> with Shift_JIS encoding, but I have not yet confirmed it for Unicode BOM
> and other encodings.

Could you put your confirmation of (1) into a test? And is there any
detection functionality missing to write those tests?

> We need to have more test scripts to maintain reliability and
> to minimize security risks.
> Rui
> On Tue, 24 Jun 2008 16:21:33 +0200
> Stefan Esser <[EMAIL PROTECTED]> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> >> This is used when reading scripts that are in encodings like Shift-JIS
>> >> which is very common in Japan. In any case, I have tried to get
>> >> involvement from some people I know over there without much success.
>> >
>> > I've asked around a bit as well with our customers/partners, and all
>> > they seem to answer is "we simply use UTF-8".
>>
>> It is very unlikely that anyone on internals uses Shift-JIS (EUC-xx).
>> Mainly because (nearly) no one here is Japanese (Korean, Chinese).
>>
>> However google for phpinfo() and you will see that zend_multibyte is
>> compiled into several PHP servers. You can also google for Shift-JIS and
>> co...
>>
>> The problem here is that newer Asian systems will use UTF-8 (except
>> those nations using characters not possible in utf-8) and therefore the
>> customers of the PHP developers (on this list) will not need that
>> support. However there are many legacy systems out there that depend on
>> this feature. Their operators most probably don't know about this
>> discussion or internals at all, so they cannot speak up.
>>
>> If PHP 5.3 drops this feature it might close some multibyte security
>> problems. However this also means that all those
>> Japanese/Chinese/Korean/Taiwanese/... multibyte scripts will not run
>> anymore. This forces systems to stay on PHP 5.2, which most probably
>> won't get security updates once PHP 5.3 is out of the door.
>>
>> Stefan Esser
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.8 (Darwin)
>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>>
>> iEYEARECAAYFAkhhAu0ACgkQSuF5XhWr2njCswCcDCyWnFi4jInpX+BPhmSp6ec7
>> pAEAoKfDzhhpFKifgwlsn99WMwkve5bp
>> =2qIJ
>> -----END PGP SIGNATURE-----
>>
>> --
>> PHP Internals - PHP Runtime Development Mailing List
>> To unsubscribe, visit: http://www.php.net/unsub.php
> --
> Rui Hirokawa <[EMAIL PROTECTED]>

Best regards,
 Marcus
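
P.S. For anyone following the thread who has not run into the 0x5c issue
quoted above, here is a minimal Python sketch of why a byte-oriented scanner
misreads Shift_JIS source. It is only an illustration under stated
assumptions: find_string_end is a hypothetical toy scanner, not the Zend
scanner or anything from the patch, and the sample character U+8868 is simply
one character whose Shift_JIS encoding ends in byte 0x5c.

```python
BACKSLASH = 0x5C  # ASCII backslash, also a legal Shift_JIS trail byte
QUOTE = 0x22      # ASCII double quote


def find_string_end(src: bytes) -> int:
    """Toy byte-wise scan of a double-quoted literal.

    Assumes src[0] is the opening quote and that a backslash byte escapes
    the byte after it. Returns the index of the closing quote, or -1 if
    the scan runs off the end without finding one.
    """
    i = 1
    while i < len(src):
        if src[i] == BACKSLASH:
            i += 2  # skip the "escaped" byte
            continue
        if src[i] == QUOTE:
            return i
        i += 1
    return -1


# Hypothetical sample: one character, encoded two ways, wrapped in quotes.
sample = "\u8868"
sjis_literal = b'"' + sample.encode("shift_jis") + b'"'
utf8_literal = b'"' + sample.encode("utf-8") + b'"'

print(sjis_literal.hex(), find_string_end(sjis_literal))
print(utf8_literal.hex(), find_string_end(utf8_literal))
```

In the Shift_JIS case the trail byte 0x5c is taken for an escape, so the
closing quote is swallowed and the toy scanner never finds the end of the
literal; the UTF-8 bytes contain no 0x5c and scan cleanly. An encoding-aware
scanner has to consume lead and trail bytes together before looking for
escape characters.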