Hi,

The declare() syntax, e.g. declare(encoding="Shift_JIS");, has been supported by mbstring since PHP 4.3.
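As a rough illustration only, a script using the declaration might begin as sketched here. The file name and the echo line are made up for the example, and I am assuming a build where multibyte script support is enabled so that the declaration is honored; on other builds it may be ignored with a warning.

```php
<?php
// example.php (hypothetical file name), saved in Shift_JIS.
// The encoding declaration has to be the very first statement of the script.
declare(encoding="Shift_JIS");

// Ordinary script code follows; this line is only a placeholder.
echo "hello\n";
```

With the encoding stated explicitly, the engine does not have to guess it from the file contents.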
Could the following be a solution for you?

1. Set detect_unicode = Off.
2. Add declare(encoding="encoding_name") on the first line of the script.

Rui

On Fri, 07 Sep 2007 14:40:14 -0500
Greg Beaver <[EMAIL PROTECTED]> wrote:

> LAUPRETRE François (P) wrote:
> > Hi,
> >
> >> From: Rui Hirokawa
> >>
> >> IMHO, #42396 is not a bug, but it is the specification. The normal
> >> script doesn't contain a null byte if it is not encoded in Unicode.
> >>
> >>
> >> It is understandable the addition of a unique byte sequence
> >> '0xFFFFFFFF' detection to support PHAR/PHK, but it is a change to
> >> add a new feature.
> >
> > Sorry to insist but, since __halt_compiler() was introduced, your
> > assertion is not true any more.
> >
> > Actually, it depends on what you consider as 'the script' : if you
> > just consider the data from the beginning of the file to the
> > __halt_compiler() directive, that's right: if this data contains a
> > null byte, it is unicode.
> >
> > But the current unicode detection is not aware of the
> > __halt_compiler() directive, and it scans the whole file. So, your
> > assertion is wrong: it is perfectly legitimate to have a non-unicode
> > script contain null bytes (if they are after an __halt_compiler()
> > directive). So, it is a bug and not a feature request. This side
> > effect was not identified when __halt_compiler() was added.
> >
> > The obvious solution is to decide that a non-unicode script cannot
> > contain null bytes, even after an __halt_compiler(). It would just
> > require three lines in the PHP doc. But that would introduce a severe
> > limitation and, in practice, would make the __halt_compiler() feature
> > almost useless.
> >
> > The solution I am proposing is not very elegant but it is the only
> > one I found which does not make __halt_compiler() and multibyte
> > incompatible. As __halt_compiler() was introduced recently, and as,
> > afaict, the only software to use it are PHAR and PHK, I consider it
> > as acceptable, if not perfect.
> >
> > Greg, Marcus, do you have a better idea ? I considered that unicode
> > detection is done before __halt_compiler() can be detected, do you
> > confirm ?
>
> unicode detection in mb_string is in fact done before __halt_compiler().
> I don't think there is a solution to this problem without changes to PHP.
>
> Fortunately, PHP 6 introduces usage of declare (please correct me if I'm
> wrong) that allows declaration of the file's encoding, which would
> remove the guesswork.
>
> I think the best thing in this case is to recommend that multibyte
> auto-detection be disabled, and wait for PHP 6 which provides a proper
> solution to the unicode encoding issue.
>
> Greg
>

--
Rui Hirokawa <[EMAIL PROTECTED]>

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php