Hi,

The declare() syntax, for example
declare(encoding="Shift_JIS");
has been supported by mbstring since PHP 4.3.

Could the following be a solution for you?

1. Set detect_unicode = Off.
2. Add declare(encoding="encoding_name") on the first line of the script.
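
The two steps above can be sketched roughly as follows. This is only a
minimal illustration, assuming a PHP build with multibyte script support
enabled; the echoed text is a placeholder and not taken from PHAR or PHK.

```php
<?php
declare(encoding="Shift_JIS"); // step 2: must be the first statement in the script

// Step 1 lives outside this file. This sketch assumes php.ini contains:
//   detect_unicode = Off
// so the script encoding comes from the declare() above rather than
// from auto-detection.

echo "loader code runs here\n"; // hypothetical script body

__halt_compiler();
```

Anything appended after __halt_compiler(); (for example archive data that
contains null bytes) is the part that auto-detection would otherwise scan.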

Rui

On Fri, 07 Sep 2007 14:40:14 -0500
Greg Beaver <[EMAIL PROTECTED]> wrote:

> LAUPRETRE François (P) wrote:
> > Hi,
> > 
> >> From: Rui Hirokawa
> >> 
> >> IMHO, #42396 is not a bug; it is the specified behavior. A normal
> >> script doesn't contain a null byte unless it is encoded in Unicode.
> >> 
> >> 
> >> Adding detection of the unique byte sequence '0xFFFFFFFF' to
> >> support PHAR/PHK is understandable, but that is a change that adds
> >> a new feature.
> > 
> > Sorry to insist but, since __halt_compiler() was introduced, your
> > assertion is not true any more.
> > 
> > Actually, it depends on what you consider as 'the script' : if you
> > just consider the data from the beginning of the file to the
> > __halt_compiler() directive, that's right: if this data contains a
> > null byte, it is unicode.
> > 
> > But the current unicode detection is not aware of the
> > __halt_compiler() directive, and it scans the whole file. So, your
> > assertion is wrong: it is perfectly legitimate to have a non-unicode
> > script contain null bytes (if they are after an __halt_compiler()
> > directive). So, it is a bug and not a feature request. This side
> > effect was not identified when __halt_compiler() was added.
> > 
> > The obvious solution is to decide that a non-unicode script cannot
> > contain null bytes, even after an __halt_compiler(). It would just
> > require three lines in the PHP doc. But that would introduce a severe
> > limitation and, in practice, would make the __halt_compiler() feature
> > almost useless.
> > 
> > The solution I am proposing is not very elegant but it is the only
> > one I found which does not make __halt_compiler() and multibyte
> > incompatible. As __halt_compiler() was introduced recently and,
> > afaict, the only software using it is PHAR and PHK, I consider it
> > acceptable, if not perfect.
> > 
> > Greg, Marcus, do you have a better idea? I assumed that Unicode
> > detection is done before __halt_compiler() can be detected. Can you
> > confirm?
> 
> Unicode detection in mbstring is in fact done before __halt_compiler().
> I don't think there is a solution to this problem without changes to PHP.
> 
> Fortunately, PHP 6 introduces usage of declare (please correct me if I'm
> wrong) that allows declaration of the file's encoding, which would
> remove the guesswork.
> 
> I think the best thing in this case is to recommend that multibyte
> auto-detection be disabled, and wait for PHP 6 which provides a proper
> solution to the unicode encoding issue.
> 
> Greg
> 


-- 
Rui Hirokawa <[EMAIL PROTECTED]>

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
