ID:               43957
 Updated by:       [EMAIL PROTECTED]
 Reported By:      [EMAIL PROTECTED]
-Status:           Assigned
+Status:           Closed
 Bug Type:         Unknown/Other Function
 Operating System: linux debian 4.0
 PHP Version:      5.2.5
 Assigned To:      rasmus
 New Comment:

Fixed in CVS


Previous Comments:
------------------------------------------------------------------------

[2008-01-29 21:59:56] [EMAIL PROTECTED]

I see the bug in the code.  I still think this function needs to die,
but I guess we have to continue supporting it.  I'll fix it.

------------------------------------------------------------------------

[2008-01-29 02:35:03] [EMAIL PROTECTED]

Description:
------------
utf8_decode() outputs a random character when supplied with bad input.

When the input contains invalid sequences, utf8_decode() usually
replaces each one with the character "?". But when a lone high-bit
character is at the end of the input, the output seems to be a random
character.


Reproduce code:
---------------
for($a=0;$a<20;$a++)printf("%02x ",utf8_decode(chr(0xE0)));


Expected result:
----------------
3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f 3f

(utf8_decode() returns a question mark)
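
The expected behaviour can be sketched in userland PHP. This is only an
illustration under stated assumptions, not the internal implementation
and not the fix that went into CVS: the function name
sketch_utf8_to_latin1() is hypothetical, and the sketch does not reject
every overlong encoding. The point it shows is that a lead byte whose
continuation bytes are missing at the end of the input should yield "?"
instead of a read past the end of the string.

```php
<?php
// Hypothetical helper for illustration only; not PHP's own decoder.
// Converts UTF-8 to ISO-8859-1, emitting "?" for anything it cannot map.
function sketch_utf8_to_latin1(string $s): string
{
    $out = '';
    $len = strlen($s);
    $i = 0;
    while ($i < $len) {
        $byte = ord($s[$i]);
        $i++;
        if ($byte < 0x80) {
            // Plain ASCII passes through unchanged.
            $out .= chr($byte);
            continue;
        }
        // Work out how many continuation bytes the lead byte announces.
        if (($byte & 0xE0) === 0xC0) {
            $need = 1;
            $cp = $byte & 0x1F;
        } elseif (($byte & 0xF0) === 0xE0) {
            $need = 2;
            $cp = $byte & 0x0F;
        } elseif (($byte & 0xF8) === 0xF0) {
            $need = 3;
            $cp = $byte & 0x07;
        } else {
            // Stray continuation byte or invalid lead byte.
            $out .= '?';
            continue;
        }
        $ok = true;
        for ($k = 0; $k < $need; $k++) {
            // Bounds check first: a truncated sequence at the end of the
            // input must not read beyond the string.
            if ($i >= $len || (ord($s[$i]) & 0xC0) !== 0x80) {
                $ok = false;
                break;
            }
            $cp = ($cp << 6) | (ord($s[$i]) & 0x3F);
            $i++;
        }
        // Only U+0080..U+00FF from a multi-byte sequence fits in ISO-8859-1.
        $out .= ($ok && $cp >= 0x80 && $cp <= 0xFF) ? chr($cp) : '?';
    }
    return $out;
}
```

With this sketch, printf("%02x ", ord(sketch_utf8_to_latin1(chr(0xE0))))
prints "3f ". The ord() call matters: printf's %x conversion formats the
integer value of its argument, so a one-character string has to be
turned into its byte value first.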

Actual result:
--------------
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
or
09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09 09
or
05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05 05
or
02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
or some other random value

It seems to differ more between individual runs:

$ for a in `seq 1 20`; do php -r 'printf("%02x ",utf8_decode(chr(0xE0)));'; done
08 00 00 02 00 00 00 00 00 05 00 00 00 05 00 00 07 00 09 00


------------------------------------------------------------------------

