From: narzeczony at zabuchy dot net
Operating system: Linux, Windows
PHP version: 5.0.5
PHP Bug Type: mbstring related
Bug description: mb_convert_encoding - wrong conversion from UTF-16 (problem
with BOM)
Description:
------------
When converting from UTF-16 (to ISO-8859-1, for example), the BOM (the
first 2 bytes of the UTF-16 text) should be removed. Instead,
mb_convert_encoding tries to convert it as if it were part of the text.
The problem is similar to bug #22108, but maybe this one can be fixed.
Reproduce code:
---------------
<?php
$iso_8859_1 = 'Nexor';
$utf16LE = mb_convert_encoding($iso_8859_1,'UTF-16LE','ISO-8859-1');
$utf16BE = mb_convert_encoding($iso_8859_1,'UTF-16BE','ISO-8859-1');
// Let's turn both into UTF-16 with a BOM:
// the only difference is the 2-byte BOM prepended at the beginning
// \xFF\xFE for little endian
$utf16LE = "\xFF\xFE".$utf16LE;
foreach (str_split($utf16LE) as $l) {echo ord($l).' ';}
echo ' --> ';
$utf16LE2iso = mb_convert_encoding($utf16LE,'ISO-8859-1','UTF-16');
var_dump($utf16LE2iso);
echo '<br/>';
// \xFE\xFF for big endian
$utf16BE = "\xFE\xFF".$utf16BE;
foreach (str_split($utf16BE) as $l) {echo ord($l).' ';}
echo ' --> ';
$utf16BE2iso = mb_convert_encoding($utf16BE,'ISO-8859-1','UTF-16');
var_dump($utf16BE2iso);
Expected result:
----------------
255 254 78 0 101 0 120 0 111 0 114 0 --> string(5) "Nexor"
254 255 0 78 0 101 0 120 0 111 0 114 --> string(5) "Nexor"
Actual result:
--------------
255 254 78 0 101 0 120 0 111 0 114 0 --> string(6) "??exor"
254 255 0 78 0 101 0 120 0 111 0 114 --> string(6) "?Nexor"
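Possible workaround:
--------------------
Until the BOM is handled inside mbstring, it can be stripped in userland
before converting. The following is only a minimal sketch:
convert_utf16_with_bom() is a hypothetical helper (not part of mbstring),
and the big-endian default for input without a BOM is an assumption.

```php
<?php
// Hypothetical helper, not an mbstring function: strips a leading UTF-16
// BOM and converts using the byte order that the BOM indicates.
function convert_utf16_with_bom($text, $to, $default = 'UTF-16BE')
{
    $bom = substr($text, 0, 2);
    if ($bom === "\xFF\xFE") {
        // Little-endian BOM: drop it and convert the rest as UTF-16LE.
        return mb_convert_encoding((string) substr($text, 2), $to, 'UTF-16LE');
    }
    if ($bom === "\xFE\xFF") {
        // Big-endian BOM: drop it and convert the rest as UTF-16BE.
        return mb_convert_encoding((string) substr($text, 2), $to, 'UTF-16BE');
    }
    // No BOM: fall back to the assumed default byte order.
    return mb_convert_encoding((string) $text, $to, $default);
}

$le = "\xFF\xFE" . mb_convert_encoding('Nexor', 'UTF-16LE', 'ISO-8859-1');
$be = "\xFE\xFF" . mb_convert_encoding('Nexor', 'UTF-16BE', 'ISO-8859-1');
var_dump(convert_utf16_with_bom($le, 'ISO-8859-1'));
var_dump(convert_utf16_with_bom($be, 'ISO-8859-1'));
```

With the two test strings from the reproduce code, both calls should give
string(5) "Nexor", which matches the expected result above.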
--
Edit bug report at http://bugs.php.net/?id=34776&edit=1
--