From: masakielastic at gmail dot com
Operating system: All
PHP version: 5.5.0
Package: mbstring related
Bug Type: Feature/Change Request
Bug description: new function for replacing ill-formed byte sequences with substitute characters
Description:
------------
A new function for replacing ill-formed byte sequences with substitute characters is needed. The problem with using mb_convert_encoding for this purpose is that the function name doesn't convey the intent. Specifying the same encoding twice is verbose, and beginners may read it as a meaningless conversion.

    $str = mb_convert_encoding($str, 'UTF-8', 'UTF-8');

Ruby offers a precedent: Ruby 2.1 introduces String#scrub.

http://bugs.ruby-lang.org/issues/6752
https://github.com/ruby/ruby/blob/1e8a05c1dfee94db9b6b825097e1d192ad32930a/string.c#L7770-L7783

Whether the substitute character should be specifiable needs discussion.

    function mb_scrub($str, $encoding = '', $substitute = '')
    {
        if ('' === $encoding) {
            $encoding = mb_internal_encoding();
        }

        if ('' === $substitute) {
            $ret = mb_convert_encoding($str, $encoding, $encoding);
        } else {
            // Temporarily override the substitute character, then restore it.
            $before_substitute = mb_substitute_character();
            mb_substitute_character($substitute);
            $ret = mb_convert_encoding($str, $encoding, $encoding);
            mb_substitute_character($before_substitute);
        }

        return $ret;
    }

The same discussion applies to UConverter.

    function uconverter_scrub($str, $encoding, $opts = null)
    {
        // $opts is an options array such as array('to_subst' => '?').
        if (null === $opts) {
            return UConverter::transcode($str, $encoding, $encoding);
        } else {
            return UConverter::transcode($str, $encoding, $encoding, $opts);
        }
    }

Standard string functions and filter functions may also need discussion, since htmlspecialchars can be used for the same purpose.

    function str_scrub($str, $encoding = 'UTF-8')
    {
        return htmlspecialchars_decode(htmlspecialchars($str, ENT_SUBSTITUTE, $encoding));
    }

--
Edit bug report at https://bugs.php.net/bug.php?id=65081&edit=1