On 21.08.2008 at 18:08, Rasmus Lerdorf wrote:
David Zülke wrote:
On 21.08.2008 at 03:34, William A. Rowe, Jr. wrote:
Stanislav Malyshev wrote:
Hi!
Are there any objections to incorporating the bugfix for #43941 (fix
for how json handles invalid UTF-8 sequences) into 5.2? I had some
requests about it; right now it's only in 5.3+.
Is there the alternative of substituting an unmappable character
(U+FFFD) in place of the invalid sequence? This is a reasonable
alternative behavior for some less stringent cases.
(Yes, the fix is better than the status quo, but just taking this a
step further).
I agree, that would be quite reasonable and also more consistent with
how UTF-8 works in other apps (browsers etc).
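To make the options discussed above concrete, here is a minimal sketch of
rejecting, stripping, and substituting an invalid byte. It uses Python's
built-in UTF-8 decoder purely as an illustration; it is not PHP's json
code, and the sample input is made up.

```python
# Illustration only: Python's built-in UTF-8 decoder, not PHP's json extension.
raw = b"abc\xff def"  # 0xFF can never appear in well-formed UTF-8

# Option 1: strict decoding rejects the whole input.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Option 2: stripping silently drops the bad byte, hiding the damage.
stripped = raw.decode("utf-8", errors="ignore")

# Option 3: substitution leaves U+FFFD where the bad byte was.
replaced = raw.decode("utf-8", errors="replace")

print(strict_ok)        # False
print(repr(stripped))   # 'abc def'
print(repr(replaced))   # 'abc\ufffd def'
```

With substitution the reader can still see that something was wrong and
where, which stripping does not allow.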
Well, using browsers as the benchmark here is a bad idea. IE is
absolutely braindead about dealing with illegal UTF-8 chars. It
will accept just about any sequence of bytes as a valid UTF-8 char
which causes all sorts of problems.
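The message above does not say which byte sequences IE accepted. As one
general example of why lenient decoding is risky, consider an "overlong"
encoding, where a character is spelled with more bytes than necessary. The
sketch below uses Python's strict decoder for illustration; the helper name
is_well_formed_utf8 is hypothetical.

```python
# Illustration only: Python's strict UTF-8 decoder, not IE's or PHP's.
OVERLONG_SLASH = b"\xc0\xaf"  # ill-formed two-byte spelling of "/" (U+002F)


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True only if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The well-formed spelling of "/" is the single byte 0x2F.
slash_ok = is_well_formed_utf8(b"/")
overlong_ok = is_well_formed_utf8(OVERLONG_SLASH)

# With substitution, the overlong form never turns into a "/".
substituted = OVERLONG_SLASH.decode("utf-8", errors="replace")
```

A decoder that treated the overlong form as "/" would let that character
slip past any filter that only looks for the byte 0x2F.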
I was talking about the common representation of an invalid sequence.
That's the question mark sign you usually see in a browser when the
encoding is incorrect.
According to the Unicode standard, U+FFFD is supposed to be used as
the replacement character instead of simply stripping invalid ones:
Replacement Character. A character used as a substitute for an
uninterpretable character from another encoding. The Unicode
Standard uses U+FFFD replacement character for this function.
says http://unicode.org/glossary/#replacement_character
Rendering software which cannot process a Unicode character
appropriately most often display it as only an open rectangle, or
the Unicode “replacement character” (U+FFFD, �), to indicate
the position of the unrecognized character.
says http://en.wikipedia.org/wiki/Unicode#Standardized_subsets
Also see http://www.fileformat.info/info/unicode/char/fffd/index.htm
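For reference, a small sketch that looks up U+FFFD and its UTF-8 byte
sequence, using Python's standard library only as a convenient way to
inspect the character:

```python
import unicodedata

REPLACEMENT = "\ufffd"

# Official Unicode character name for U+FFFD.
name = unicodedata.name(REPLACEMENT)

# U+FFFD encodes to three bytes in UTF-8.
utf8_bytes = REPLACEMENT.encode("utf-8")

print(name)              # REPLACEMENT CHARACTER
print(utf8_bytes.hex())  # efbfbd
```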
As always, I consider sticking to specs good practice, so doing it in
the above case would be wise :)
Hope that helps,
David
--
PHP Internals - PHP Runtime Development Mailing List