On Fri, 30 Apr 2021 16:11:30 GMT, Ichiroh Takiguchi <itakigu...@openjdk.org> wrote:
>> When an invalid character is converted by getBytes() method, the character
>> is converted to replacement byte data.
>> Shift code (SO/SI) may not be added into right place by EBCDIC Mix charset.
>> EBCDIC Mix charset encoder is stateful encoder.
>> Shift code should be added by switching character set.
>> On x-IBM1364, "\u3000\uD800" should be converted to "\x0E\x40\x40\x0F\x6F",
>> but "\x0E\x40\x40\x6F\x0F"
>> SI is not in right place.
>>
>> Also ISO2022 related charsets use escape sequence to switch character set.
>> But same kind of issue is there.
>
> Ichiroh Takiguchi has updated the pull request incrementally with one
> additional commit since the last revision:
>
>   8266013: Unexpected replacement character handling on stateful
> CharsetEncoder

src/java.base/share/classes/java/nio/charset/Charset-X-Coder.java.template line 632:

> 630: if (action == CodingErrorAction.REPLACE) {
> 631: #if[encoder]
> 632:     if (maxBytesPerChar > 3.0) {

Does this check imply it is for stateful encoder? Since the fix is for
incorrect SO/SI handling, should the fix be localized in those EBCDIC/ISO2022
encoders, not in the generic Charset-X-Coder?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3719
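For anyone who wants to reproduce the SO/SI placement described in the quoted report, here is a minimal sketch. The class name `ShiftCheck` and both helper methods are hypothetical and are not part of the PR. It assumes that double-byte characters in an EBCDIC mixed charset occupy exactly two bytes between SO (0x0E) and SI (0x0F), so an odd byte count inside a shifted run suggests that a single-byte replacement landed before the SI. Which bytes `main` prints depends on the JDK build and on whether x-IBM1364 is installed, so no output is shown.

```java
import java.nio.charset.Charset;

public class ShiftCheck {
    static final byte SO = 0x0E; // shift out: switch to double-byte mode
    static final byte SI = 0x0F; // shift in: switch back to single-byte mode

    /** Formats bytes as space-separated upper-case hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /**
     * Returns true if SO and SI are paired and every SO..SI run holds an
     * even number of bytes. Assumption: SO/SI never occur as data bytes
     * inside a double-byte run.
     */
    static boolean hasWellFormedShifts(byte[] bytes) {
        boolean inDbcs = false;
        int run = 0; // bytes seen since the last SO
        for (byte b : bytes) {
            if (b == SO) {
                if (inDbcs) return false; // nested SO
                inDbcs = true;
                run = 0;
            } else if (b == SI) {
                if (!inDbcs || run % 2 != 0) return false; // stray SI or odd run
                inDbcs = false;
            } else if (inDbcs) {
                run++;
            }
        }
        return !inDbcs; // an unterminated SO is also malformed
    }

    public static void main(String[] args) {
        String name = "x-IBM1364";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        // U+3000 is a double-byte character; the lone surrogate U+D800 is
        // invalid, so getBytes() substitutes the charset's replacement bytes.
        byte[] out = "\u3000\uD800".getBytes(Charset.forName(name));
        System.out.println(hex(out) + " wellFormed=" + hasWellFormedShifts(out));
    }
}
```

Applied to the two byte sequences from the report, the checker accepts `0E 40 40 0F 6F` and rejects `0E 40 40 6F 0F`, because the latter leaves three bytes between SO and SI.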