On Thu, 28 Jul 2022 16:18:51 GMT, Naoto Sato <na...@openjdk.org> wrote:
>> Many thanks @naotoj .
>>
>> I checked the latest IBM-864 mapping table.
>> (I assume current OpenJDK's IBM864 may refer older mapping table)
>> https://raw.githubusercontent.com/unicode-org/icu/main/icu4c/source/data/mappings/ibm-864_X110-1999.ucm
>> .ucm file format is as follows:
>> https://unicode-org.github.io/icu/userguide/conversion/data.html#ucm-file-format
>>
>> I checked roundtrip mapping
>> (Roundtrip entries have `|0` at the end of line)
>> | IBM864.map | ibm-864_X110-1999.ucm |
>> | --- | --- |
>> | 0x1a U+001a | 0x1a U+001c |
>> | 0x1c U+001c | 0x1c U+007f |
>> | **0x25 U+066a** | **0x25 U+0025** |
>> | 0x7f U+007f | 0x7f U+001a |
>> | 0x9f U+fffd | 0x9f U+200b |
>> | 0xd7 U+fec1 | 0xd7 U+fec3 |
>> | 0xd8 U+fec5 | 0xd8 U+fec7 |
>> | 0xf1 U+0651 | 0xf1 U+fe7c |
>>
>> **Note**: 0x1a <-> U+001c / 0x1c <-> U+007f / 0x7f <-> U+001a entries are
>> control character rotation for DOS.
>> I think it should be ignored.
>>
>> I think, roundtrip side should be changed.
>> 0x25 entry should be U+0025 on IBM864.map
>> Add `0x25 U+066a` into IBM864.c2b
>>
>> Modify test/jdk/sun/nio/cs/mapping/Cp864.b2c for `0025 0025`
>> Add `0025 066a` into test/jdk/sun/nio/cs/mapping/Cp864.c2b-irreversible
>>
>> This issue just for U+0025, but if possible, please add `0x9f, 0xd7, 0xd8,
>> 0xf1` entries.
>
> Thanks for trying it out @takiguc. However, I am not planning to change any
> existing mappings because of the obvious compatibility issues. The fix I
> proposed is safe because it is additional, which used to be unmappable (thus
> turned into a replacement '?').

Hello @naotoj .
I checked [JDK-8290488](https://bugs.openjdk.org/browse/JDK-8290488).
The issue was reported from testing on Windows 10.
I think we need to confirm the expected result for the b2c side with the reporter.

I checked Microsoft's code page 864 with the following test program on my Windows 10 machine.

    >type b2c_1.ps1
    param($code, $hex)
    $h = [string]$hex
    $enc_r = [Text.Encoding]::GetEncoding([int]$code)
    [byte[]]$ba = @()
    for($i = 0; $i -lt $h.length; $i+=2) {
      $ba += ([System.Convert]::ToInt32($h.SubString($i,2), 16))
    }
    $s = ""
    $enc_r.GetChars($ba) | foreach {$s += [System.Convert]::ToInt32($_).ToString("X4")}
    $s

    >powershell -NoProfile -ExecutionPolicy Unrestricted .\b2c_1.ps1 864 25
    0025

Please ignore my request about 0xD7, 0xD8 and 0xF1 if the target platform is Windows.

Note: test results for the c2b side.

    >type c2b_1.ps1
    param($code, $hex)
    $enc_r = [Text.Encoding]::GetEncoding([int]$code)
    [char[]]$ca = @()
    $ca += ([System.Convert]::ToInt32([string]$hex, 16))
    $s = ""
    $enc_r.GetBytes($ca) | foreach {$s += [System.Convert]::ToInt32($_).ToString("X2")}
    $s

    >powershell -NoProfile -ExecutionPolicy Unrestricted .\c2b_1.ps1 864 0025
    25

    >powershell -NoProfile -ExecutionPolicy Unrestricted .\c2b_1.ps1 864 066A
    25

-------------

PR: https://git.openjdk.org/jdk/pull/9661
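
For comparison with the Windows results above, the same b2c/c2b probes could be run against the JDK's own charset. The following is only a rough sketch: the class name `Ibm864Check` and its two helpers are hypothetical, it assumes the `IBM864` charset is available in the runtime (it lives in the `jdk.charsets` module), and what it prints depends on the JDK build being tested, so no expected output is given here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, not part of the JDK or of this PR.
public class Ibm864Check {

    // b2c: decode one byte and return the resulting char as 4 hex digits,
    // or "unmappable" if the decoder rejects the byte.
    static String b2c(Charset cs, int b) {
        try {
            CharBuffer cb = cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(new byte[] {(byte) b}));
            return String.format("%04X", (int) cb.get(0));
        } catch (CharacterCodingException e) {
            return "unmappable";
        }
    }

    // c2b: encode one char and return the resulting bytes as hex,
    // or "unmappable" instead of silently substituting '?'.
    static String c2b(Charset cs, char c) {
        try {
            ByteBuffer bb = cs.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(new char[] {c}));
            StringBuilder sb = new StringBuilder();
            while (bb.hasRemaining()) {
                sb.append(String.format("%02X", bb.get() & 0xFF));
            }
            return sb.toString();
        } catch (CharacterCodingException e) {
            return "unmappable";
        }
    }

    public static void main(String[] args) {
        // Charset name defaults to IBM864; throws if the runtime lacks it.
        Charset cs = Charset.forName(args.length > 0 ? args[0] : "IBM864");
        System.out.println("b2c 0x25   -> " + b2c(cs, 0x25));
        System.out.println("c2b U+0025 -> " + c2b(cs, '\u0025'));
        System.out.println("c2b U+066A -> " + c2b(cs, '\u066A'));
    }
}
```

Using `CodingErrorAction.REPORT` rather than the default replacement behaviour makes an unmappable character visible as such, instead of showing up as the byte for '?'.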