> However, running the script with that doesn't produce exactly what we
> have in utf8_to_sjis.map, either. It's otherwise same, but we have
> some extra mappings:
>
> - {0xc2a5, 0x5c},

0xc2a5 is U+00a5. The glyph is "YEN SIGN", which corresponds to 0x5c in
SJIS, so this is a valid mapping. Meanwhile, Microsoft wants to map
U+005c to 0x5c in CP932. The glyph of U+005c is "REVERSE SOLIDUS"
(backslash), so MS decided that the glyph of U+005c is "YEN SIGN" in
CP932! In summary, we need to keep both mappings: U+00a5 (UTF-8 0xc2a5)
-> 0x5c and U+005c -> 0x5c. Obviously this breaks round-trip conversion
between UTF8 and SJIS for this character, though.

> - {0xc2ac, 0x81ca},

U+00ac (NOT SIGN). Exists in SJIS.

> - {0xe28096, 0x8161},

U+2016 (DOUBLE VERTICAL LINE). Exists in SJIS.

> - {0xe280be, 0x7e},

U+203e (OVERLINE). Mapped to ASCII 0x7e, which is "half width tilde".

> - {0xe28892, 0x817c},

U+2212 (MINUS SIGN). Mapped to "double width minus sign" in SJIS.

> - {0xe3809c, 0x8160},

U+301c (WAVE DASH). Mapped to "double width wave dash" in SJIS.

> Those mappings were added in commit
> a8bd7e1c6e026678019b2f25cffc0a94ce62b24b, back in 2002. The bogus
> mapping for the invalid 0xc19c UTF-8 byte sequence was also added by
> that commit, as well a few valid mappings that UCS_to_SJIS.pl also
> produces.
>
> I can't judge if those mappings make sense. If we can't find an
> authoritative source for them, I suggest that we leave them as they
> are, but also hard-code them to UCS_to_SJIS.pl, so that running that
> script produces those mappings in utf8_to_sjis.map, even though they
> are not present in the CP932.TXT source file.

Sounds acceptable.

In summary, the current PostgreSQL UTF8 <--> SJIS mapping is something
of a mixture of SJIS (Shift_JIS) and MS932. There's no cleaner way out
of this situation. I think we need to live with it.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
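
P.S. For anyone who wants to cross-check the code points above, here is a
rough sketch in Python. It is not part of the PostgreSQL tree, and the
helper names (utf8_key_to_codepoint, describe, ambiguous_targets) are made
up for illustration. It assumes a map key is simply the UTF-8 byte
sequence packed big-endian into an integer, as in the entries quoted
above. It decodes each key to its code point and also shows why two
sources mapping to 0x5c cannot round-trip.

```python
import unicodedata

# The extra utf8_to_sjis.map entries discussed above:
# (UTF-8 bytes packed into an integer, SJIS code).
EXTRA_MAPPINGS = [
    (0xC2A5, 0x5C),
    (0xC2AC, 0x81CA),
    (0xE28096, 0x8161),
    (0xE280BE, 0x7E),
    (0xE28892, 0x817C),
    (0xE3809C, 0x8160),
]


def utf8_key_to_codepoint(key: int) -> int:
    """Decode a map key (UTF-8 bytes packed big-endian) to a code point.

    Raises UnicodeDecodeError for invalid sequences such as 0xc19c.
    """
    raw = key.to_bytes((key.bit_length() + 7) // 8, "big")
    text = raw.decode("utf-8")
    if len(text) != 1:
        raise ValueError(f"key {key:#x} is not exactly one character")
    return ord(text)


def describe(key: int) -> str:
    """Return e.g. 'U+00A5 YEN SIGN' for a map key."""
    cp = utf8_key_to_codepoint(key)
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def ambiguous_targets(forward: dict) -> dict:
    """Find SJIS codes that more than one code point maps to.

    Such codes cannot round-trip: the reverse map can pick only one source.
    """
    sources = {}
    for cp, sjis in forward.items():
        sources.setdefault(sjis, []).append(cp)
    return {sjis: sorted(cps) for sjis, cps in sources.items() if len(cps) > 1}


if __name__ == "__main__":
    for key, sjis in EXTRA_MAPPINGS:
        print(f"{key:#x} -> {sjis:#x}: {describe(key)}")
    # U+005C and U+00A5 both map to 0x5c, as described in the mail.
    print(ambiguous_targets({0x005C: 0x5C, 0x00A5: 0x5C}))
```

Running describe() over the list is a cheap way to catch a mistyped code
point, and ambiguous_targets() flags exactly the U+005c / U+00a5 case.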