On Mon, Dec 20, 2010 at 10:15:56PM +0100, Nicolas Barbier wrote:
> From
> <URL:http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>:
>
> "Unicode is supposed to solve all encoding problems in all languages
> of the world. [..] There are still controversies. For Japanese, the
> kanji characters have been unified with Chinese, that is a character
> considered to be the same in both Japanese and Chinese have been given
> one and the same code number in Unicode, even if they look a little
> different. This process, called Han unification, has caused
> controversy."

From http://en.wikipedia.org/wiki/CJK_Unified_Ideographs:

"However, the source separation rule states that characters encoded
separately in an earlier character set would remain separate in the
new Unicode encoding."

From all the references I've seen, this rule has been applied
everywhere, and any failure to roundtrip a conversion is considered a
bug. I can't believe that at this point they haven't all been fixed.
This is kind of underscored by the fact that references always point
to theoretical problems rather than to actual lists of characters that
can't be converted.

ISTM that since all the mapping tables are public, it should be a SMOP
to *prove* that roundtrip conversions are safe, or to identify the
problems.

Have a nice day,
-- 
Martijn van Oosterhout <klep...@svana.org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle
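
For illustration, the roundtrip check suggested above could be sketched
roughly as follows. This is only a minimal sketch under stated
assumptions: it uses the codecs bundled with Python as a stand-in for
whatever mapping tables are actually of interest (they may differ from
those), and the names roundtrip_failures and two_byte_candidates are
hypothetical helpers invented for this example. It decodes each
candidate byte sequence to Unicode, encodes it back, and reports every
sequence that does not come back unchanged.

```python
from itertools import product


def roundtrip_failures(encoding, candidates):
    """Return (raw, text, back) for each candidate that decodes in
    `encoding` but does not re-encode to the same bytes.
    `back` is None when the decoded text cannot be encoded at all."""
    failures = []
    for raw in candidates:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # not a valid sequence in this encoding; nothing to check
        try:
            back = text.encode(encoding)
        except UnicodeEncodeError:
            back = None
        if back != raw:
            failures.append((raw, text, back))
    return failures


def two_byte_candidates():
    """Brute-force candidate set: every pair with a lead byte in 0x80-0xFF
    and a trail byte in 0x40-0xFF. This is an assumption made for brevity;
    it does not cover three-byte sequences such as JIS X 0212 in EUC-JP."""
    return (bytes(pair) for pair in product(range(0x80, 0x100), range(0x40, 0x100)))


if __name__ == "__main__":
    # Codec names are Python's, chosen only as examples.
    for name in ("euc_jp", "shift_jis"):
        bad = roundtrip_failures(name, two_byte_candidates())
        print(name, len(bad), "sequences fail to roundtrip")
```

An empty result only shows that the tables tested are safe in the
legacy -> Unicode -> legacy direction for the candidates tried; the
reverse direction would need a separate pass over the Unicode side of
the table.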