Re: [HACKERS] Extensions, patch v20 (bitrot fixes)

Martijn van Oosterhout Mon, 20 Dec 2010 15:05:33 -0800

On Mon, Dec 20, 2010 at 10:15:56PM +0100, Nicolas Barbier wrote:
> From
> <URL:http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>:
> 
> "Unicode is supposed to solve all encoding problems in all languages
> of the world. [..] There are still controversies. For Japanese, the
> kanji characters have been unified with Chinese, that is a character
> considered to be the same in both Japanese and Chinese have been given
> one and the same code number in Unicode, even if they look a little
> different. This process, called Han unification, has caused
> controversy."


From http://en.wikipedia.org/wiki/CJK_Unified_Ideographs:

"However, the source separation rule states that characters encoded
separately in an earlier character set would remain separate in the new
Unicode encoding."

From all the references I've seen, this rule has been applied everywhere.
Any failure to roundtrip a conversion is considered a bug, and I can't
believe that at this point they haven't all been fixed. This is kind of
underscored by the fact that references always point to theoretical
problems rather than actual lists of characters that can't be
converted.

ISTM that since all the mapping tables are public it should be a SMOP
to *prove* roundtrip conversions are safe, or identify the problems.
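
As a rough illustration of what such a check could look like, here is a
minimal Python sketch. It is not based on PostgreSQL's own conversion
code. The function names roundtrip_failures and tables_from_codec are
made up for this example, and the second one builds its tables by brute
force from Python's bundled codecs, which is only an assumed stand-in for
loading the published mapping files.

```python
from itertools import product


def roundtrip_failures(to_unicode, from_unicode):
    """Return (legacy_bytes, text, bytes_after_roundtrip) for every entry
    that does not survive legacy -> Unicode -> legacy unchanged."""
    failures = []
    for raw, text in sorted(to_unicode.items()):
        back = from_unicode.get(text)  # None if there is no reverse mapping
        if back != raw:
            failures.append((raw, text, back))
    return failures


def tables_from_codec(encoding, max_len=2):
    """Build both tables from a Python codec (an assumption made for this
    sketch) by trying every byte string of up to max_len bytes."""
    to_unicode, from_unicode = {}, {}
    for length in range(1, max_len + 1):
        for raw in map(bytes, product(range(256), repeat=length)):
            try:
                text = raw.decode(encoding)
            except UnicodeDecodeError:
                continue  # not a valid sequence in this encoding
            if len(text) != 1:
                continue  # decodes to more than one code point, skip it
            to_unicode[raw] = text
            try:
                from_unicode[text] = text.encode(encoding)
            except UnicodeEncodeError:
                pass  # decodable but not encodable, reported as a failure
    return to_unicode, from_unicode


if __name__ == "__main__":
    # Hypothetical toy table: two legacy codes mapped to the same character.
    toy_to = {b"\x01": "A", b"\x02": "A"}
    toy_from = {"A": b"\x01"}
    print(roundtrip_failures(toy_to, toy_from))
```

Running roundtrip_failures(*tables_from_codec("euc_jp")) would then list
whatever two-byte sequences that particular codec fails to roundtrip.
Three-byte sequences need max_len=3, which is much slower, and the result
says something about Python's tables only, not about any other
implementation.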

Have a nice day,
-- 
Martijn van Oosterhout   <[email protected]>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first. 
>                                       - Charles de Gaulle
