On Apr 21, 2004, at 4:52 PM, kj wrote:

- there are (of course) some character sets that don't work well with Unicode -- for example, Big5HKSCS doesn't encode in UCS2 (though I didn't find out why)

UCS-2 is limited--it can only address the BMP (that is, only 2^16 characters). It has been superseded by the UTF-* encodings. (UTF-16 can be thought of as UCS-2 plus surrogate pairs.)
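To make the surrogate-pair point concrete, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and U+20000 (the first code point of CJK Unified Ideographs Extension B) is used only as a convenient example of a character outside the BMP.

```python
def fits_in_ucs2(cp):
    """True if the code point lies in the BMP, i.e. fits in one 16-bit unit."""
    return 0 <= cp <= 0xFFFF


def to_surrogate_pair(cp):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = cp - 0x10000            # 20 bits remain after removing the BMP
    high = 0xD800 + (offset >> 10)   # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits -> low (trail) surrogate
    return high, low


def utf16_code_units(ch):
    """Return the 16-bit code units Python's own UTF-16 encoder produces."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    # A BMP character needs one unit; a plane-2 character needs two.
    for ch in ("\u4e2d", "\U00020000"):
        units = utf16_code_units(ch)
        print("U+%04X" % ord(ch), fits_in_ucs2(ord(ch)), [hex(u) for u in units])
```

A character for which fits_in_ucs2 is false has no UCS-2 representation at all, while UTF-16 encodes it as two 16-bit units.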


It's my understanding that all of the characters in HKSCS-2001 are available in Unicode 4.0 (with 35 rarely used characters being mapped into the private use area).

Necessarily, Unicode lags behind revisions to national standards--it takes time to incorporate the changes--but so does everything else (inclusion of new characters in fonts, etc.).

JEff
