At 6:03 PM -0600 4/21/04, kj wrote:
Hello folks,

This will be of interest to only a few people, but it will be good to have it in the archives for when we need it.

Here is a list of Korean character sets that represent hangul (the Korean alphabet) and hanja (Sino-Korean characters):

- EUC-KR (KS C 5601, since renamed KS X 1001), or Microsoft's superset of it, UHC
- ISO-2022 comes in both -JP and -KR versions.
- johab is a legacy 16-bit encoding: the leading bit is 1, followed by 3 * 5 bits for the leading consonant, the vowel, and the optional consonant(s) at the end
http://trade.chonbuk.ac.kr/~leesl/code/johap.gif
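
For the archives, here is a rough sketch of the johab bit layout described above, in Python. The function names split_johab and join_johab are made up for illustration; the code only pulls the three 5-bit fields apart and puts them back together. It assumes the fields are packed in the order listed (leading consonant in the high bits) and does not check whether a field value is a valid index or map it to an actual letter; for that you still need a table like the one at the URL above.

```python
def split_johab(code: int) -> tuple[int, int, int]:
    """Split a 16-bit johab code into its three 5-bit fields.

    Returns (initial, medial, final): the raw index values for the
    leading consonant, the vowel, and the optional trailing consonant(s).
    Assumption: the fields are packed high-to-low in that order.
    """
    if not 0 <= code <= 0xFFFF:
        raise ValueError("johab codes are 16 bits wide")
    if not code & 0x8000:
        raise ValueError("leading bit is 0, so this layout does not apply")
    initial = (code >> 10) & 0x1F  # bits 14..10: leading consonant
    medial = (code >> 5) & 0x1F    # bits 9..5: vowel
    final = code & 0x1F            # bits 4..0: trailing consonant(s)
    return initial, medial, final


def join_johab(initial: int, medial: int, final: int) -> int:
    """Pack three 5-bit field values back into a 16-bit code."""
    for field in (initial, medial, final):
        if not 0 <= field <= 0x1F:
            raise ValueError("each field must fit in 5 bits")
    return 0x8000 | (initial << 10) | (medial << 5) | final


if __name__ == "__main__":
    fields = split_johab(0x8861)
    print(fields)                    # the three raw 5-bit indexes
    print(hex(join_johab(*fields)))  # round-trips to the original code
```

Anything with the leading bit clear is rejected rather than guessed at, since the layout above only covers codes whose leading bit is 1.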

Ah, cool. Looks like that stuff's in the O'Reilly CJKV book (which I desperately want a second edition of), but that book's a bit slanted towards Chinese and Japanese.


The URL above goes to a useful table for working with johab. I do know it is a legacy charset, but I don't know how much it is still used. Technically, ASCII is legacy, too. :)

Ah, at this point Unicode's legacy too. Besides, as long as RAD-50 lives, nobody's got much standing to call a character set "Legacy" :)


Do we have any local experts on Japanese charsets? If not, I can do a little bit of research there, too.

There, at least, I can get access to folks who've done work, and I can get by enough myself that I'm not too worried.
--
Dan


--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
