Hello folks,
This will be of interest to only a few people, but it will be good to have it in the archives for when we need it.
Here is a list of Korean character sets and encodings that represent hangul (the Korean alphabet) and hanja (Sino-Korean characters):
- EUC-KR (an encoding of KS C 5601, since renamed to KS X 1001) or Microsoft's superset UHC (code page 949)
- ISO-2022-KR (ISO-2022 comes in both -JP and -KR versions)
- johab is a legacy 16-bit encoding: a leading bit set to 1, followed by 3 * 5 bits for the leading consonant, the vowel, and the optional consonant(s) at the end
http://trade.chonbuk.ac.kr/~leesl/code/johap.gif
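For the archives, here is a rough Python sketch of the johab bit layout described above: one marker bit, then three 5-bit fields. The helper names split_johab and join_johab are made up for illustration. The fields are left as raw 5-bit indices; turning them into actual jamo needs the lookup table at the URL above, which is not reproduced here. The demo loop assumes Python's standard "johab" codec gives two bytes per hangul syllable.

```python
def split_johab(code: int) -> tuple:
    """Split a 16-bit johab code into (marker, initial, vowel, final).

    marker  : top bit, expected to be 1 for a hangul syllable
    initial : 5-bit index of the leading consonant
    vowel   : 5-bit index of the vowel
    final   : 5-bit index of the optional trailing consonant(s)
    """
    marker = (code >> 15) & 0x1
    initial = (code >> 10) & 0x1F
    vowel = (code >> 5) & 0x1F
    final = code & 0x1F
    return marker, initial, vowel, final


def join_johab(initial: int, vowel: int, final: int) -> int:
    """Reassemble a 16-bit johab code from its three 5-bit indices."""
    return 0x8000 | (initial << 10) | (vowel << 5) | final


if __name__ == "__main__":
    # Encode each syllable with the stdlib codec and show its bit fields.
    for ch in "한글":
        code = int.from_bytes(ch.encode("johab"), "big")
        print(ch, hex(code), split_johab(code))
```

As a quick check of the arithmetic, the code 0x8861 splits into (1, 2, 3, 1), and join_johab(2, 3, 1) gives 0x8861 back.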
Ah, cool. Looks like that stuff's in the O'Reilly CJKV book (which I desperately want a second edition of), but that book's a bit slanted towards Chinese and Japanese.
The URL above goes to a useful table for working with johab. I do know it is a legacy charset, but I don't know how much it is still used. Technically, ASCII is legacy, too. :)
Ah, at this point Unicode's legacy too. Besides, as long as RAD-50 lives, nobody's got much standing to call a character set "Legacy" :)
Do we have any local experts on Japanese charsets? If not, I can do a little bit of research there, too.
There, at least, I can get access to folks who've done work, and I can get by enough myself that I'm not too worried.
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk