On Mon, Nov 20, 2000 at 07:25:11PM +0900, Tomohiro KUBOTA wrote:
>
> BTW, I think GB18030 would be a _character set_, not _encoding_.
> If so, we won't have zh_CN.GB18030 locale.

In fact it is both, AFAICT; GB18030 defines the set of characters, and
the way to encode them. Just like GBK.

> Examples (Japanese):
>   JIS X 0201, JIS X 0208, JIS X 0212, JIS X 0213 are _character set_.
>   EUC-JP, Shift-JIS, ISO-2022-JP are _encoding_.
> For simplified Chinese:
>   GB 2312, GB 7589, GB 7590, GB 8565, GB 12052, GBK, are _character set_.
>   CN-GB (aka EUC-CN), GBK, ISO-2022-CN, are _encoding_.
> For traditional Chinese:
>   BIG5, CNS 11643, are _character set_.
>   ISO-2022-CN, ISO-2022-CN-EXT, EUC-TW, BIG5, are _encoding_.
>
> Codes which are not ISO2022-compliant tend not to separate
> _character set_ and _encoding_.

You might want to add HKSCS to that list :p it defines both the set of
characters to be used in Hong Kong, and the way to encode them in both
Big5 and ISO-10646. (Though as others have pointed out, currently a
whole bunch of characters in HKSCS are mapped to both the PUA and
plane 2 of ISO-10646 version 2, including some of the expletives widely
used in Hong Kong ... :p )

[ regarding PRC Govt's ban of non-GB18030 compliant software ]
> How severe! Can a government have such a right?

Yup, and the deadline's only a bit more than a month away ...

-- 
  Roger So                                  telnet://e-fever.org
  spacehunt at e-fever dot org              SysOp, e-Fever BBS
  GnuPG 1024D/98FAA0AD  F2C3 4136 8FB1 7502 0C0C  01B1 0E59 37AC 98FA A0AD
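
For anyone who wants to see the character set vs. encoding distinction
in bytes, here is a rough Python sketch. It is only an illustration:
the two sample characters and the helper name encodings_of are my own
choices, and it assumes Python's bundled codecs (where "gb2312" means
the EUC-CN encoding of GB 2312). One JIS X 0208 character comes out as
three different byte sequences, while a GB 2312 character gets the same
bytes under EUC-CN, GBK and GB18030.

```python
def encodings_of(ch, codecs):
    """Return {codec name: hex of the encoded bytes} for one character.

    Hypothetical helper for illustration only.
    """
    return {name: ch.encode(name).hex() for name in codecs}


# U+3042 HIRAGANA LETTER A is a member of the JIS X 0208 character set;
# the three codecs below are three encodings of that same set.
jp = encodings_of("\u3042", ["euc_jp", "shift_jis", "iso2022_jp"])

# U+4E2D is a member of GB 2312, so the EUC-CN, GBK and GB18030
# encodings all assign it the same two bytes.
cn = encodings_of("\u4e2d", ["gb2312", "gbk", "gb18030"])

for name, hexbytes in {**jp, **cn}.items():
    print(f"{name:12} {hexbytes}")
```

The ISO-2022-JP result carries escape sequences around the two JIS
X 0208 bytes, which is the ISO 2022 way of naming the character set
separately from the encoding. The GBK and GB18030 codecs have no such
switch, so set and encoding are effectively one thing there.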