At Wed, 6 Jun 2001 17:02:16 +0200,
Radovan Garabik wrote:
> > > > utf8 in the current state does not cover everything we had in other
> > > > encodings.
> > >
> > > utf8 is just a _multibyte_ encoding, not _character_ encoding,
> > > it can represent whatever character encoding is used in UCS-4
> >
> > UCS4 is not a satisfactory encoding for our needs, unfortunately.
> > JIS is not complete either, but UCS4 is less.
>
> but: JIS is japanese only, UCS-4 is global
> UCS-4 can (and will) be easily expanded, there are no technical
> problems in adding characters to this encoding
>
> can JIS be easily extended to support missing characters?
> I do not think so...
First of all, JIS stands for Japanese Industrial Standards; it is not
only about character sets or encodings.  JIS covers many industrial
standards, such as screw sizes.

Anyway, in this context I assume the JIS you mean is JIS X0208.  That
is just a character set, not an encoding.  We usually use JIS X0208
together with ASCII in an ISO 2022 encoding.

When ASCII is designated to G0 and JIS X0208 to G1, with G0 invoked
into GL and G1 into GR, we call it EUC-JP (precisely, supplementary
character sets are used for G2/G3).  In Japanese Linux environments we
usually use EUC-JP, because it is the simplest encoding for Japanese
for now.

When ASCII is initially designated to G0 and G0 is invoked into GL, and
we switch from ASCII to JIS X0208 with ESC $ B and switch back with
ESC ( B, we call it the JIS 7-bit encoding, or commonly ISO-2022-JP.
We use this encoding for Japanese Internet messages, because it uses
only 7 bits and so can be passed safely over routes that are not
8-bit clean.

ISO-2022-JP is only a simple version of ISO 2022, so it can easily be
expanded to use other character sets.  X Compound Text is an example of
a fuller use of ISO 2022.

> UCS-4 can, given some effort.

Given some effort, ISO-2022 can too.

Regards,
Fumitoshi UKAI
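
As a rough illustration of the two encodings described above, the
sketch below encodes the same short string as EUC-JP and as
ISO-2022-JP.  The use of Python, its codec names "euc_jp" and
"iso2022_jp", and the sample string are assumptions made for this
example; none of them come from the message itself.

```python
# Illustration only: Python's standard codecs stand in for the encodings
# described in the message; the sample string is an arbitrary choice.
ESC = b"\x1b"

text = "A\u3042"  # ASCII 'A' followed by HIRAGANA LETTER A (in JIS X0208)

# EUC-JP: ASCII sits in GL (high bit clear), JIS X0208 in GR (high bit set).
euc = text.encode("euc_jp")

# ISO-2022-JP: 7-bit only; ESC $ B switches to JIS X0208, ESC ( B back to ASCII.
jis = text.encode("iso2022_jp")

# Clearing the high bit of the two EUC-JP bytes gives the 7-bit JIS X0208 code.
kanji_7bit = bytes(b & 0x7F for b in euc[1:])

print(euc.hex(" "))                  # 41 a4 a2
print(jis)                           # b'A\x1b$B$"\x1b(B'
print(all(b < 0x80 for b in jis))    # True: safe on 7-bit routes
print(jis == b"A" + ESC + b"$B" + kanji_7bit + ESC + b"(B")  # True
```

The same two-byte JIS X0208 code appears in both outputs; only the way
it is announced differs (high bit set in EUC-JP, escape sequences in
ISO-2022-JP).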