> However, thank you for explaining glyph. I also understand you
> understand problems on Japanese character codes well.

Well, I'm the author of the CJK package for LaTeX, I've written a
ttf2pk converter, and I'm a member of the FreeType core team :-)

> Note that CJK ideographs also has distinction between character and
> glyph. The most famous example is two variants of a 'tall or high'
> character. Japanese people regard these two as the same in daily
> use but Japanese people regard these two as different if they are
> used in person's names or so on.

I know these problems too well -- AFAIK, in JIS X 0208 these two
variants are unified. Do you know details about the new JIS X 0213
standard?

> I don't know how Chinese and Korean people treat them. It may be
> different. However, IMHO, we should neglect this problem now since
> there are so far no standard to treat these variants properly.
> Though it is important, it is not in our scope.

If you are working on a terminal you need a character set which
distinguishes the two forms.

> > A `glyph code' is just an arbitrary registration number for a glyph
> > specified in the font definition file.
>
> Then the 'font definition file' will be irrationally large. I think
> at least CJK ideographics and Korean precompiled Hanguls have to be
> treated in different way. (Ukai has already pointed this problem.
> jgroff uses 'wchar<EUCcode>' for glyph names of Japanese
> characters.)

Right. I think I've answered this problem in my last mail (regarding
a `glyphclass' directive in font description files).

> A problem. When compiled within internationalized OS, the names for
> encodings (for iconv(3) and so on) is implementation-dependent (You
> know, there are many implementation-dependent items in standard
> C/C++ language). A solution is: we can have a hard-coded
> translation table between implementation-dependent encoding names
> and macro names for -m. The table must be changed by OS (by
> './configure' script or so). A minimal table will be translate
> every implementation-dependent encoding names into 'ascii' macro,
> since almost encodings in the world are superset of ASCII. A full
> table for a OS will cover the list generated by 'iconv --list'.

I don't think so. For example, we could restrict to MIME character
set tags which are standardized.

> Since the '-m' option is generated by groff and passed to troff,
> groff has to have '#ifdef I18N' code. (or, the code can be
> integrated to the preprocessor if we design the preprocessor to
> invoke troff.)

Indeed, the default behaviour should be that the preprocessor adds a

  .mso tmac.<charset>

line or something similar to the document, but there must be a
possibility to override it manually.


    Werner
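
PS: To make the two ideas above concrete, here is a rough sketch in
Python (not groff code) of how the preprocessor side might look. It
assumes a small alias table that folds implementation-dependent
encoding names onto standardized MIME charset tags, falls back to
US-ASCII for unknown names as suggested in the quoted text, and lets
the user override the result by hand. The table entries, the function
names `mime_charset' and `mso_line', and the lowercase file naming
after `tmac.' are all assumptions made for illustration only.

```python
# Hypothetical alias table: implementation-dependent encoding names
# (as an OS or iconv might report them) -> MIME charset tag.
# The entries are illustrative, not a complete or authoritative list.
ALIASES = {
    "eucjp": "EUC-JP",
    "euc-jp": "EUC-JP",
    "ujis": "EUC-JP",
    "sjis": "Shift_JIS",
    "shift_jis": "Shift_JIS",
    "iso8859-1": "ISO-8859-1",
    "iso_8859-1": "ISO-8859-1",
    "iso-8859-1": "ISO-8859-1",
    "ascii": "US-ASCII",
    "us-ascii": "US-ASCII",
    "ansi_x3.4-1968": "US-ASCII",
}

# Fallback for names missing from the table (the "minimal table" idea).
DEFAULT_CHARSET = "US-ASCII"


def mime_charset(os_name):
    """Map an implementation-dependent encoding name to a MIME tag."""
    return ALIASES.get(os_name.strip().lower(), DEFAULT_CHARSET)


def mso_line(os_name, override=None):
    """Build the request the preprocessor would add to the document.

    `override` is a charset tag given manually by the user; when it is
    present, the automatically detected name is ignored.
    """
    charset = override if override is not None else mime_charset(os_name)
    # Assumption: macro files are named after the lowercased MIME tag.
    return ".mso tmac." + charset.lower()


if __name__ == "__main__":
    print(mso_line("eucJP"))
    print(mso_line("some-unknown-name"))
    print(mso_line("ujis", override="ISO-8859-1"))
```

With such a table only the left-hand column varies between systems;
the macro names themselves stay tied to the standardized tags.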