> As regards line breaking algorithm, I think we need some more cflags,
> at least for Japanese.  That is,
>
> - lines must not be broken before the character
> - lines must not be broken after the character
>
> These seems to be implemented as PRE_KINSOKU and POST_KINSOKU in
> jgroff, but it's done by hardcoded.  I think this should be done by
> tmac.<lang>, so I think it's good idea to have some mechanisms to
> load language specific tmac files.

Which mechanism do you suggest?

> BTW, what do you think about code name for multibyte character/wide
> character or glyph code what you said?  In jgroff, it seems it used
> wchar<EUCcode>.

I suggest that we follow the Adobe Glyph List (AGL):

  http://partners.adobe.com/asn/developer/typeforum/unicodegn.html

This means that CJK glyph names would be uni<Unicode>, e.g. `uni635F'.

> jgroff provides "fixedkanji" directive in font description.  But,
> the code of font description loader depends on EUC<->KuTen mapping,
> and it's not good idea for i18n.  I think it would be better to
> provide "wcharset" directive which support code range.  However,
> code range couldn't be used with EUC encoding or something like
> that, and not used for Unicode, because we couldn't expect character
> codes for some language are in succession.

It doesn't matter.  A range directive like `wcharset' (an ugly name,
BTW) just tells troff that all glyphs in a given range have identical
metrics.  A better name is probably `glyphclass':

  glyphclass <sample character> <range begin> <range end>

Example:

  glyphclass uni4F34 uni4E00 uni9FFF

The sample glyph needs a real metrics entry.

> Anyway, jgroff provides new font "M" and "G", which are "Mincho" and
> "Gothic" respectively, for wide characters.  What is the right way
> to add i18n support in groff about font description?

Basically, there is nothing to do.  The only addition needed is a way
to make the font description files smaller, and this is just the
proposed `wcharset' (or `glyphclass') command.

> > > . Command lines shall be able to override input encoding
> > >   (--input-encoding).
> > Yes.
>
> How about creating new request (.encoding "encoding-name") in roff
> source?

Good idea.

> Could you explain (or point us which source code in groff) how to map
> glyph IDs to output code, please?

As explained in a previous mail, a hard-coded mapping table from
glyph names to Unicode output encoding is needed for tty devices.
For all other devices, nothing will change.


    Werner
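
P.S.  To make the `glyphclass' idea more concrete, here is a rough
sketch in Python (not groff's C++, and not existing groff code).  It
shows one way the proposed line could be read and how a glyph inside
the range could borrow the metrics of the sample glyph.  All function
names, the `metrics' dictionary, and the width value are made up for
illustration; only the `uni<Unicode>' naming and the three-argument
`glyphclass' form come from the text above.  For simplicity it handles
only the four-hex-digit `uniXXXX' form.

```python
import re


def glyph_to_codepoint(name):
    """Return the code point of a `uniXXXX' glyph name, or None.

    Simplification: only the four-digit, uppercase-hex form is handled.
    """
    m = re.fullmatch(r"uni([0-9A-F]{4})", name)
    return int(m.group(1), 16) if m else None


def parse_glyphclass(line):
    """Parse `glyphclass <sample> <begin> <end>' (proposed syntax).

    Returns (sample glyph name, first code point, last code point).
    """
    fields = line.split()
    if len(fields) != 4 or fields[0] != "glyphclass":
        raise ValueError("not a glyphclass line: %r" % line)
    sample, begin, end = fields[1:]
    begin_cp = glyph_to_codepoint(begin)
    end_cp = glyph_to_codepoint(end)
    if glyph_to_codepoint(sample) is None or begin_cp is None or end_cp is None:
        raise ValueError("expected uniXXXX glyph names: %r" % line)
    if begin_cp > end_cp:
        raise ValueError("empty range: %r" % line)
    return sample, begin_cp, end_cp


def lookup_metrics(name, metrics, classes):
    """Find metrics for a glyph name.

    An explicit entry wins; otherwise the glyph shares the metrics of
    the sample glyph of the first class whose range contains it.
    """
    if name in metrics:
        return metrics[name]
    cp = glyph_to_codepoint(name)
    if cp is not None:
        for sample, begin_cp, end_cp in classes:
            if begin_cp <= cp <= end_cp:
                # The sample glyph must have a real metrics entry.
                return metrics[sample]
    return None


def tty_output(name):
    """Map a `uniXXXX' glyph name to the character a tty would print."""
    cp = glyph_to_codepoint(name)
    return chr(cp) if cp is not None else None


if __name__ == "__main__":
    # Hypothetical font data: one real entry plus one class line.
    metrics = {"uni4F34": {"width": 1000}}
    classes = [parse_glyphclass("glyphclass uni4F34 uni4E00 uni9FFF")]
    print(lookup_metrics("uni635F", metrics, classes))
    print(tty_output("uni635F"))
```

With names of this form, the tty case needs no table for the CJK
glyphs at all, since the code point can be read off the name; a table
would still be needed for glyphs with other names.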