Hi,

At Mon, 23 Oct 2000 09:42:18 +0200 (CEST),
Werner LEMBERG <[EMAIL PROTECTED]> wrote:

> > - hard-coded converter from Latin1, EBCDIC, and UTF-8 to UTF-8
> > - locale-sensible converter from any encodings supported by OS to UTF-8
> >   (note: UTF-8 has to be supported by iconv(3) )
>
> May I suggest that you temporarily implement a hack so that you can
> use it with the Japanese patch of groff?  I don't know how long it
> will take until the necessary changes for gtroff have been
> implemented.

What do you think about the merit of a preprocessor used with the
current Groff, which doesn't recognize UTF-8 input?  I think the
preprocessor can help make Groff locale-sensitive.  However, the groff
wrapper or troff will need some mechanism to receive a report on the
locale from the preprocessor.  The algorithm will be:

  check the locale and use
    - -Tlatin1 for Latin-1 languages
    - -Tnippon for Japanese
    - -Tascii8 for other languages
  if the groff wrapper is invoked with -Ttty.

(IMO, we should not override the user's specification of -Tlatin1,
-Tascii, -Tnippon, and so on.)

BTW, do you plan to release Groff with the Japanese patch and my
preprocessor as a makeshift until Groff with UTF-8 input is
available?  (I thought so since you seem to be interested in my
preprocessor working with Japanese-patched Groff. :-)

> > BTW, besides TTY output, HTML will need postprocess from glyph to
> > character like 'grotty' in tty mode, since HTML is a text file.
>
> Yes and no.  HTML output also supports entities with the &...;
> directive.

Either (UTF-8 or &...;) will be OK.  Each has its own merits and
demerits.  HTML output with &...; will be plain ASCII text, and ASCII
is the most portable character set/encoding in the world.  However,
reading HTML source with &...; will be hard if most of the text
consists of non-ASCII characters, as in Japanese, Russian, and Greek.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://surfchem0.riken.go.jp/~kubota/
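
P.S. The device-selection rule above could be sketched roughly as
follows.  This is only an illustration in Python, not code from groff
or from the preprocessor.  The function names and the set of "Latin-1
languages" are assumptions made for the example (the message does not
list the languages), and the device names are the ones mentioned in
the message.

```python
# Illustrative subset only: the message does not say which languages
# count as "Latin-1 languages", so this set is an assumption.
LATIN1_LANGUAGES = {
    "da", "de", "en", "es", "fi", "fr", "is", "it", "nl", "no", "pt", "sv",
}


def language_of(locale_name):
    """Return the lowercase language code of a POSIX-style locale name.

    For example, 'ja_JP.eucJP' gives 'ja'.
    """
    # Drop the optional '@modifier', '.codeset', and '_TERRITORY' parts.
    name = locale_name.split("@", 1)[0]
    name = name.split(".", 1)[0]
    name = name.split("_", 1)[0]
    return name.lower()


def choose_device(locale_name, requested="tty"):
    """Pick a -T device name from the locale (hypothetical helper).

    Only a generic 'tty' request is resolved from the locale; any
    explicit device such as 'latin1', 'ascii', or 'nippon' is
    returned unchanged, so the user's choice is never overridden.
    """
    if requested != "tty":
        return requested
    lang = language_of(locale_name)
    if lang == "ja":
        return "nippon"
    if lang in LATIN1_LANGUAGES:
        return "latin1"
    return "ascii8"


if __name__ == "__main__":
    for loc in ("ja_JP.eucJP", "de_DE.ISO-8859-1", "ru_RU.KOI8-R"):
        print(loc, "->", "-T" + choose_device(loc))
```

A real implementation would probably look at the locale's codeset
rather than a fixed language list, but the list keeps the sketch short.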