Hello developers,

this mail is related to the previous message by Werner LEMBERG, where he documented the -K option of the CVS version of groff. This option allows one to specify the input encoding, and one can use a construction like the following to get a formatted manual page as UTF-8 output:

groff -K input_charset -Tutf8 -mandoc /path/to/manual/page.1

And, if the manual page is ISO-8859-1 encoded, the -K option is not needed.

UTF-8 locale users can put this into their man.conf and be happy. But what about those who prefer to use old-style 8-bit locales? Groff cannot output ISO-8859-X where X != 1. I tried to model what various non-UTF-8 users would do to view manual pages, using the English hosts_access.5 manual page from the tcpwrappers package, which should be viewable from any locale.

1) The old -Tlatin1 hack mostly works. But it's still a hack.

2) The following command attempts to convert Groff UTF-8 output to the locale charset:

groff -Tutf8 -mandoc hosts_access.5 | iconv -f UTF-8

but this chokes on the first hyphen. Let's attempt to tell iconv to use the best approximation:

groff -Tutf8 -mandoc hosts_access.5 | iconv -f UTF-8 -t //TRANSLIT

This is better, but still not ideal. Details:

In ISO-8859-1 based locales, everything looks good, but the quotes differ from what gets printed with the -Tlatin1 switch.

In ISO-8859-{3,7,8,9,10,13,15,16} and KOI8-* based locales, the bullets become bullets! A huge improvement over the -Tlatin1 hack.

In ISO-8859-{2,4,5,6,11,14} and TIS-620 based locales, the bullets are replaced with question marks. Well, they were not bullets with the -Tlatin1 hack either. But the question mark is simply not right.
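
The per-charset behaviour described above can be probed without Groff by feeding a single bullet (U+2022) through iconv. The following is only a minimal sketch: the helper name render_in and the three sample charsets are assumptions chosen for illustration, the //TRANSLIT suffix is an extension of the glibc and GNU libiconv implementations, and the bytes printed will depend on the iconv in use.

```shell
#!/bin/sh
# Hypothetical helper: convert UTF-8 text on stdin to the charset named in $1,
# asking iconv for its best approximation of unrepresentable characters.
render_in() {
    iconv -f UTF-8 -t "$1//TRANSLIT" 2>/dev/null
}

# Dump the bytes that a lone U+2022 BULLET turns into for a few sample charsets.
# '\342\200\242' is the UTF-8 encoding of U+2022, written as octal escapes.
for cs in ISO-8859-1 ISO-8859-2 KOI8-R; do
    printf '%s:' "$cs"
    printf '\342\200\242\n' | render_in "$cs" | od -An -c
done
```

A question mark in the od output for a given charset means that iconv found no approximation for the bullet there.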

So my question is how to avoid these question marks. The answer "use -Tascii for such manual pages" won't be accepted until Man stops using one Groff line for all manual pages: -Tascii damages Polish manuals. The answer "patch glibc so that iconv transliterates the bullet to 'o'" is better (and in fact this is doable), but I think that users of non-Glibc systems (or old Glibc) will complain if this becomes the official answer.
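
For comparison, the bullet-to-'o' substitution could in principle be done in the pipeline itself instead of inside glibc. This is only a sketch of that idea and not a proposed official answer: the filter name bullet_to_o is hypothetical, and it assumes that Groff emits the bullet as a plain U+2022 in its UTF-8 output.

```shell
#!/bin/sh
# Hypothetical filter: replace every U+2022 BULLET in a UTF-8 stream with "o"
# before the stream reaches iconv.
bullet_to_o() {
    bullet=$(printf '\342\200\242')   # UTF-8 bytes of U+2022
    # LC_ALL=C makes sed match the raw bytes regardless of the user's locale.
    LC_ALL=C sed "s/$bullet/o/g"
}

# Intended use (not executed here):
#   groff -Tutf8 -mandoc hosts_access.5 | bullet_to_o | iconv -f UTF-8 -t //TRANSLIT
```

The obvious drawback is that the bullet is replaced unconditionally, so locales that could have shown a real bullet lose it as well.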

So: what is the official recommendation for formatting manual pages in non-ISO-8859-1, non-UTF-8 locales with the CVS version of Groff?

--
Alexander E. Patrakov


_______________________________________________
Groff mailing list
Groff@gnu.org
http://lists.gnu.org/mailman/listinfo/groff
