On Thu, Jul 24, 2003 at 03:55:43PM +0200, Aaron Isotton wrote:
> what are man pages, or more generally, groff documents, supposed to be
> encoded in? I didn't find any reference to that in groff(7). Is it
> ASCII?

See groff_char(7). Technically it's Latin-1, but this is planned to
change to UTF-8 for groff 2.0 (no schedule yet); groff_char(7) advises
sticking to ASCII, and I agree. You can get everything in Latin-1 using
named characters anyway without having to worry about encoding.

> The problem arises because I have to transform a Docbook XML document
> into a manpage; there, all spaces (ASCII 0x20) inside a <literallayout>
> are translated into 0xA0 in the output. I don't know what an A0 is
> supposed to be, but man ignores it when generating output, thus
> effectively removing the spaces from the output.

0xA0 is the Latin-1 non-breaking space. Bug #199422 notes that this
doesn't work in current groff. I'm not sure whether this is actually a
groff bug or not, and need to check with upstream. I suggest using '\ '
instead anyway, though.

> [Side note: there's also another problem unrelated to this; see
> http://sourceforge.net/tracker/?func=detail&aid=763861&group_id=21935&atid=516914
> for more information]

Using .nf and .fi would probably be more sensible than large numbers of
.br requests. (Feel free to pass on this comment.)

Cheers,

-- 
Colin Watson, groff maintainer                  [EMAIL PROTECTED]
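
The '\ ' suggestion above could be applied as a post-processing step on
the generated man page. What follows is only a minimal sketch: the
helper name escape_nbsp is hypothetical, and it assumes the DocBook
conversion emits Latin-1 bytes, so that each non-breaking space is the
single byte 0xA0. If the output were UTF-8, the same character would be
the two bytes 0xC2 0xA0 and this would need adjusting.

```python
def escape_nbsp(roff_source: bytes) -> bytes:
    # Hypothetical helper: swap every Latin-1 non-breaking space (0xA0)
    # for the roff escape "backslash space", an unpaddable space.
    # Assumes Latin-1 input; UTF-8 input would leave stray 0xC2 bytes.
    return roff_source.replace(b"\xa0", b"\\ ")


# Small demonstration on a made-up fragment of converter output.
sample = b".nf\nint\xa0\xa0x;\n.fi\n"
print(escape_nbsp(sample).decode("latin-1"))
```

Working on bytes rather than decoded text keeps the sketch independent
of the locale; the rest of the page passes through untouched.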