Hi Ralph,

Ralph Corderoy wrote on Mon, Feb 18, 2019 at 12:23:00PM +0000:
> Ingo Schwarze wrote:
>> Somebody (i think Ralph) wrote:
>>> Due to some, all?, man renderers trying to keep a shell backquote as
>>> a paste-able backquote, for example.
>>>
>>> .\" For UTF-8, map some characters conservatively for the sake
>>> .\" of easy cut and paste.
>>> .
>>> .if '\*[.T]'utf8' \{\
>>> . rchar \- - ' `
>>> .
>>> . char \- \N'45'
>>> . char - \N'45'
>>> . char ' \N'39'
>>> . char ` \N'96'
>>> .\}

>> Exactly.  Which reinforces my point that you have to use \(oq to get a
>> left single quote in man(7).

> But is that because the `.char' above are hiding faults in man pages
> rather than leaving the pressure there for them to be fixed upstream?
> The man page source is troff and so `' should be usable in English
> prose.

Arguably, yes and yes.

I certainly agree that in roff(7) input in general, writing a
single-quoted string in English prose as `foo' is valid markup and,
to make an even stronger statement, marking it up that way is good
roff(7) style.

> The more noisy escapes should only be needed for the odd bit of
> verbatim computer reproduction.

I'm not fully convinced i agree with that, for the specific document
class of manual pages.

On the one hand, in manual pages, verbatim quotations of computer
code are not "odd bits", but a very common and important task.
They should be easy to write for manual page authors, and requiring
authors to wade through source code samples they put into manual
pages and encode every instance of ` as \(ga implies a burden, and
a risk of breaking the example code, either by incorrect or by
forgotten replacements.

But what matters more than the question of whether we *should* ask
authors to do such replacements is that currently, we do not, because
the manual page macro sets have for quite some time been exempting
manual pages from the general rule "` means opening quote", stipulating
"` means ASCII 0x60" instead.

It's exactly the same question as for hyphen/minus.  In roff(7) in
general, - unambiguously means "hyphen" and not "minus".
But in manual pages, the macros instead stipulate "- means ASCII 0x2d
HYPHEN-MINUS".

But in the present thread, the question is not so much "how should
authors *encode* opening quotes", but more "how should unambiguously
marked-up opening quotes be rendered to ASCII".

>>> Whom is this change is meant to benefit?  I've lost track.

>> People reading roff(7) documents with nroff(1) or man(1) in a terminal
>> window while they have LC_CTYPE=C set and while they are using a
>> modern font.

> Colin pointed out that remote machines may not support his locale so
> he's forced into LC_CTYPE=C sometimes.  However, that's presumably just
> for the odd bit of command-line work as lack of UTF-8 could affect might
> more than just reading a man page given non-ASCII in source comments,
> collating order and multi-byte sequences affecting searching, etc.

True.  On the other hand, in such a situation, you are almost always
working only with files that are ASCII-only in the first place.
Or if you do work with locale-encoded files, you better make sure
that you have the correct locale on both sides, or you are indeed
in for bad trouble.  (Having non-ASCII bytes in source code would
be utter stupidity in the first place.)

The point here is that even if everything is just ASCII, Colin (and
others) sometimes see this kind of minor ugliness in everyday life.

> I normally use UTF-8.  I have ~/bin/C that does
>     LC_ALL=C LANG=C exec -- "$@"
> to run particular commands in that locale, e.g. for speed.  I think if I
> switched wholesale to the C locale for a terminal or session then I
> would accept seeing `foo' rather than 'foo' as an attribute of that
> locale rather than trying to force it to look like Unicode.

Accept, yes, no doubt; and certainly, there is no silver bullet to
make ASCII look like Unicode.

> More important than `fixing' this IMO is having man pages that can be
> input in an easy manner for simple things, e.g. `' in prose, and have
> the output be good for the device used, i.e.
> ‘’.

No doubt about that, but that's orthogonal to the patch.

Yours,
  Ingo