At 2025-08-24T15:50:04+1000, Martin D Kealey wrote:
> I note that much of the documentation still uses a quoting style that
> pretends that characters U+0060 and U+0027 are matching opening and
> closing quotes, and that new documentation is still being added that
> follows this style. For extra credit, they're sometimes redoubled as
> `` and '' to be fake double quotes.

Yes, because that's idiomatic *roff input. See below. Input is not
output. TeX uses the same input convention. Do you plan to shift the
world's entire corpus of TeX documents as well?

> The use of the grave accent symbol as if it were a quote mark is
> visually asymmetric (ugly!),

Yes.

> has semantic conflicts (including with its use as a shell
> metacharacter),

Yes.

> is in the wrong character class (for line wrapping and hyphenation),

No idea what you're talking about here--maybe Bash's Texinfo manuals,
since this claim is false for groff. GNU troff by default assigns no
more special properties to "`" than it does the grave accent per se
(accessible via the special character escape sequence `\[ga]`). (And
none of its stock macro files use `cflags` requests to override this
character's defaults, either.)

https://cgit.git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/input.cpp?h=1.23.0#n6960

> disregards all formal specifications (Unicode-16.0.0
> (2024) still says "grave accent"),

And most people call "\" a "backslash" rather than a "reverse solidus".
Big deal. The "formal specifications" upon which you're loading so much
rhetorical freight and frenzied gesticulation are not applicable here.

> and is extremely outdated (ASA X3.4-1963 said "diacritic" 62 years
> ago).

It was used for constructive overstriking on typewriters (and
teleprinting terminals); in that sense, it was indeed a diacritic. The
ASCII standard of 1963/1968 expected certain code points to do
double-duty as spacing and combining ("diacritic") characters, and the
glyphs of the typeface of the Teletype Corporation Model 37 exhibited
conformance with that expectation. See, for example, the ascii(7) man
page of early Unix manuals. From what I've seen, the high-flown neutral
double quote is a dead giveaway for Model 37 output. You see it often
in early Unix papers.

http://bitsavers.informatik.uni-stuttgart.de/pdf/att/unix/2nd_Edition/UNIX_Programmers_Manual_2ed_Jun72.pdf

> A more thorough analysis is provided by Markus
> Kuhn <https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html>.

Markus has long hosted several excellent resources. Sadly for your
readers, you reflect his erudition poorly.

> GNU is the last serious hold-out, and "this is how we've always done
> it" won't wash any more.

As Sam James noted, you simply don't know what you're talking about
here. Consider gathering facts before writing intemperate screeds.

> I propose, at minimum, that the U+0060 grave accent be replaced
> wherever it's been misused as an opening quote, but a better result
> would be to replace both, using paired Unicode ‘typographic’ quotes
> where possible. Wherever redoubled `` '' pairs have been used, they
> should be replaced by the corresponding double quote characters.

I counter-propose that *roff and TeX documents should use the
conventions for eliciting typographer's quotes from an output device
that are supported by the formatting system's documentation.

> Whether to use Unicode ‘quote’ style, or just stick with ASCII 'quote'
> style, depends on context:

It's more than context; these are different problem domains.

> * In HTML documentation, not using typographic quotes lacks any
> reasonable defence: any program that can show HTML can also cope with
> Unicode. Any editor whose keyboard doesn't have typographic quotes can
> type HTML entities instead.

A typesetter or formatter takes care of this.

> * Strings that are compiled into Bash have to be displayable on
> terminals that lack unicode support. Either they need to be written in
> pure ASCII, or the output function needs to replace typographic quotes
> with ASCII ones. (Consider augmenting gettext() to do the latter as
> its fallback.)

I'll defer to a more seasoned Bash developer to answer this, but GCC,
for instance and contra your implication, does actually use
typographer's quotes in diagnostic messages. Do you pay any attention
to the behavior of the programs you complain about?

For example, I just introduced a syntax error in a C++ source file to
provoke error diagnostics from the compiler.

../src/roff/troff/input.cpp: In function ‘void init_hpf_code_table()’:
../src/roff/troff/input.cpp:7975:35: error: expected ‘;’ before ‘}’ token

$ g++ --version | head -n 1
g++ (Debian 10.2.1-6) 10.2.1 20210110

> * Man pages, info files, and other stuff that gets locale handling can
> use en.UTF-8 as the primary version, and generate C/POSIX (ASCII-only)
> from that.

In the contemporary world, man pages get formatted either by groff or
by OpenBSD's mandoc(1), a tool with a much smaller charter than groff.
At least in its currently released version, it doesn't appear to honor
your desire. You can charge at Ingo Schwarze with your jeremiad; I
expect his response would be entertaining.

groff already takes the character repertoire of the output device into
account.

$ printf \`foobar\' | nroff -T ascii | cat -s
`foobar'

$ printf \`foobar\' | nroff -T latin1 | cat -s
`foobar'

$ printf \`foobar\' | nroff -T utf8 | cat -s
‘foobar’

Now, that said, Debian and some other distributors of groff
unfortunately cripple man page rendering in this precise respect,[1]
because, at least in Debian's case, the package maintainer gets too
many harangues from people like you who go off half-cocked.

The groff_man_style(7) page in the forthcoming groff 1.24 release will
have (something similar to) the following Q&A.

groff_man_style(7):
     • When and how should I use quotation marks?

       As noted above in subsection “Font style macros”, apply
       quotation marks to “brief specimens of literal text, such as
       article titles, inline examples, mentions of individual
       characters or short strings, and (sub)section headings in man
       pages”.
       Multi‐word literals, such as Unix commands with arguments, when
       set inline (as opposed to displayed between EX and EE), should
       be quoted to ensure that the boundaries of the literal are clear
       even when the material is stripped of font styling by, for
       example, copy‐and‐paste operations.

       groff, Heirloom Doctools troff, neatroff, and mandoc support all
       of the special characters \[oq], \[cq], \[lq], \[rq], \[aq], and
       \[dq] described in subsection “Portability” above. DWB, Plan 9,
       and Solaris 10 troffs do not.

       Historically, man pages used ` and ' exclusively for directional
       single quotation marks. However, in recent years, some
       distributors of groff have chosen to override the meanings of
       these characters in man pages, remapping them to their Unicode
       Basic Latin code points.

       Unfortunately, ` and ' are the only reliable means of obtaining
       directional single quotation marks in AT&T troff; in that
       implementation, often no special character escape sequences
       exist to obtain them. Further, AT&T troff’s special character
       identifiers, like its font names, were device‐specific.

       To achieve quotation portably in man pages rendered both by AT&T
       and more modern troffs, consider adding a preamble to your page
       after the TH call as follows.

              .ie \n(.g \{\
              .  ds oq \[oq]\"
              .  ds cq \[cq]\"
              .\}
              .el \{\
              .  ds oq `\"
              .  ds cq '\"
              .\}

       You must then use the \* escape sequence to interpolate the
       quotation mark strings.

              The command
              .RB \*(oq "while !\& git pull; do sleep 10; done" \*(cq
              retries an update from the repository until it succeeds.

       If this procedure seems complex, petition your distributor to
       revert their remapping of the ` and ' characters.

> * Translations should be encouraged to use their respective
> typographic quoting style: „DE“, »DK«, «FR», ”HE„, „HU”, 『JP』 etc.
> (See
> https://en.wikipedia.org/wiki/Quotation_mark#Summary_table)
> * Files with monospaced plaintext (CHANGES, HISTORY, etc) - either
> 'ASCII' quotes or ‘Unicode’ quotes depending on what Chet can type.
> * LICENCE/LICENSE - ask the respective licence-holders to provide
> updated versions, or to ratify our "translation" (especially GNU &
> BSD).

I decline to address how Chet maintains plain text files in his
distribution.

> * m4 (aka “where did I put my seppuku blade?”) Add
> changequote(,)changequote(`,')dnl to the start of all documents that
> tacitly assume `', so that this assumption can eventually be
> deprecated.

This request is off-topic for Bash mailing lists; GNU M4 is a separate
project. You might check out the bug...@gnu.org mailing list.

> * Other stuff - what have I missed?

Good question. Why don't you go conduct research and only then come
back and _then_ tell us of your findings?

> What do others think?
>
> -Martin
>
> PS: arguably I should have started this in coreutils or gnu-policy,

Inarguably you should have gotten your facts in order before making an
aggressively worded proposal.

> but I'm starting here because ` is syntactically significant to Bash,
> so there's extra damage.

See above.

At 2025-08-24T09:03:31+0300, Oğuz wrote:
> How? Is there a portable roff macro that produces them when supported
> and fall back to regular double quotes otherwise?

There's a portable way, yes, but some distributors of groff have broken
it because when people like Martin D. Kealey turn their hand to writing
man pages, they choose to scream rather than learn. So they produce bad
man pages, and then people like Martin D. Kealey scream.

At 2025-08-24T07:10:30+0100, Sam James wrote:
> Martin D Kealey <mar...@kurahaupo.gen.nz> writes:
> > GNU is the last serious hold-out, and "this is how we've always done
> > it" won't wash any more.
>
> It is, in fact, not the last serious hold-out at all:
> https://www.gnu.org/prep/standards/standards.html#Quote-Characters.
>
> I don't recall when it changed other than it being in the last few
> years.

At least a decade.

https://web.archive.org/web/20150104113840/https://www.gnu.org/prep/standards/standards.html#Quote-Characters

Regards,
Branden

[1] https://salsa.debian.org/debian/groff/-/commit/d5394c68d70e6c5199b01d2522e094c8fd52e64e