Thanks Branden, Collin, Sam and others for your thorough responses. I apologise for my misassessment of the GNU project. I'm glad to hear they actually do have a policy on this point. I only regret that it's not more fully implemented.

> > GNU is the last serious hold-out, and "this is how we've always done
> > it" won't wash any more.

The last time I wrote about this - probably around 2000 - the answer was indeed "this is how we've always done it", accompanied by some oblique insults to my intelligence. So I was, unfortunately, primed for a fight. I am heartened and reassured by the responses indicating that there isn't much need for a fight now.

> At 2025-08-24T15:50:04+1000, Martin D Kealey wrote:
> > I note that much of the documentation

I should have been clearer: I'm talking about *all* documentation, not just documentation read by users. "Documentation" doesn't only mean info & man pages; it also means natural language text that is intended to be read by developers, including comments in code and whole files such as AUTHORS, CWRU/OS-BUGS/*, README and HISTORY, or read by distributors, in whole files such as CHANGES and INSTALL.

> At 2025-08-24T09:03:31+0300, Oğuz wrote:
> > How? Is there a portable roff macro that produces them when supported
> > and fall back to regular double quotes otherwise?
>
> Yep; doc/bash.1 has .Q and .QN macros.

On Sun, 24 Aug 2025 at 16:59, G. Branden Robinson < g.branden.robin...@gmail.com> wrote:

> I decline to address how Chet maintains plain text files in
> his distribution.

Well that just side-steps basically everything that triggered me to write about this in the first place. It is exactly "plain text files" that I find particularly vexing. Whilst I appreciate your thorough response, several of its points are shooting at phantoms that do not represent my position, and your exclusive focus on roff and texinfo means we were talking past each other.

> Consider gathering facts before writing intemperate screeds.

I'll admit I could have been more thorough, but I didn't start with nothing. And perhaps we could both be less intemperate? I surveyed the complete sources of some GNU projects starting with bash, coreutils & autoconf.
However it appears that I partially misinterpreted what I found:

- I had used both "git clone" and "apt source" to acquire sources, so some of the files were more out of date than I expected, and some weren't actually on Savannah at all.
- I neglected to omit *.texi files when counting ` and ``, so I did not notice that most GNU projects are substantially better than I was expecting, and unworthy of the criticism implied by my previous email.

In contrast, within Bash, there continues to be extensive - even growing - use of ` in English prose that is not governed by machine-readability constraints, mostly in comments (in various file types) and inside files like bash/CHANGES*, bash/NEWS*, NOTES, bash/README, and bash/CWRU/**log*.

So it looks like Bash is a hold-out within the GNU project, and perhaps I was right to come first to this mailing list after all.

> > new documentation is still being added that follows this style.

For example, since 2011-12-12 (when GNU policy changed to discourage using ` as a quote mark), bash/CHANGES has had 485 lines added that include `quotes like this', at a fairly consistent rate. CHANGES-5.3 alone has 121 instances over 98 lines.

> Do you plan to shift the world's entire corpus of TeX documents as well?

Since this comment concerns user-facing documentation that goes through translation layers, I don't really care. It's the supposedly "plain text" -- where there's no translation layer -- that's got me worked up. (I'll admit there's a purist corner of my heart that would include texinfo in my 3 wishes to an omnipotent djinn, but practically, no, this is not my hill to die on.)

> > is in the wrong character class (for line wrapping and hyphenation),
>
> No idea what you're talking about here

Unicode character class.

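To make "character class" a little more concrete, here is a minimal sketch using Python's standard unicodedata module. It only reports the Unicode general category of each character; the helper name and the labels are mine, purely for illustration, and the line-breaking class is a separate property that this module does not expose.

```python
import unicodedata

# Characters discussed in this thread. The labels are illustrative only.
CANDIDATES = {
    "grave accent": "\u0060",
    "apostrophe": "\u0027",
    "left single quote": "\u2018",
    "right single quote": "\u2019",
    "left double quote": "\u201c",
    "right double quote": "\u201d",
}


def general_categories(chars=CANDIDATES):
    """Map each label to the Unicode general category of its character."""
    return {label: unicodedata.category(ch) for label, ch in chars.items()}


if __name__ == "__main__":
    for label, category in general_categories().items():
        print(f"{label:20} {category}")
```

The grave accent comes back as Sk (a modifier symbol), whereas U+2018 and U+201C come back as Pi (initial quote punctuation) and U+2019 and U+201D as Pf (final quote punctuation). Software that relies on these properties has no reason to treat ` as an opening quote.
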
> > I propose, at minimum, that the U+0060 grave accent be replaced
> > wherever it's been misused as an opening quote
>
> I counter-propose that *roff and TeX documents should use the
> conventions for eliciting typographer's quotes from an output device
> that are supported by the formatting system's documentation.

Okay, let's just wind this back for a moment; by "misused" I mean where the only constraint is natural language grammar, not machine-readable syntax. (Nor am I proposing to eliminate the `command substitution` syntax, though that's also on my djinn list.)

In any case, quotes in roff files are handled by the .Q and .QN macros. There simply aren't any uses of ` as an opening quote in the running text; just a handful in comments that don't affect the output (and exactly one use of `` in each of the definitions of .Q & .QN).

> > * In HTML documentation
>
> A typesetter or formatter takes care of this.

Assuming the HTML is generated by a typesetter, sure, but that's hardly the norm. Some (maybe all) of the HTML files in Bash appear to be manually maintained, as indicated by some very peculiar whitespace.

> > * Strings that are compiled into Bash have to be displayable on terminals
> > that lack unicode support.

To be clear, the entire thrust of my argument is to use typographic quotes everywhere except where they can't work. In this list of points I was playing devil's advocate, merely *acknowledging* cases where this might not be feasible. Any suggestion that I object to programs emitting typographic quotes to a terminal is the exact opposite of my position. I'm just admitting that this won't work with _all_ terminals, so a fallback mechanism is needed.

> > * Man pages, info files, and other stuff that gets locale handling can
> > use en.UTF-8 as the primary version, and generate C/POSIX (ASCII-only)
> > from that.

For man pages, my suggestion and responses are all moot: there aren't any ` in the running text in the roff source files.

I defer to others on how best to manage texinfo.

For other locale-aware strings (cf gettext), my point is that it's trivially easy to reduce typographic quotes down to ASCII ones, but much more difficult to go the other way.

My initial preference would be to simply write the typographic quotes in the literal strings in the source code, then modify gettext so that if no translation is available and the locale is not UTF-8-enabled, it returns a modified copy of the original string that downgrades all the typographic quotes to ASCII ones. But I'm sure other people have equally valid suggestions, and perhaps reasons why my preference wouldn't work in practice. (For example, are there any common C compilers that would reject non-ASCII octets in literal strings in C source code?)

Perhaps the question of "is the locale UTF-8 enabled" might be awkward to answer on some platforms; GNU locale names follow a pattern of LANG=language[_region][.encoding][@variant] (https://www.gnu.org/software/libc/manual/html_node/Locale-Names.html) but there doesn't seem to be a formal specification for valid encoding or variant values, so I don't know how portable "UTF-8" is (in other contexts I've seen lower-case and different or no punctuation). In any case, it would seem appropriate to adopt a conservative approach that assumes ASCII-only unless there's an unambiguous indication that the terminal supports UTF-8.

> > * m4
>
> This request is off-topic for Bash mailing lists

There are a handful of m4 files within the Bash project. I've also highlighted m4 because it represents an edge case, where the constraints imposed by machine readable syntax are not inviolate; so it's worth considering whether they

That said, I've checked them more thoroughly, and it appears they all start with the equivalent of "changequote([,])", or are subordinate to files that do. So they all meet my stated requirement to be exempt from change, and this point can be considered "done".

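Returning to the gettext fallback described above, here is a rough sketch of the logic I have in mind. It is written in Python only to keep it short; it is not a patch to gettext, and every name in it (ASCII_QUOTES, locale_is_utf8, effective_locale, present) is hypothetical. It assumes the locale name follows the language[_region][.encoding][@variant] pattern, and it treats anything it cannot positively identify as UTF-8 as ASCII-only.

```python
import re

# Typographic quotes and their ASCII stand-ins.
ASCII_QUOTES = str.maketrans({
    "\u2018": "'",
    "\u2019": "'",
    "\u201c": '"',
    "\u201d": '"',
})

# language[_region][.encoding][@variant]
LOCALE_RE = re.compile(
    r"^(?P<language>[^_.@]+)"
    r"(?:_(?P<region>[^.@]+))?"
    r"(?:\.(?P<encoding>[^@]+))?"
    r"(?:@(?P<variant>.+))?$"
)


def locale_is_utf8(name):
    """Return True only when the locale name explicitly names UTF-8.

    Conservative by design: a missing or unrecognised encoding counts
    as "not UTF-8". Case and the "-"/"_" punctuation are ignored, since
    spellings such as "utf8" and "UTF-8" both occur in practice.
    """
    match = LOCALE_RE.match(name or "")
    if not match or not match.group("encoding"):
        return False
    encoding = match.group("encoding").lower()
    return encoding.replace("-", "").replace("_", "") == "utf8"


def effective_locale(env):
    """Pick the locale name that governs character handling.

    LC_ALL overrides LC_CTYPE, which overrides LANG; empty values are
    treated as unset.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        if env.get(var):
            return env[var]
    return ""


def present(message, env):
    """Return message unchanged for UTF-8 locales, else with ASCII quotes."""
    if locale_is_utf8(effective_locale(env)):
        return message
    return message.translate(ASCII_QUOTES)
```

With this, present("\u2018foo\u2019", {"LANG": "C"}) gives 'foo' with plain apostrophes, while the same call with {"LANG": "en_AU.UTF-8"} returns the string untouched. Inspecting the locale name is only one possible test; a real implementation might prefer to ask the C library what the codeset is.
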
> > * Other stuff - what have I missed?

I'm still open to useful suggestions on this point. I live wholly in a Linux environment, so feedback from people building Bash for other platforms would be particularly helpful.

-Martin