As so often happens, Ingo and I are at loggerheads.

At 2020-11-14T17:03:42+0100, Ingo Schwarze wrote:
> I would strongly oppose copying the same text to multiple
> documentation files. Apart from correctness and completeness,
> conciseness is among the most important quality criteria for
> documentation. So having the same text repeated in more than
> one place is among the worst suggestions you could possibly
> come up with.

I have to wonder how familiar you are with programming texts. Why would
anyone read K&R when they can absorb the ISO C standard?

You've identified multiple virtues: correctness, completeness, and
concision. One you've omitted is comprehensibility.

When people are learning (or refreshing themselves on) a technical
system, they attempt to find a document of high overall relevance to
their immediate goal. Sometimes that goal is highly specific ("I need
to know what command-line option of foo(1) will frobnitz boojums.") and
sometimes more general ("I used AT&T troff 20 years ago and I remember
the broad principles but I want to see what groff is like.").

I posit that you cannot construct a corpus of documentation wherein
every true statement is asserted at most once, and reliably
cross-referenced from all other conceivable points of corollary
interest. Documentation is an art, not a science, and even in
Russian-style mathematical literature (assumption, lemma, proof, repeat;
no discussion), which I have to presume is your model, people encounter
barriers to the Platonic ideal.

Moreover, there is a rule of pedagogy: repetition legitimizes. To get a
concept across it often must be presented multiple times.

> In general, automatically generating documentation is a bad idea.

This claim is vacuous. We do it all the time. mandoc does it with man
pages. You intend to say _something_ here, but I'm not sure what.

Moreover, your claim, as far as I can interpret it, implies the very
opposite of your earlier claims. If complete, correct, concise
documentation were formally modelable, then it would indeed be amenable
to automatic generation. You could perform a dependency analysis on it
and automatically generate a totally ordered document beginning with
axioms and proceeding with increasing ramifications until all claims
were exhausted.
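
The dependency analysis described here amounts to a topological sort.
A minimal sketch in Python, assuming a made-up set of claims and
dependencies (the names below are hypothetical and come from no real
man page), using the standard library's graphlib module:

```python
from graphlib import TopologicalSorter

# Hypothetical claims, invented for illustration only.
# Each key maps to the set of claims it depends on.
depends_on = {
    "axiom": set(),
    "lemma_a": {"axiom"},
    "lemma_b": {"axiom"},
    "theorem": {"lemma_a", "lemma_b"},
    "corollary": {"theorem"},
}


def total_order(graph):
    """Return one ordering in which every claim follows its prerequisites."""
    return list(TopologicalSorter(graph).static_order())


order = total_order(depends_on)
print(" -> ".join(order))
```

Here lemma_a and lemma_b may legitimately appear in either order; the
sort only guarantees that every claim comes after the claims it depends
on.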
(There may not be a _unique_ total ordering for reasons which graph
theorists and package management system writers are all too familiar
with, but you can find _some_ total ordering.)

Whether intended or not, inclusion of tsort(1) in early Unix systems was
a genius move for stimulating generations of hackers to pick up some
discrete mathematics. (I've used it for all sorts of things, including
planning an academic course of study.)

Off the top of my head, were we to adopt your approach to our man pages,
here are a few of the things we would lose.

1. Options sections. Fair's fair, you've already stated objections to
these existing at all (IIRC, you prefer options being raised inline as
part of the Description section, and I changed our nroff(1) page at your
prompting to do this.) At the same time, if the Options section exists,
you want it to precede a great deal of other material, even if that
material would be far more usefully presented before. See refer(1).
How comprehensible are most of those options before the refer command
language has been described? On top of this, many options have common
meaning across multiple commands (a practice you elevate to principle,
if I recall correctly). So, under your ideal, a command's man page
should only document options that it _uniquely_ recognizes, and all
others should be behind a cross-reference.

2. Files sections; few commands open a non-operand file uniquely. In
groff, many pages make reference to device, font, and macro files. Most
of these would need to move to some other document to which others would
refer.

3. Environment sections; similar.

4. Synopsis. Yes. The synopsis says nothing that cannot be inferred
from .TH and wherever else the options and operands are documented.

5. See also. There is surely no reason for cross-references at the end
of a document when they should be present already at the point they are
required in the discussion.

6. Bugs.
A URL to a bug tracker or a master bug list, since some bugs can impact
multiple components of the system, would eliminate redundancy.

> It is intended to be read by human beings, who will spend their
> time on reading it, and that's a valuable resource being spent.

Yes.

> So, if authors can't even be bothered to properly *write* the text -
> knowing that the time for writing it will only be spent once,

This claim is startling. How much formal writing do you do? A
significant portion of the time spent in the crafting of any serious
document is in revision. Even if you didn't think this was true in
general, it's plainly true of _me_ as any look at groff's commit history
will attest.

> whereas the time for reading it will be spent many times - how can
> anybody reasonably be expected to read it?

They will read it if it effectively communicates what they want to know.
One measure of effectiveness is how swiftly they can get in and get out
without the mental effort of maintaining a lot of state; that is,
without having to chase a long chain of cross-references to get to the
one unique place where a fact is recorded in Bibliotheca Schwarze.

Another factor in readability is the hedonic benefit of experiencing the
prose. This is a highly subjective factor, and a virtue too often
absent from technical literature, which is why it has a reputation as
dry and boring. But it also explains why the most successful works in
this discipline endure--because the writer is a talented stylist or has
an agreeable tone and/or sense of humor. Once I learned enough math to
comprehend portions of Knuth's _Art of Computer Programming_ I was
surprised--though I should not have been--at how lucid and fun he was to
read. I suspect that were you to take your editorial approach to his
work, it would swiftly become unrecognizable.

> Autogenerating documentation is just disrespectful of users, and so is
> copying duplicate text into multiple places.

I encourage you to invest some time in seriously pursuing this claim
with Wittgensteinian rigor and see where you end up.

> Yes, the current disaster with groff_man(7) / groff_man_style(7)
> should be fixed at some point after release. I think Branden
> probably didn't intend it to stay this way,

You guess wrong. It's pretty close now to where I intended it, and I'm
far happier with it now than I was after groff 1.22.4. Amusingly
enough, it was in part at _your_ insistence that I restructured it as it
is, with a parent document that uses m4 to generate (1) a reference
page, whose importance you emphasized, and (2) a pedagogical document
for man page writers who care nothing about typesetting or any feature
not directly relevant to their man(7) endeavors.

> but did it as an intermediate step in the complex task of
> disentangling a large and complicated page into two logically separate
> parts.

I think the document will continue to evolve, namely to incorporate an
introduction covering "filling" and "breaking" and other concepts that
are alien to plain-text-only users. But this is exactly the sort of
material you don't want there, even though it is essential to
groff_man_style(7)'s mission of being "helpful and accessible to man
page writers who may never read any other groff documentation." (NEWS)

Eventually, I'd like to have a list of "do"s and "don't"s, some
applicable to all man pages and some specific to the groff man pages'
"house style". I would extract many of these from my commit messages.
This has been my plan for a long time, which is why my documentation
commits look the way they do. But I have no clear idea of a time frame
for that work.

Regards,
Branden