Hi Branden,

G. Branden Robinson wrote on Sun, Nov 15, 2020 at 11:49:48PM +1100:
> At 2020-11-14T17:03:42+0100, Ingo Schwarze wrote:

>> I would strongly oppose copying the same text to multiple
>> documentation files. Apart from correctness and completeness,
>> conciseness is among the most important quality criteria for
>> documentation. So having the same text repeated in more than
>> one place is among the worst suggestions you could possibly
>> come up with.

> I have to wonder how familiar you are with programming texts.

I love reading novels, even long ones, and often do so.
But i usually refrain from reading books about programming because
they tend to be too long for my taste. For example, i never finished
Stroustrup because i got too bored with all the redundancy.
Standards and references serve better in the field of programming.
Why would i want a novel mixed into a technical text?

> Why would anyone read K&R when they can absorb the ISO C standard?
>
> You've identified multiple virtues: correctness, completeness, and
> concision. One you've omitted is comprehensibility.

That's not a separate goal. It's a consequence of the other goals.
Note that we are not talking about general pedagogy here, as you
would use it in elementary school, but about teaching programmers.

> When people are learning (or refreshing themselves on) a technical
> system, they attempt to find a document of high overall relevance to
> their immediate goal. Sometimes that goal is highly specific ("I need
> to know what command-line option of foo(1) with frobnitz boojums.") and
> sometimes more general ("I used AT&T troff 20 years ago and I remember
> the broad principles but I want to see what groff is like.").
>
> I posit that you cannot construct a corpus of documentation wherein
> every true statement is asserted at most once, and reliably
> cross-referenced from all other conceivable points of corollary
> interest.
>
> Documentation is an art, not a science,

That is of course true, including the fact that some repetition
is unavoidable.

> and even in Russian-style
> mathematical literature (assumption, lemma, proof, repeat; no
> discussion), which I have to presume is your model,

Not quite; i do want concise sentences inserted explaining the
practical purpose of what is being described.

> people encounter barriers to the Platonic ideal.

True, too; and such barriers are invariably hit earlier than the
ultimate barrier that, informally speaking, complete and consistent
formal systems cannot be constructed.

> Moreover, there is a rule of pedagogy: repetition legitimizes.
> To get a concept across it often must be presented multiple times.

That's why, if you give an oral presentation, you don't just read
out the manual start to finish. That's why, teaching yourself,
you don't just read the manual page once, start to finish.
On first reading, you skip parts not yet relevant. Later on,
you skip parts you are already familiar with.

But frankly, i hit the opposite problem far, far more often.
In programming practice, it barely happens at all that you cannot
understand a piece of correct, complete, concise documentation
because insufficient pedagogical skill was used in explaining it.
Programming is simple in principle. There is nothing really
complicated like you find in quantum field theory or in other
mathematical theories of similar complexity.

But i hit the opposite problem all the time: that i waste lots of
time figuring out whether i have a complete picture of all features
related to my question in the language or system at hand because the
documentation is too long, being mixed with irrelevant basics and
organizing the material according to some pedagogical idea rather
than systematically.

>> In general, automatically generating documentation is a bad idea.

> This claim is vacuous. We do it all the time. mandoc does it with man
> pages. You intend to say _something_ here, but I'm not sure what.

I'm talking about the text, not about the markup.
Groff, mandoc, pod2man and the like merely translate (human-writeable)
semantic annotations to (machine-readable) formatting instructions.
They do not auto-generate any text, as you are doing with
groff_man / groff_man_style.

> Moreover, your claim, as far as I can interpret it, implies the very
> opposite of your earlier claims. If complete, correct, concise
> documentation were formally modelable,

You understood my point correctly: not being formally modelable
is among the reasons why it needs to be written by hand.
But even though not formally modelable, it still needs to strive
for correctness, completeness, and conciseness as far as possible.

I'm not responding to your attempt at reductio ad absurdum line
by line. Instead, let me just say that many of the goals of
documentation conflict with each other, and then human judgement is
needed to reach a balance - since we are optimizing for multiple
goals at the same time, it cannot be an optimization in the
mathematical sense, and trying to formalize this process of human
judgement often proves counterproductive in practice.

Let me provide some canonical examples.

1. The existence of the SYNOPSIS sections is an example of a
compromise resolving a conflict of the goal of conciseness with
itself. Yes, you are right that manual pages would be shorter
without SYNOPSIS sections and still be complete and correct.
But the gain in conciseness by deleting the SYNOPSIS - not talking
about excessive, multi-line SYNOPSIS sections here - would be very
minor because the SYNOPSIS is usually so short compared to the rest
of the page. On the other hand, the SYNOPSIS provides a huge gain
in conciseness because ever so often, i only need to look at the
SYNOPSIS to have my question answered.

Some people argue for a third level of conciseness vs. completeness,
e.g. the --help option.
I consider that detrimental because it adds a larger amount of text
than the SYNOPSIS for a lesser gain (because you already have both
very concise and very complete docs even without --help).
So personally, i think documentation is better without --help.
But i recognize that is a matter of opinion and some may make a
different judgement call and prefer having this third level of
conciseness, too.

2. Comprehensibility on first, serial reading almost always conflicts
with comprehensibility during repeated study in detail, and again,
striking a balance is needed. For example, before diving into the
options list, there should be a short paragraph stating what the
general purpose and the default behaviour of the program are.
That is a huge gain for first serial reading, also handy for later
revisiting as a concise reminder, and there is no major downside
to a short paragraph. However, there must not be half a tutorial
before getting to the meat of the matter. As another example,
in a section 1 manual, section 5 material (for example the
documentation of a domain-specific language) must not be mixed in
before documenting the program itself and its options.

3. The goal of completeness, as i understand it, implies that
redirecting to different pages should be avoided when possible.
Consequently, the goals of completeness and conciseness are usually
conflicting with each other, and again, judgement is needed to
strike a balance. For example, repeating short, formulaic sentences
where appropriate is usually OK because reading them is much quicker
and simpler than following a redirection; consider typical
EXIT STATUS sections. But copying multiple long paragraphs of text
into two distinct documents is an obvious (and an extreme) instance
of totally missing this balance.

4. SEE ALSO is a typical example where subtleties are often
misunderstood. That section should *not* have the same references
as the main text.
The purpose of that section is to point to other pages that,
by their general topic, are likely of interest to readers of the
present page. It should indeed *not* repeat each and every reference
that was made in view of some specific detail in the main text.
Conversely, many of the entries in SEE ALSO do *not* need to be
repeated anywhere in the main text.

>> So, if authors can't even be bothered to properly *write* the text -
>> knowing that the time for writing it will only be spent once,

> This claim is startling. How much formal writing do you do?

I do it for a living. Right now.

> A significant portion of the time spent in the crafting of any serious
> document is in revision.

Absolutely. My current average for documenting a single function
in a C library is two hours of working time. Time spent on reading
and analyzing the related code and time spent on revision of the
descriptions are of the same order of magnitude, i guess.

But i only spend those two hours once. Maybe some more time may
be added to the bill if a way is found to improve it further, or
if those pesky coders keep revising the API. But whatever version
we are talking about: it gets written once and read many times.

> Even if you didn't think this was true in general, it's plainly
> true of _me_ as any look at groff's commit history
> will attest.

I do think it is true in general, and taking shortcuts during the
arduous process of revising the text is among the most common
reasons why documentation ends up being wordy and awkward.

> They will read it if it effectively communicates what they want to know.
> One measure of effectiveness is how swiftly they can get in and get out
> without the mental effort of maintaining a lot of state; that is, having
> to chase a long chain of cross-references to get to the one unique place
> where a fact is recorded in Bibliotecha Schwarze.

Absolutely.
While providing cross-references that might help in some situations
is valuable, avoiding cross-references that would have to be
followed by nearly every reader is important.

But that doesn't mean duplicating substantial amounts of text is OK.
That ends up having users compare text to figure out what the
differences are. Your example is so bad that people will be tempted
to use diff(1) to cope with it.

> Another factor in readability is the hedonic benefit of experiencing
> the prose. This is a highly subjective factor, and a virtue too often
> absent from technical literature, which is why it has a reputation as
> dry and boring. But it also explains why the most successful works in
> this discipline endure--because the writer is a talented stylist,
> has an agreeable tone and/or sense of humor. Once I learned enough math
> to comprehend portions of Knuth's _Art of Computer Programming_ I was
> surprised--though I should not have been--at how lucid and fun he was to
> read. I suspect that were you to take your editorial approach to his
> work, it would swiftly become unrecognizable.

I don't doubt that, and i concede that the Steve Hensons of this
world are more numerous than the Donald Knuths.

But please, correctness, completeness, and conciseness are
fundamental, critical requirements. If somebody is able to make
the text endearing to read on top of that, all the better.
But artistic finesse is not very valuable when the fundamental
goals are being missed. And all that has nothing to do with
making the reader read the same text twice.

Also, in programming, it would seem more natural to me to search
for hedonic benefit in reading *code* rather than reading
*documentation* - and in that field, the Steve Hensons enjoy
a similar numerical advantage as in documentation. ;-(

>> Yes, the current disaster with groff_man(7) / groff_man_style(7)
>> should be fixed at some point after release.
>> I think Branden probably didn't intend it to stay this way,

> You guess wrong. It's pretty close now to where I intended it, and I'm
> far happier with it now than I was after groff 1.22.4. Amusingly
> enough, it was in part at _your_ insistence that I restructured it as it
> is, with a parent document that uses m4 to generate (1) a reference
> page, whose importance you emphasized, and (2) a pedagogical document
> for man page writers who care nothing about typesetting or any feature
> not directly relevant to their man(7) endeavors.

I'm not so surprised here. We do share lots of goals and even lots
of ideas on how to reach them, even though we disagree in a number
of respects now and then.

>> but did it as an intermediate step in the complex task of
>> disentangling a large and complicated page into two logically separate
>> parts.

> I think the document will continue to evolve, namely to incorporate an
> introduction covering "filling" and "breaking" and other concepts that
> are alien to plain-text-only users.

Explaining that a bit, in a more accessible way than the roff
reference manual explains .fi and .ad and .br and the like, and
adding advice on how to control filling and breaking portably and
readably in a manual page, does seem useful in groff_man_style(7).

It won't be the same text as anywhere else because it serves
different readers for a different purpose. Nobody reading the roff
reference manual will have to fear missing anything by not looking
at groff_man_style(7). Nobody reading groff_man_style(7) will have
to fear missing anything *relevant* by not looking at the roff
reference manual.

But groff_man(7) and groff_man_style(7) target exactly the same
audience for exactly the same purpose. So they should not contain
large amounts of duplicate text.

Yours,
  Ingo