Hi Alex & Ingo,

At 2025-01-08T19:31:04+0100, Alejandro Colomar wrote:
> If you want an actual manual page where this would make sense, look at
> dl_iterate_phdr(3).

This is a better place to start than the specimen you offered first, in my opinion. As I've said elsewhere, I think the objective of man(7) should be "to get a few nines of the job [of man page formatting] done". mdoc(7) goes for 100%. The impression I get from its advocates is that anything that can't be reasonably achieved in its macro language can go take a flying leap (into some other documentation format, presumably). Hence its ambivalent relationship with tbl(1), for example. https://cvsweb.bsd.lv/mandoc/tbl.c?rev=1.47&content-type=text/x-cvsweb-markup&sortby=date (That said, I wholeheartedly endorse Ingo's view of docbook-to-man.) > int dl_iterate_phdr( > typeof(int (struct dl_phdr_info *info, size_t size, void *data)) > *callback, > void *data); Here's how I'd lay that out using groff Git. $ cat EXPERIMENTS/dl_iterate_phdr.3 .TH dl_iterate_phdr 3 2025-01-09 "groff test suite" .SH Name dl_iterate_phdr \- walk an ELF object yadda yadda yadda .SH Synopsis .B int .SY dl_iterate_phdr ( .BI typeof(int\~struct\~dl_phdr_info\~* info , .BI size_t\~ size , .BI void\~* \~date )) .BI * callback , .BI void\~* data ); Here's how that formats using the default line length. $ nroff -ww -r CHECKSTYLE=4 -man EXPERIMENTS/dl_iterate_phdr.3 dl_iterate_phdr(3) Library Functions Manual dl_iterate_phdr(3) Name dl_iterate_phdr - walk an ELF object yadda yadda yadda Synopsis int dl_iterate_phdr(typeof(int struct dl_phdr_info *info, size_t size, void * date)) *callback, void *data); groff test suite 2025‐01‐09 dl_iterate_phdr(3) And here it is using the traditional (and minimum practical) line length of 65n. 
$ nroff -ww -r CHECKSTYLE=4 -r LL=65n -man EXPERIMENTS/dl_iterate_phdr.3 dl_iterate_phdr(3) Library Functions Manual dl_iterate_phdr(3) Name dl_iterate_phdr - walk an ELF object yadda yadda yadda Synopsis int dl_iterate_phdr(typeof(int struct dl_phdr_info *info, size_t size, void * date)) *callback, void *data); groff test suite 2025‐01‐09 dl_iterate_phdr(3) I get no warnings, style or otherwise, and the formatting looks fine to me. But I think I see what you're talking about. You want to impose an additional constraint on the formatting, such that each formal argument to the function is typeset on one line. I'm not sure that's a reasonable goal. In 1979 when Doug McIlroy wrote man(7), no one would have dreamed of using the lengthy identifiers we have now. Indeed, it apparently took the immediate and explosive popularity of the curses library of 4BSD (1980), which merrily gobbled up symbols from the global C symbol name space like "OK", "ERR", "TRUE", "FALSE", "move", and "refresh" for people to realize that, hmm, the name space is something that might be wanting curation, informally at least. We know that no one dreamed of it because even a decade later, with ANSI C, the wise men stroked their long beards when considering the problem and said, "no, the supported length for symbols with external linkage is still only 6, just like the IBM linkers of old; you can stick a prefix on that, and fight it out among yourselves".[1] Linker vendors must have been the crustiest, most hidebound people on earth in those days. I guess they had previously been compiler people who switched focus after encountering Ada in the early 1980s. ("Boo hoo, the language wants us to do static analysis. So unreasonable.") > There are a few other ones too (some pthread_*() functions have such > long function names that I need to wrap the first parameter). I award the Prolixity Prize to Erlang, and memorialized it in a test script. 
.TH CosNotifyChannelAdmin_StructuredProxyPushSupplier 3erl 2021-05-31 "groff test suite" "Erlang Module Definition" .SH Name CosNotifyChannelAdmin_StructuredProxyPushSupplier \- OMFG At 2025-01-08T20:57:04+0100, Ingo Schwarze wrote: > New syntax ought to support semantic markup. Broadly agree here. Except for my planned "keep macros", `KS` and `KE`, which mandoc(1) can harmlessly ignore forever if it wants. > So you are talking about a combination of very long command names > with very long arguments causing ugly formatting by overrunning the > right margin (in some output modes). > > None of that requires author intervention to solve because if desired, > the macro set can automatically detect overruns and take appropriate > action. That, and the man page author can specify breakpoints with the `\:` escape sequence, which is blessed as portable among man(7) implementations that are actually maintained. groff_man_style(7): \: Insert a non‐printing break point. A word can break at such a point, but a hyphen glyph is not written to the output if it does. The remainder of the word is subject to hyphenation as normal. You can use \: and \% in combination to control breaking of a file name or URI or to permit hyphenation only after certain explicit hyphens within a word. See subsection “Hyperlink macros” above for an example. \: is a GNU extension also supported by Heirloom Doctools troff 050915 (September 2005), mandoc 1.13.1 (2014‐08‐10), and neatroff (commit 399a4936, 2014‐02‐17), but not by Plan 9, Solaris, or Documenter’s Workbench troffs. (I repeat my hedge from a recent thread as to whether/how much Plan 9 troff is maintained. Solaris 10 and DWB troffs are absolutely not. Illumos troff could be but isn't.) 
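A minimal sketch of that advice applied to a long file name in running text. The path is hypothetical, made up only for illustration; the only feature relied on is the `\:` escape sequence described above.

```roff
.\" Hypothetical page fragment; the file name below is invented.
.\" Each \: marks a place where the formatter may break the word
.\" without writing a hyphen to the output.
.P
The defaults are read from
.I /usr/\:share/\:exampleproject/\:templates/\:defaults.conf
at startup.
```

If the whole name fits on the output line, the `\:` escape sequences have no visible effect; they matter only when the line would otherwise overrun.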
> A particularly simple way to achieve that would be to build a maximum > indentation into .SY and let man(7) wrap the line before the arguments > if the length of the command name exceeds that maximum, similar to > what the groff_man(7) manual page describes for .TP, except that a > modern language should not allow the document author to manually > specify the width like .TP does - at least not for a macro that is > intended to be semantic, like .SY. Agree. But also if I'm understanding you correctly, that is already the way the formatter works. roff(7): Once an output line is full, the next word (or remainder of a hyphenated one) is placed on a different output line; this is called a break. In this document and in roff discussions generally, a “break” if not further qualified always refers to the termination of an output line. When the formatter is filling text, it introduces breaks automatically to keep output lines from exceeding the configured line length. After an automatic break, a roff formatter adjusts the line if applicable (see below), and then resumes collecting and filling text on the next output line. groff man(7)'s `SY` macro disables adjustment (because traditionally, no one typesets synopses with adjustment), and therefore you won't suffer any warning if the line can't be adjusted, a problem that threatens with long unbreakable identifiers in other contexts. > So, let's break the line before the first parameter if it would overrun > the right margin (-rLL=NNn), and automagically calculate an appropriate > indentation for the first parameter. > > As for the right indentation, I'd make it the exact indentation that > would make the first parameter touch the right margin, with a minimum > indentation of 2n (being such a rare case, I'd hardcode this value; it > shouldn't be hit under normal conditions). 
Let's write some examples: > > foo baaar | > foo baaaar | > foo baaaaar| > foo | > baaaaaar| > foo | > baaaaaaar| > foo | > baaaaaaaar << Overruns the right margin. > foo | > baaaaaaaaar << Overruns the right margin. <blink> I think that _at most_ I'd be willing to add another formatting-time style register for this. I don't want the man(7) language in which documents are composed to carry this freight. It's too fiddly and subjective. At 2025-01-09T20:07:34+0100, Ingo Schwarze wrote: > Sounds better, but still not like a fully thought-through analysis of > the problem. For example, it's not necessarily the first argument > that is long. Consider this real-world example from an actual manual > page: > > void > SSL_CTX_sess_set_remove_cb(SSL_CTX *ctx, > void (*remove_session_cb)(SSL_CTX *ctx, SSL_SESSION *)); Good example. The already documented example of bsearch() should have primed the ambitious synopsizing page author to consider that. groff_man_style(7): We might synopsize the standard C library function bsearch(3) as follows. .P .B void *\c .SY bsearch ( .BI const\~void\~* key , .BI const\~void\~* base , .BI size_t\~ nmemb , .BI int\~(* compar )\c .B (const\~void\~*, const\~void\~*)); .YS man produces the following result. void *bsearch(const void *key, const void *base, size_t nmemb, int (*compar)(const void *, const void *)); You can see right there that I don't have a problem with a formal argument of pointer-to-function type breaking across a line. In fact, I regard it as likely (when it even comes up). But then, I seem to remember Alex has said repeatedly that he hasn't actually gotten around to _reading_ groff_man_style(7) yet... 
> By the way, in mdoc(7), writing that is totally straightforward > for documentation author: > > .Ft void > .Fo SSL_CTX_sess_set_remove_cb > .Fa "SSL_CTX *ctx" > .Fa "void (*remove_session_cb)(SSL_CTX *ctx, SSL_SESSION *)" > .Fc > > The mdoc(7) language automatically breaks the line before the long > argument, even though it's the second one, and proceed with an > indentation of 4n. $ cat EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man .TH SSL_CTX_sess_set_remove_cb 3 2025-01-09 "groff test suite" .SH Name SSL_CTX_sess_set_remove_cb \- use world's most brilliantly designed API .SH Synopsis .B int .SY SSL_CTX_sess_set_remove_cb ( .BI SSL_CTX\~* ctx .BI void\~(* remove_session_cb ")(SSL_CTX\~*ctx, SSL_SESSION\~*)" .B ) And here's how I'd set it in man(7). Results at 80n and 65n: $ nroff -ww -rCHECKSTYLE=4 -man EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man SSL_CTX_se..._remove_cb(3) Library Functions Manual SSL_CTX_se..._remove_cb(3) Name SSL_CTX_sess_set_remove_cb - use world’s most brilliantly designed API Synopsis int SSL_CTX_sess_set_remove_cb(SSL_CTX *ctx void (*remove_session_cb)(SSL_CTX *ctx, SSL_SESSION *) ) groff test suite 2025‐01‐09 SSL_CTX_se..._remove_cb(3) $ nroff -ww -rCHECKSTYLE=4 -r LL=65n -man EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man troff:EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man:8: warning [page 1, line 9]: cannot break line SSL_CTX...move_cb(3) Library Functions ManualSSL_CTX...move_cb(3) Name SSL_CTX_sess_set_remove_cb - use world’s most brilliantly designed API Synopsis int SSL_CTX_sess_set_remove_cb(SSL_CTX *ctx void (*remove_session_cb)(SSL_CTX *ctx, SSL_SESSION *) ) groff test suite 2025‐01‐09 SSL_CTX...move_cb(3) Aha! I finally had a problem! 
So I make one change: $ diff -u EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man{,.new} --- EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man 2025-01-09 15:16:11.737805662 -0600 +++ EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man.new 2025-01-09 15:19:00.653304820 -0600 @@ -5,5 +5,5 @@ .B int .SY SSL_CTX_sess_set_remove_cb ( .BI SSL_CTX\~* ctx -.BI void\~(* remove_session_cb ")(SSL_CTX\~*ctx, SSL_SESSION\~*)" +.BI void\~(* remove_session_cb ")\:(SSL_CTX\~*ctx, SSL_SESSION\~*)" .B ) $ nroff -ww -rCHECKSTYLE=4 -r LL=65n -man EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man.new SSL_CTX...move_cb(3) Library Functions ManualSSL_CTX...move_cb(3) Name SSL_CTX_sess_set_remove_cb - use world’s most brilliantly designed API Synopsis int SSL_CTX_sess_set_remove_cb(SSL_CTX *ctx void (*remove_session_cb) (SSL_CTX *ctx, SSL_SESSION *) ) groff test suite 2025‐01‐09 SSL_CTX...move_cb(3) And if the closing paren stranded on a line by itself is an annoyance--though I'd consider the fact that it also clarifies that the previous argument is of pointer-to-function type--I can prevent that break too.
$ diff -u EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man.new{,2} --- EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man.new 2025-01-09 15:19:00.653304820 -0600 +++ EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man.new2 2025-01-09 15:22:02.696758082 -0600 @@ -5,5 +5,4 @@ .B int .SY SSL_CTX_sess_set_remove_cb ( .BI SSL_CTX\~* ctx -.BI void\~(* remove_session_cb ")\:(SSL_CTX\~*ctx, SSL_SESSION\~*)" -.B ) +.BI void\~(* remove_session_cb ")\:(SSL_CTX\~*ctx, SSL_SESSION\~*))" $ nroff -ww -rCHECKSTYLE=4 -r LL=65n -man EXPERIMENTS/SSL_CTX_sess_set_remove_cb.man.new2 SSL_CTX...move_cb(3) Library Functions ManualSSL_CTX...move_cb(3) Name SSL_CTX_sess_set_remove_cb - use world’s most brilliantly designed API Synopsis int SSL_CTX_sess_set_remove_cb(SSL_CTX *ctx void (*remove_session_cb) (SSL_CTX *ctx, SSL_SESSION *)) groff test suite 2025‐01‐09 SSL_CTX...move_cb(3) My takeaway from this is a lesson that all typographers seem to acquire and, as a detail-oriented person, that Alex should too: At some point in typesetting we depart the realm of what is procedurally correct in all circumstances and run into corner cases where individualized judgments must be made, balancing semantic clarity with typographic artistry. I'd furthermore articulate the principle that if something is inelegant but clear, even if it requires close reading to yield that clarity, keep it as-is before contorting it in the pursuit of elegance. > Probably, simply always using 4n would look better and more uniform. It is not, however, traditional in man pages. I think, however, that would be easy to support--it sounds like the same feature as above ("I think that _at most_ I'd be willing to add another formatting-time style register for this."). > With flush-right, you might get very large indentations that look > weird. Besides, KISS! - especially considering that name+argument > overflow is an unusual edge case in the first place. Good advice.
> Also, while indentation conventions vary among projects (for example, > BSD uses 8n tabs for statements and 4n for continuations of the same > statement on the next line, whereas groff source code tends to use 2n > troughout IIUC), For stuff Clark originally wrote, yes. Some later contributors, even to files Clark authored, didn't respect his indentation convention (Werner Lemberg, I hasten to add, was _not_ one of these people). Some code that originates elsewhere (like BSD) or is in the contrib directory, doesn't follow Clark's conventions, understandably IMO. I try to respect whatever the prevailing convention is. When I have to break a long line in Clark C/C++ and need to indent it _and_ am not already in a parenthetical context, I use the previous line's indent+4n. As a rule. > If you really want to make the indentation variable in this special > case of name+argument overrun (rather than just using 4n), then > constraining it in the range from 2n to 8n inclusive would make > sense to me because i would consider tab settings outside that > range highly unusual in any source code formatting convention. I don't see any reason that a C function's synopsis in a man page has to exactly duplicate the appearance of its declaration in source code. There are several problems with pursuing a false equivalence here. 1. C/C++ prototypes/declarations generally start in column 1. A man page synopsis will not. 2. Some code wants the function/symbol name to start in column 1, pushing a function's return type to rest alone on the previous line. This is to make them easy to grep(1) (when ctags(1) or similar is unavailable, unused, or eschewed by the callow). But you don't grep(1) man pages this way. 3. Man page text needs to be adaptable to variable line lengths within a reasonable range (65-80n, I say). A code project is either the Wild West or has a single mandatory line length that is enforced on all its developers using whips and thumbscrews. 4. 
Some C/C++ language styles omit the names of formal parameters in declarations (cf. definitions), recording only their types. Stroustrup is famous for this, and Clark followed that convention in much of his code.[2] I personally disagree with it, but more importantly, the practice seems to be much rarer in man pages. I suspect this is because a function's man page often wants to _discuss_ its formal arguments in an unambiguous manner. Consider this synopsis: char *strstr(const char *, const char *); Doesn't illuminate much, does it? Regards, Branden [1] https://stackoverflow.com/questions/38035628/c-why-did-ansi-only-specify-six-characters-for-the-minimum-number-of-significa "peterh" gives the reasonable-sounding, battle-hardened veteran's answer. But we also see John Mashey show up to claim credit (blame?) for the first widely adopted extensions to Nils-Peter Nelson's string.h: strncat and strncpy. I guess strncmp and strnlen came later, possibly from other hands. https://minnie.tuhs.org/cgi-bin/utree.pl?file=pdp11v/usr/include/string.h [2] I suspect this is so that he could more easily tell by eyeballing a header file when he was attempting a function overload that would be invalid because it was duplicative. C++ was initially developed when compilers produced as few diagnostics as possible. Further, the objective of a compiler was to do its damnedest to produce assembly output--ANY assembly output--and not to sweat trivialities like the correctness of the input program. So Stroustrup maybe couldn't count on his compiler (or even his own Cfront) to warn him if he had colliding overloads. Regardless, I think it was a bad tradeoff. One way experienced programmers learn an API is by reading the function declarations. The creator of an API can help people out a lot by picking meaningful names not just for symbols but for formal arguments. 
And in a man page, a formal argument's name doesn't even need to be a valid C identifier; that's why I recommend hyphenated noun phrases for them. That makes them convenient to discuss in the text of the man page. Of course, neither of these practices is the Rock Star Way.