On synopsis grammar (was: Spaces in synopses of commands)

2023-07-31 Thread G. Branden Robinson
[adding groff list so that more people can argue with me, since I once
again found a soapbox to mount]

At 2023-07-30T18:14:53+0200, Alejandro Colomar wrote:
> On 2023-07-30 18:13, G. Branden Robinson wrote:
> > I think this is a matter of achieving an accurate and unambiguous
> > synopsis grammar.
> 
> Thanks; that kind of objective reasoning is what I wanted.  Would you
> mind stating it in the commit message for posterity?  :-)

I think I'll add it to the explanation of the example synopsis in
groff_man_style(7), too.  ;-)

While I'd love for synopsis grammar to be _fully_ unambiguous, one
unfortunate case did arise in discussion with mandoc maintainer Ingo
Schwarze on the groff mailing list in the past year or two.

Consider:

foocmd [-abort] file ...

Is this a command that takes up to 5 different options -a, -b, -o, -r,
-t, or a command that takes one option called "abort"?

A program in the BSD tradition might suggest one answer and a program in
the X11 tradition another.  I assume that this is not a new observation,
and is why the GNU project introduced (or adopted from some
now-forgotten progenitor) the double-dash long-option-name convention.
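
As a rough illustration of the two readings, the following Python sketch
parses the same argument list both ways.  Python's standard getopt
module stands in for a parser in the BSD tradition;
parse_single_dash_long is a hypothetical helper written for this
example, not an X11 or Xt API.

```python
import getopt

argv = ["-abort", "file1", "file2"]

# Reading 1: "-abort" is a cluster of five single-letter options,
# none of which takes an argument.
opts, operands = getopt.getopt(argv, "abort")
cluster_reading = [name for name, _value in opts]

# Reading 2: a hypothetical hand-rolled parser that treats every leading
# word starting with a single dash as one whole option name.
def parse_single_dash_long(args):
    options = []
    rest = list(args)
    while rest and rest[0].startswith("-") and len(rest[0]) > 1:
        options.append(rest.pop(0))
    return options, rest

long_reading, long_operands = parse_single_dash_long(argv)

print(cluster_reading)  # ['-a', '-b', '-o', '-r', '-t']
print(long_reading)     # ['-abort']
```

The synopsis text is identical in both cases; only knowledge of which
parser the program uses tells the reader which result to expect.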

While we could eliminate the ambiguity by insisting upon a practice of
setting each short option in its own set of optional-argument brackets,
that would come at a significant cost in visual clutter.

Consider the groff(1) command, already ornamented richly with options.

groff [-abcCeEgGijklNpRsStUVXzZ] [-d ctext] [-d string=text]
  [-D fallback‐encoding] [-f font‐family] [-F font‐directory]
  [-I inclusion‐directory] [-K input‐encoding]
  [-L spooler‐argument] [-m macro‐package] [-M macro‐directory]
  [-n page‐number] [-o page‐list] [-P postprocessor‐argument]
  [-r cnumeric‐expression] [-r register=numeric‐expression]
  [-T output‐device] [-w warning‐category] [-W warning‐category]
  [file ...]

In a quest for zero ambiguity, we might say:

groff [-a] [-b] [-c] [-C] [-e] [-E] [-g] [-G] [-i] [-j] [-k] [-l]
  [-N] [-p] [-R] [-s] [-S] [-t] [-U] [-V] [-X] [-z] [-Z]
  [-d ctext] [-d string=text] [-D fallback‐encoding]
  [-f font‐family] [-F font‐directory] [-I inclusion‐directory]
  [-K input‐encoding] [-L spooler‐argument] [-m macro‐package]
  [-M macro‐directory] [-n page‐number] [-o page‐list]
  [-P postprocessor‐argument] [-r cnumeric‐expression]
  [-r register=numeric‐expression] [-T output‐device]
  [-w warning‐category] [-W warning‐category] [file ...]

And with that done, we might as well lexicographically order all the
options.

groff [-a] [-b] [-c] [-C] [-d ctext] [-d string=text]
  [-D fallback‐encoding] [-e] [-E] [-f font‐family]
  [-F font‐directory] [-g] [-G] [-i] [-I inclusion‐directory]
  [-j] [-k] [-K input‐encoding] [-l] [-L spooler‐argument]
  [-m macro‐package] [-M macro‐directory] [-n page‐number] [-N]
  [-o page‐list] [-p] [-P postprocessor‐argument]
  [-r cnumeric‐expression] [-r register=numeric‐expression]
  [-R] [-s] [-S] [-t] [-T output‐device] [-U] [-V]
  [-w warning‐category] [-W warning‐category] [-X] [-z] [-Z]
  [file ...]

...but that doesn't seem like an improvement to me.  Options that don't
take arguments are typically of Boolean sense.  (Occasionally, as with
some applications of '-v', they model an incrementation operation of
some kind.)  "Argumentful" options require further decision-making from
the user and it thus seems useful, to me, to segregate the two
categories.  Some traditions evolve for good reasons.  :)

As an aside, one might wonder why the groff(1) page uses such long
metasyntactic variable names in 1.23.0 when it did not in 1.22.4.  After
years of working on groff's ~60 man pages, I came to adopt a handful of
principles.

1.  A command should always offer a usage message via '--help',
presenting a (plain text) synopsis much like the above.

2.  That synopsis, and the one in the corresponding man page, should
match.

3.  A _usage_ message should be _useful_.

$ foo --barblegarg
foo: error: unrecognized option 'barblegarg'
foo: usage: foo [options] [files]

is so un-useful as to be user-hostile.  A programmer who writes this
should be frank about their contempt for the user and drop such
"usage advice" entirely.[1]

Consider the novice user of groff.  They might wonder, "is lowercase
'm' the flag letter for the macro package name and '-M' the one to
add a macro search directory, or the other way around"?  Output like
I presented for it above answers such a question.

4.  A usage message should not dump an _explanation_ of all options.  A
person accustomed to the Unix command line philosophy of "no news is
good news" will rightly be dismayed when a command invocation they
expect to perform some task quietly and return to the shell prompt
instead

Re: Visual Color Reference with Swatches

2023-07-31 Thread Deri
On Monday, 31 July 2023 10:05:18 BST Alexis wrote:
> Hello folks,
> 
> I was looking for a visual reference of all the colors available with
> GNU roff. Yet a quick search on the mailing list archive did not
> produce any meaningful results. If you believe I missed related
> threads I'd appreciate pointers to these.
> 
> Please find attached groff_colors.ms which "creatively" (ab)uses GNU
> roff to produce a PDF with visual color swatches.
> 
> The output of groff_colors.ms can be customized by:
> 
>   * setting the `hex` register to 1 to include the RGB hex code
> for each color in the color swatch
> 
>   * defining the `src` string to display the colors for that output
> device, if that device has a corresponding device.tmac with
> color definitions using the .defcolor request.
> Note that the colors for the ps device are also available for
> the pdf device even though pdf.tmac does not define colors.

pdf.tmac includes ps.tmac so colours are defined.

Cheers

Deri

> Hopefully this proves useful to others.
> 
> 
> Best
> Alexis







Re: Visual Color Reference with Swatches

2023-07-31 Thread Alexis
Thank you for the clarification, Deri, that makes a lot of sense.
I should've read pdf.tmac more closely and not just grep'ed for
'^\.defcolor' before making such a claim :)



Re: Tilde (~) in bash(1) is typeset incorrectly as Unicode character

2023-07-31 Thread Chet Ramey

On 7/29/23 10:05 PM, Bjarni Ingi Gislason wrote:

>    Simply add
>
> .if t .tr ~\(ti
>
>    to "tmac/an.tmac",
> instead of changing (hard coding) it in the sources (man pages).


This is probably excellent advice for the distros, but not something bash
is going to do.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/




boldface, italics, spaces and ellipses in synopses of commands, and *nix history

2023-07-31 Thread G. Branden Robinson
Hi Lennart,

At 2023-07-31T11:48:22+, Lennart Jablonka wrote:
> Quoth G. Branden Robinson:
> > "[-v ...]" would imply that only "-v -v -v" is allowed, instead of
> > "-vvv".
> 
> Only if you can’t group options.

I term this "clustering", but yes.  And most Unix `argv` interpreters
support doing so, at least those that make it into libraries used by
more than one application.
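
A minimal sketch of what clustering means in practice, using Python's
standard getopt module as a stand-in for a library `argv` interpreter
(the verbosity function is a name invented for this example):

```python
import getopt

def verbosity(argv):
    """Count occurrences of -v, however they are spelled on the command line."""
    opts, _operands = getopt.getopt(argv, "v")
    return sum(1 for name, _value in opts if name == "-v")

print(verbosity(["-vvv"]))            # 3
print(verbosity(["-v", "-v", "-v"]))  # 3
print(verbosity(["-vv", "-v"]))       # 3
```

With such a parser the clustered and separated spellings are
indistinguishable to the program, which is why a synopsis need not
enumerate both.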

> It is an issue that there are a few different options syntaxes and
> often, the specific one used is not documented.

I suppose it'd be too demanding to ask authors of command-line programs
to document which they use in the Synopsis sections of their man pages.

Imagine:

Synopsis
 tbl [-C] [file ...]

 tbl --help

 tbl -v
 tbl --version

 Arguments are parsed by getopt_long(3).

Only the last line is not already present.

If a lot of people actually _did_ this, then you'd know from its absence
that someone might be using a hand-rolled argument parser, and to watch
out for land mines for the usual quality-of-implementation issues that
accompany NIH syndrome.

> I’d argue that’s acceptable for those utilities adhering to the POSIX
> Utility Syntax Guidelines;  that is, those that just use getopt.  And
> thus,
> 
>   foobar [-v ...]
> 
>     -v ...  Be more verbose.  This option can be specified
>             multiple times to increase the verbosity level.
> 
> Makes it reasonably clear that you can make it very verbose by both
> -vvv and -v -v -v.

I don't have a mastery of those Guidelines but I accept that they can
contextualize the syntax enough to remove the ambiguity.

> Now, if you do not adhere to the guidelines—if you require -vvv or
> don’t allow grouping or both—you likely want a different synopsis
> syntax anyway: Then, -asdf could be interpreted as “the single-dash
> long options asdf” and you shouldn’t write the short options as -adX.

Yes.  It used to matter a great deal that nearly every X11 client
application in the world applied a different convention here, but a
consistent one thanks to the ubiquity of the X Toolkit Intrinsics
library.  Eventually, when GTK+ and Qt came along, that convention was
discarded.

> None of this invalidates your explanation of ellipses and space
> therebefor.  But I don’t like your explanation.   Point is, I wouldn’t
> have gotten the idea of not putting a space there in the first place:
> An ellipsis is most always delimited by spaces, in synopses as in
> prose.
[rearranged]
> In POSIX, an ellipsis is not italicized and not delimited by spaces,
> as in
> 
>   p̲a̲t̲h̲...
>   [-o f̲o̲r̲m̲a̲t̲]...

Applying the rules of prose here is what makes me nervous about your
interpretation.  And POSIX's, for that matter; `argv` processing is far
less forgiving of whitespace errors than readers of prose are.

I have no serious beef with POSIX if they supply enough context in their
interpretation guidelines for people to make sense of their synopsis
notation.  The hazard lies in people who don't write to those guidelines
thoughtlessly aping POSIX's notation, unmoored from its context.

I find that two rules are popular among software developers.

1.  Don't write technical documentation.
2.  If you find you must write technical documentation, do it badly.

> Now, for opinions differing from yours:  In mdoc world, the ellipses
> frequently are part of the argument, as in
> 
>   .Ar path ...
> 
> and thus also italicized.

I know, and I hate it; the ellipsis doesn't merit italicization any more
than option brackets do.  mdoc(7) is expressive enough that you don't
actually type the brackets; you use its `Op` macro.  To me, it is
telling that, thus having full control of the styling of synopsis
brackets, mdoc's author(s) (most prominently Cynthia Livingston, in the
macro language's second and final edition) set them in roman.[7]

Thanks for raising this.  The fix was straightforward, and you can
expect it in my next push to groff Git.[9]

I speculate that mdoc's italicization of ellipses would have been
regarded as a problem, but that it was overlooked, and here's why.
Chronologically, it appears to me that Livingston was still working in
the pre-x86 Unix tradition, even if at what we now recognize to be its
twilight.  People still had access to, and consulted, typeset manuals.

https://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Tahoe/usr/doc/usd/

4.3BSD-Tahoe was 1988.  mdoc(7) arrived in 4.3BSD-Reno in 1990.

"The manual was intended to be typeset; some detail is sacrificed on
terminals." (man(1), _Unix Time-Sharing System Programmer's Manual_,
Eighth Edition, Volume 1, February 1985)

Try telling the difference between a roman ellipsis and an italic one,
when you can actually get italics instead of underlining.  I surmise
that, as a serious *roff macro package developer, Livingston did more of
her proofreading and validation with respect to typeset, rather than
terminal, output.  At least that's the approach I'd _recommend_: more
can go wrong on a typ

Re: boldface, italics, spaces and ellipses in synopses of commands, and *nix history

2023-07-31 Thread Lennart Jablonka

> > I’d argue that’s acceptable for those utilities adhering to the POSIX
> > Utility Syntax Guidelines;  that is, those that just use getopt.  And
> > thus,
> >
> >   foobar [-v ...]
> >
> >     -v ...  Be more verbose.  This option can be specified
> >             multiple times to increase the verbosity level.
> >
> > Makes it reasonably clear that you can make it very verbose by both
> > -vvv and -v -v -v.
>
> I don't have a mastery of those Guidelines but I accept that they can
> contextualize the syntax enough to remove the ambiguity.


Actually, reading it again, I would just drop the ellipsis.

foobar [-v]

 -v  Be more verbose.  This option can be specified
     multiple times to increase the verbosity level.


> > None of this invalidates your explanation of ellipses and space
> > therebefor.  But I don’t like your explanation.   Point is, I wouldn’t
> > have gotten the idea of not putting a space there in the first place:
> > An ellipsis is most always delimited by spaces, in synopses as in
> > prose.
>
> [rearranged]
>
> > In POSIX, an ellipsis is not italicized and not delimited by spaces,
> > as in
> >
> >   p̲a̲t̲h̲...
> >   [-o f̲o̲r̲m̲a̲t̲]...
>
> Applying the rules of prose here is what makes me nervous about your
> interpretation.  And POSIX's, for that matter; `argv` processing is far
> less forgiving of whitespace errors than readers of prose are.


Oh, but synopses are prose!   You say that you want unambiguous 
synopses, I agree;  but you still need to read the options’ 
descriptions to know all about the utility’s command-line syntax.   
You don’t embed regular expressions for the arguments in the 
synopsis.   While there are styles for specifying different 
requirements, like having multiple symbolic command lines for 
alternatives with largely different syntaxes (e.g., a symbolic 
command line per sub-command), there are still requirements only 
expressed in the DESCRIPTION:  One option might be optional unless 
a certain other option is given.   You wouldn’t write 
[-a] [-b -a], you’d just write [-a] [-b], or rather [-ab].   If 
you should follow quite strict rules in your synopses (and you 
should), it still is more or less free-form.   And I’m not 
a forgiving reader of prose with whitespace errors.



> I have no serious beef with POSIX if they supply enough context in their
> interpretation guidelines for people to make sense of their synopsis
> notation.  The hazard lies in people who don't write to those guidelines
> thoughtlessly aping POSIX's notation, unmoored from its context.
>
> I find that two rules are popular among software developers.
>
> 1.  Don't write technical documentation.
> 2.  If you find you must write technical documentation, do it badly.


Exactly.   Which means:  All we can and something we should do is 
have guidelines.   Saying nothing of what those guidelines should 
be.



> > Now, for opinions differing from yours:  In mdoc world, the ellipses
> > frequently are part of the argument, as in
> >
> >   .Ar path ...
> >
> > and thus also italicized.
>
> I know, and I hate it; the ellipsis doesn't merit italicization any more
> than option brackets do.  mdoc(7) is expressive enough that you don't
> actually type the brackets; you use its `Op` macro.  To me, it is
> telling that, thus having full control of the styling of synopsis
> brackets, mdoc's author(s) (most prominently Cynthia Livingston, in the
> macro language's second and final edition) set them in roman.[7]


I’m inclined to agree, though here I value convention over 
taste—and the convention for mdoc manuals has been from the start 
to italicize the ellipses.



> Thanks for raising this.  The fix was straightforward, and you can
> expect it in my next push to groff Git.[9]
>
> [9]

> diff --git a/tmac/doc.tmac b/tmac/doc.tmac
> index 70ec41ea2..6267d2a08 100644
> --- a/tmac/doc.tmac
> +++ b/tmac/doc.tmac
> @@ -359,7 +359,7 @@
>  .  ie "\$1"|" \
>  .    ds doc-arg\n[doc-arg-count] \f[R]|\f[]
>  .  el \{ .ie "\$1"..." \
> -.    ds doc-arg\n[doc-arg-count] \|.\|.\|.
> +.    ds doc-arg\n[doc-arg-count] \f[R]\|.\|.\|.\f[]
>  .  el \
>  .    ds doc-arg\n[doc-arg-count] "\$1
>  .  \}


Heh, that sneaky spreading of ellipses is funny.   I don’t think it 
should be there, but whatever.   Do note that this only works if 
the ellipsis is its own argument:  Not having read doc.tmac more 
closely this looks like it somewhat contradicts your position on 
spaces with ellipses.


foobar [-o option] ...  \" .Ar ...

 -o option  Apply more options.  Because there’s only so
            much space for single-letter options, you see.

The -o options:

.\" .Ar v...
 v...  Be verbose.  The more v, the more verbose.

Oops, now the first ellipsis looks better than the second one.   
Yeah, I dislike that hack.



> I speculate that mdoc's italicization of ellipses would have been
> regarded as a problem, but that it was overlooked, and here's why.


This sounds like a reasonable explanation.


> Chronologically, it appears to me that Livingston was still working in
> the pre-x86 Unix trad

Re: boldface, italics, spaces and ellipses in synopses of commands, and *nix history

2023-07-31 Thread G. Branden Robinson
Hi Lennart,

At 2023-07-31T18:40:11+, Lennart Jablonka wrote:
> > > I’d argue that’s acceptable for those utilities adhering to the
> > > POSIX Utility Syntax Guidelines;  that is, those that just use
> > > getopt.
[...]
> Actually, reading it again, I would just drop the ellipsis.
> 
>   foobar [-v]
> 
>     -v  Be more verbose.  This option can be specified
>         multiple times to increase the verbosity level.

That's fine.  Syntactical issues can always be kicked upstairs to the
semantic level.  At some cost.

> > Applying the rules of prose here is what makes me nervous about your
> > interpretation.  And POSIX's, for that matter; `argv` processing is
> > far less forgiving of whitespace errors than readers of prose are.
> 
> Oh, but synopses are prose!

Maybe we're using different definitions of "prose".  I reckon I'm using
it to mean "sentential, natural language utterances that are not verse".

> You say that you want unambiguous synopses, I agree;  but you still
> need to read the options’ descriptions to know all about the utility’s
> command-line syntax.

Normally, you don't.  Not to grasp its _syntax_.

> You don’t embed regular expressions for the arguments in the synopsis.

No, because synopsis syntax maps to a subset of regular expression
syntax.

foo -> foo
foo? -> [foo]
foo+ -> foo ...
foo* -> [foo] ...
(foo|bar)? -> [foo|bar]
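
The table above is mechanical enough to sketch in a few lines of Python.
The function name regex_to_synopsis is invented for this illustration,
and it handles only a single term with an optional trailing quantifier,
not general regular expressions.

```python
def regex_to_synopsis(term):
    """Translate one quantified regex term into synopsis notation."""
    if not term:
        raise ValueError("empty term")
    quantifier = term[-1] if term[-1] in "?+*" else ""
    body = term[:-1] if quantifier else term
    if quantifier in ("?", "*"):
        # Option brackets take over the grouping role of parentheses.
        if body.startswith("(") and body.endswith(")"):
            body = body[1:-1]
        bracketed = "[" + body + "]"
        return bracketed if quantifier == "?" else bracketed + " ..."
    if quantifier == "+":
        return body + " ..."
    return body

for term in ["foo", "foo?", "foo+", "foo*", "(foo|bar)?"]:
    print(term, "->", regex_to_synopsis(term))
```

Each printed line reproduces the corresponding row of the table.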

It's a little clumsier, formally, to express alternation of a mandatory
argument.

baz (foo|bar) -> baz foo
                 baz bar

This is a concern mostly for interfaces like tar(1)'s, which seems to
trip up everybody when they first learn it.

> While there are styles for specifying different requirements, like
> having multiple symbolic command lines for alternatives with largely
> different syntaxes (e.g., a symbolic command line per sub-command),

A gentler version of this is seen in grotty(1), where the availability
of certain options is modal.

grotty [-dfho] [-i|-r] [-F dir] [file ...]

grotty -c [-bBdfhouU] [-F dir] [file ...]

grotty --help

grotty -v
grotty --version
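
To show how much of the modal constraint the first two synopsis lines
already carry, here is a hedged Python sketch that checks an argument
list against them.  The option letters are copied from the synopses;
the function synopsis_mode and its checking logic are hypothetical and
are not how grotty itself parses arguments.  The --help, -v, and
--version forms are left out.

```python
import getopt

# Option letters are copied from the two synopses above.  The checking
# logic is a hypothetical illustration, not grotty's actual code.
WITHOUT_C = set("dfhoir")
WITH_C = set("bBdfhouU")

def synopsis_mode(argv):
    """Return which synopsis argv matches, or raise ValueError."""
    opts, _operands = getopt.getopt(argv, "bBcdfhiorUuF:")
    letters = {name[1] for name, _value in opts}
    uses_c = "c" in letters
    allowed = WITH_C if uses_c else WITHOUT_C
    # -c selects the mode and -F appears in both synopses.
    stray = letters - allowed - {"c", "F"}
    if stray:
        raise ValueError("not valid in this mode: " + " ".join(sorted(stray)))
    if {"i", "r"} <= letters:
        raise ValueError("-i and -r are mutually exclusive")
    return "with -c" if uses_c else "without -c"

print(synopsis_mode(["-c", "-bu", "file"]))
print(synopsis_mode(["-dfi", "-F", "dir"]))
```

Everything the function rejects can be read off the synopsis alone,
without consulting the option descriptions.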

> there are still requirements only expressed in the DESCRIPTION:  One
> option might be optional unless a certain other option is given.

You can get a fair way toward that; see above.

> You wouldn’t write [-a] [-b -a], you’d just write [-a]
> [-b], or rather [-ab].

I might critique this if I understood what you were trying to
express with it.

> If you should follow quite strict rules in your synopses (and you
> should), it still is more or less free-form.

I invite you to review the synopsis suggestions offered in
groff_man_style(7) and point out any ambiguity problems more severe than
those of POSIX's Guidelines conventions.  I didn't have those in mind
when writing them--just general practice, which is chaotic and not
difficult to improve upon.

> Exactly.   Which means:  All we can and something we should do is have
> guidelines.   Saying nothing of what those guidelines should be.

We can say more than that.  If no objective criteria for comparing
synopsis language conventions exist, then let's develop some.

> > I know, and I hate it; the ellipsis doesn't merit italicization any
> > more than option brackets do.  mdoc(7) is expressive enough that you
> > don't actually type the brackets; you use its `Op` macro.  To me, it
> > is telling that, thus having full control of the styling of synopsis
> > brackets, mdoc's author(s) (most prominently Cynthia Livingston, in
> > the macro language's second and final edition) set them in roman.[7]
> 
> I’m inclined to agree, though here I value convention over taste—and
> the convention for mdoc manuals has been from the start to italicize
> the ellipses.

I'm not anxious to sacrifice so much to taste.  :)

> > Thanks for raising this.  The fix was straightforward, and you can
> > expect it in my next push to groff Git.[9]

I may not be willing to die on this hill, but I'm willing to risk being
MEDEVACed from it.  It won't matter, if everybody who really cares about
mdoc(7) is loyal to mandoc(1) unto death, which seems to approximately
be the case.

My goal is that it not be obvious to the casual reader of a man page
whether man(7) or mdoc(7) was used to compose it.  So I'm not tolerant
of gratuitous rendering differences, and in groff 1.23.0 I made changes
to both packages to align them better with each other.  I'm happy if
only an expert man page author can tell without looking at the source.

> Heh, that sneaky spreading of ellipses is funny.

Alex had a comment about that too, and I remember making a mental note
to respond to it, but not actually doing so.  So I will now.  You may
already know this, but on terminal devices, \| (like \^) does not render
as anything at all.  So we lose the interest of maybe 95% of the man
page reading community right away.

Illustration attached, in source and a PNG of rendered PostScript.

> I don’t think it should be there, but whatever.

It's a practice that has at l

Re: [PATCH v2] man*/: ffix (migrate to `MR`)

2023-07-31 Thread Alejandro Colomar
Hi Jakub, Branden, Ingo

On 2023-08-01 00:16, Jakub Wilk wrote:
> * G. Branden Robinson , 2023-07-31 12:52:
>> Use the man(7) macro `MR`, new to groff 1.23.0,
> 
> Given that this version of groff was released approximately yesterday¹, 
> this is very premature.
> 
> NACK from me.

I included that, and the reason, in the commit message.  It's in the MR
branch in my private repo, as I mentioned in a reply to Branden:


My server is HTTP-only, but the commit should be signed with my PGP
signature, so it should be safe to check anything from my git:

$ git show --pretty=fuller --show-signature 
commit d4a22d4645184c205a04477ee84b0ee429fb6200 (HEAD -> MR, alx/MR)
gpg: Signature made Tue Aug  1 01:19:00 2023 CEST
gpg:                using RSA key EA3A87F0A4EBA030E45DF2409E8C1AFBBEFFDB32
gpg: Good signature from "Alejandro Colomar " [ultimate]
gpg:                 aka "Alejandro Colomar Andres " [ultimate]
Author: G. Branden Robinson 
AuthorDate: Mon Jul 31 12:52:51 2023 -0500
Commit: Alejandro Colomar 
CommitDate: Tue Aug 1 01:18:59 2023 +0200

man*/: ffix (migrate to `MR`)

[...]

Signed-off-by: "G. Branden Robinson" 
[Jakub has concerns that groff-1.23.0 was released too recently]
Nacked-by: Jakub Wilk 
[alx: Added quote from gbr documenting how he tested for regressions]
Signed-off-by: Alejandro Colomar 


(While preparing this email, I noticed I hadn't noted Branden's
authorship while committing, so I've amended the commit; luckily it
wasn't on kernel.org.  I'm sorry if I caused any inconvenience to
anyone fetching from my repo.)

> 
>> When the text of all Linux man-pages documents (excluding those 
>> containing only `so` requests) is dumped, with adjustment mode 'l' 
>> ("-dAD=l") and automatic hyphenation disabled ("-rHY=0") before and 
>> after this change, there is no change to rendered output.
> 
> That's not what I'm seeing with Debian groff 1.22.4-10 (which seems to 
> have .MR backported).
> 
> After applying the patch, the man page references are typeset in 
> italics, which is ugly and against man-pages(7) recommendations.

I guess he meant no regressions other than the intended formatting
change.  Branden, I find that this isn't really documented in the
commit message and it should be.  We probably thought it was obvious,
but Jakub is right there.

I would be worried if any difference remained after removing
formatting, or more precisely, if any differences remained after
configuring MR to use bold.

However: Branden, I suggest you satisfy Jakub by showing more proof that
there are no regressions, and document the intentional regressions
more explicitly (basically that we're changing to italics).

Jakub, you (or distributors) can always change the meaning of MR to
perform bold instead of italics.  Just in case you didn't know.
Although if you didn't, maybe it's a sign that it should be more
thoroughly documented in this patch.

I started CCing Ingo in these discussions to let him know that D-day
has come, and we would appreciate mandoc(1) support for `MR`.

Cheers,
Alex

> 
> 
> ¹ More precisely, about a month ago.
> 

-- 

GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5





Re: boldface, italics, spaces and ellipses in synopses of commands, and *nix history

2023-07-31 Thread Alejandro Colomar
Hi Branden,

On 2023-08-01 00:20, G. Branden Robinson wrote:

>>> Thanks for raising this.  The fix was straightforward, and you can
>>> expect it in my next push to groff Git.[9]
> 
> I may not be willing to die on this hill, but I'm willing to risk being
> MEDEVACed from it.  It won't matter, if everybody who really cares about
> mdoc(7) is loyal to mandoc(1) unto death, which seems to approximately
> be the case.
> 
> My goal is that it not be obvious to the casual reader of a man page
> whether man(7) or mdoc(7) was used to compose it.  So I'm not tolerant
> of gratuitous rendering differences, and in groff 1.23.0 I made changes
> to both packages to align them better with each other.  I'm happy if
> only an expert man page author can tell without looking at the source.

Function prototypes are the biggest difference, IMO.  I prefer how man(7)
pages show function prototypes (the type and the variable are formatted
differently).  Though I'll give to mdoc(7) that parentheses and commas in
roman are nice.

.3 pages are easily distinguished in the first screenful of text
without looking at the source, in the SYNOPSIS.

Cheers,
Alex

-- 

GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5





Re: [PATCH v2] man*/: ffix (migrate to `MR`)

2023-07-31 Thread Alejandro Colomar
Hi Branden,

On 2023-08-01 00:50, G. Branden Robinson wrote:
> Hi Alex,
> 
> At 2023-07-31T23:47:50+0200, Alejandro Colomar wrote:
>>> When the text of all Linux man-pages documents (excluding those
>>> containing only `so` requests) is dumped, with adjustment mode 'l'
>>> ("-dAD=l") and automatic hyphenation disabled ("-rHY=0") before and
>>> after this change, there is no change to rendered output.
>>
>> It would be interesting to see a script that corroborates the above
>> paragraph.  It might help other projects that may want to migrate to
>> MR.
> 
> Sure.  I used a couple of scripts.
> 
>   $ cat ATTIC/dump-pages.sh
>   #!/bin/sh
> 
>   pages=$(grep -L '^\.so ' man*/* | sort)
>   groff -t "$@" -m andoc -T utf8 -P -cbou $pages
> 
>   $ cat ATTIC/dump-pages-left-adjustment-no-hyphenation.sh
>   #!/bin/sh
> 
>   pages=$(grep -L '^\.so ' man*/* | sort)
>   groff -t -dAD=l -rHY=0 -m andoc -T utf8 -P -cbou $pages
> 
> And here's how I ran them.
> 
>   sh ATTIC/dump-pages.sh >| DUMP1
>   sed -i -f ./ATTIC/MR.sed $(grep -L '^\.so ' man*/*)
>   sh ATTIC/dump-pages-left-adjustment-no-hyphenation.sh >| DUMP2
>   diff -U0 -b DUMP1 DUMP2 | less -R
> 
> That confirmed that there were "no changes" (with the caveat noted
> above).
> 
>   sh ATTIC/dump-pages.sh >| DUMP2
>   diff -U0 -b DUMP1 DUMP2 | less -R
>   diff -U0 -b DUMP1 DUMP2 | wc -l
> 
> I used these to eyeball and measure whether there were any formatting
> changes even with default adjustment and hyphenation enabled.  It showed
> me _tons_ of man page names no longer getting broken (and hyphenated)
> across lines, and nothing else that I noticed.
> 
> With the previous empty diff in hand, I decided that I hadn't regressed
> the text of the pages.
> 
> If there are further sanity checks we can apply, I'm open to
> suggestions.

Nah, I eyeballed random samples of the diff and it looked good.  That,
and your extensive tests, make me confident enough.  If we screwed
anything up, we can fix it.

The only concern I had some time ago was with code like exit(1), but
that should be using italics today, so it shouldn't be a problem.  I
can't imagine big issues.
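
For projects that would rather script the before/after comparison than
eyeball it, the idea behind "diff -U0 -b DUMP1 DUMP2" can be sketched
in Python with the standard difflib module.  The function names are
invented for this example, and the whitespace normalization is only an
approximation of what "diff -b" ignores (it also discards leading
whitespace, so it is slightly more lenient).

```python
import difflib

def normalize(line):
    # Collapse runs of whitespace; roughly what "diff -b" ignores.
    return " ".join(line.split())

def changed_lines(before, after):
    """Return the added and removed lines between two text dumps."""
    a = [normalize(line) for line in before.splitlines()]
    b = [normalize(line) for line in after.splitlines()]
    diff = list(difflib.unified_diff(a, b, "DUMP1", "DUMP2", n=0, lineterm=""))
    # The first two lines, when present, are the file headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

before = "see foo(1)  for details\nbar\n"
after = "see foo(1) for details\nbaz\n"
print(changed_lines(before, after))  # ['-bar', '+baz']
```

An empty list corresponds to the empty diff described in the quoted
procedure.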

> 
> Since you had me looking at my shell history, I'll share that I did a
> "git co ." (co = alias for "checkout") 18 times in the course of
> developing MR.sed.  Those drove most of my recent patch submissions
> immediately prior to this one.  I could have done 18 more without
> fatiguing (albeit not necessarily without frustration with myself for not
> getting my sed right).  But that's the beauty of sed, and
> Bash/readline's "reverse-search-history" and "operate-and-get-next"
> features.
> 
> As it turned out, my sed was pretty good, except for the missing use
> case you identified, and my fix for which worked on the first try.  The
> irregularity of the page inputs was the tricky bit.
> 
> At one point I had a fearful episode that I'd misdesigned `MR` for one
> scenario, and much like the Master being terrorized by the Keller
> Machine, I had visions of the Doctor (Ingo Schwarze) laughing at me and
> telling me he told me so and winning the whole world over to mdoc(7) in
> one stroke.  But it was fine (attached).
> 
> There are _still_ some `ad` requests scattered around (outside of tbl(1)
> text blocks), but I didn't go after those because they weren't in the
> way of my objective.  Eventually it'd be good to scrub those too.
> 
>>> I prepared this change with the following GNU sed script.
>>>
>>> \# Handle simplest cases: ".BR foo (1)" and ".IR foo (1)".
>>
>> What I do to avoid git messing with these comments is to write a
>> leading space.  For git, only '#' in column 1 is special.  Since most
>> compilers and interpreters allow a space before a commented line, a
>> leading space is fine.
> 
> Ahh.  A leading backslash is the only workaround I've ever noticed.
> 
>> I've edited the commit message to have spaces, so that it's directly
>> pastable into a MR.sed script.  Oh, and I included "$ cat MR.sed;" in
>> the commit message; I couldn't not do it.  :)
> 
> No worries. :)
> 
>> I've applied the patch (or rather, the script), but won't push it yet.
>> If you send a run of commands that prove no differences before and
>> after, I'll amend the commit message with it.
> 
> Please do verify it yourself with the tools above (or better ones).  I'm
> well aware that this is a huge change that can make people nervous.

I applied the patch, amended the message with a quote from this email,
and pushed to the MR branch in my private git repo at
.

Oh, and I also removed a few pages from your patch, per CONTRIBUTING
guidelines:

Notes
   External and autogenerated pages
   A few pages come from external sources.  Fixes to the pages
   should really go to the upstream source.

   tzfile(5), tzselect(8), zdump(8), and zic(8) come from the tz
   project .

   bpf-helpers(7) is autogener