Hi Branden,

On Tue Jan 21, 2025 at 6:59 AM CET, G. Branden Robinson wrote:
> At 2025-01-20T01:48:19+0100, onf wrote:
> > Actually, BSD mandoc does implement this, it's just documented at
> > a poorly visible place in the docs. BSD mandoc's man(1):
> >   MANPAGER
> >       Any non-empty value of the environment variable MANPAGER is
> >       used instead of the standard pagination program, less(1).  If
> >       less(1) is used, the interactive :t command can be used to go
> >       to the definitions of various terms, for example command line
> >       options, command modifiers, internal commands, environment
> >       variables, function names, preprocessor macros, errno(2)
> >       values, and some other emphasized words.  Some terms may have
> >       defining text at more than one place.  In that case, the
> >       less(1) interactive commands t and T can be used to move to the
> >       next and to the previous place providing information about the
> >       term last searched for with :t.  The -O tag[=term] option
> >       documented in the mandoc(1) manual opens a manual page at the
> >       definition of a specific term rather than at the beginning.
> > 
> > And it works quite nicely, actually. The definitions are generated
> > automatically, so all manpages written in mdoc benefit from it.
> > I assume groff mdoc + man-db doesn't implement this?
>
> I'm working on it.

[rearranging]

> There are a few remaining problems to be solved.
>
> A.  Generation of _unique_ hyperlink tags from #2-#4 above.  There will
>     be collisions galore under item 2 when multiple man pages are
>     rendered.  A page can conceivably collide with itself with respect
>     to items #3 and #4.  So we probably want a hierarchical
>     tag representation: page-name/section/subsection/tag-item, where
>     this structure is truncatable at any point after the first slash but
>     is otherwise invariant.
>
> B.  We need a predictable means of generating hyperlink tag identifiers
>     that is also flexible enough to accommodate non-English languages
>     and weird characters that people might populate their (sub)section
>     titles or paragraph tags with. [...]

You seem to be talking about HTML or PDF links. As a matter of fact, the
only time I turn to an HTML manpage is when I don't have the page
installed locally, and the only time I turn to a PDF one is when I want
to print it, so PDF links in particular have little value to me. That's
just my experience, though.

What I was talking about were less(1) tags, which are much more useful
than HTML or PDF links, because they significantly ease navigation of
manpages in the terminal, which is THE way manpages are read and should
thus be the primary focus. You don't seem to mention plans to support
them.

> [requoting]
> > The definitions are generated automatically
>
> That's the rub.  We need a design for automatic construction of
> tag/anchor names from the user-specified names of the items to be
> tagged.  In man(7) documents, those taggable items are probably going to
> be:
>
> 1.  the identifier of the page itself, with "section" number;
> 2.  section heading text;
> 3.  subsection heading text; and
> 4.  the tag text of tagged paragraphs (`TP`).
>
> Item #1 has already been done for several months and works fine; it can
> be observed in any "groff-man-pages.pdf" document built from Git.
> Cross-references between man(7) and mdoc(7) are supported.

Don't get me wrong, it's nice that one can link to a specific
(sub)section of an HTML manpage, but that's completely missing the
point of the feature I was actually talking about, which is the ability
to jump to the definition of a term in the manpage. For instance, when
I want to see the description of register .j in groff(7), I have to do:
  /\\n\[\.j

to locate it. If it were written in mdoc, I could simply do:
  :t .j

Besides the tag lookup being shorter, I wouldn't know to search for
\\n\[\.j had I not already known how groff(7) is written, whereas I
would know to search for .j.

> C.  We then need a way to make references to these anchors/tags.  For
>     man(7) the `MR` macro new to groff 1.23 was an obvious site to add
>     the appropriate machinery for document-level links.  mdoc(7)'s `Xr`
>     is closely analogous and has existed for many years.  In the
>     forthcoming groff 1.24 (and in Git right now), they automatically
>     supply hyperlink information for output devices that support such.
>     (Just PDF and terminals.)
>
>     But there remain two gaps.
>
>     i.  No way to hyperlink in a more fine-grained way, that is to
>         (sub)section headings or, conceivably, to paragraph tags.  This
>         is a tougher problem because if these are not unique within a
>         page, the location making the link has to know about the
>         structure of the document.  Possibly, we'll just punt on the
>         issue of "deep" cross-document links.
>
>         mdoc(7) doesn't bother to support that; its `Sx` macro doesn't
>         contemplate pointing into another document.[4] I notice that it,
>         too, doesn't address the problem of duplicate heading names and
>         therefore ambiguous references.

I agree it would be nice if one could link to subsections and, more
importantly, terms within other manpages. As a matter of fact though,
man(7) can't even tag terms within the same page.

I've spent some time writing mdoc(7) lately while working on a reference
for neatroff, and I guess I just don't get why anyone is using man(7)
anymore. I'm not saying mdoc is perfect; it certainly doesn't afford me
the level of control I am used to from writing plain *roff, but it makes
up for that with its descriptiveness, relative ease of use, and
especially the terms/tags, which are incredibly practical. I can imagine
the language being much more approachable for people lacking knowledge
of *roff, too.

>         Because mdoc(7) culture is
>         rigidly prescriptive, its section headings are tightly
>         controlled, and I expect that this problem only threatens when
>         subsections are used (and referenced).

Although mdoc(7) says something to the effect of:
  For a list of conventional manual sections, see MANUAL STRUCTURE.
  These sections should be used unless it's absolutely necessary
  that custom sections be used.

...in reality its own section headings include non-standard ones:
  Name
  Description
  Manual Structure
  Macro Overview
  Macro Reference
  Macro Syntax
  Compatibility
  See Also
  History
  Authors

I think the point is more about sticking to conventional section names
if possible than about forbidding non-standard ones.

> As I understand mandoc(1)'s less(1)-integrated tagging feature, none of
> the problems above are mitigated by feeding the pager an auxiliary tags
> file (less(1)'s `-T` option). [...]

The tags file allows multiple tags with the same name, which can then be
navigated using the t (next tag) and T (previous tag) commands.

~ onf
