Hi folks,
In the last few weeks, there's been some confusing mention of manpages
on this list. Confusing because some of the issues raised, in my eyes,
aren't really issues at all. So I thought I'd pipe up in the hope of
adding some clarity.
To begin on familiar ground, by manpages I mean man(7), which simplifies
manpages in the style of the original UNIX Programmer's Manual; and
mdoc(7), written for a similar purpose but with hindsight of man(7)'s
ambiguity, such as how to format variable names or structure the manpage
header. mdoc(7) is necessarily more complicated than man(7), but it
significantly relieves authors of stylistic improvisation. Then there's
-Tascii for man(1) and -Tps for dead trees, etc.
(By the way, I was a very small boy when mdoc(7) was written and had
nothing to do with it. Maybe somebody has a handle on one of the
original authors and can corroborate its origins? We already have
http://manpages.bsd.lv/history.html... maybe the same but for macro
packages?)
The confusion in these list threads, as I see it, begins when browsers
are brought into the classical mix of man(7), mdoc(7), -Tps, and
-Tascii. What's also confusing is semantics and the web in general.
Browsers are confusing because HTML doesn't play well with character-driven
media. And roff(7), into which groff(1) translates man(7) and mdoc(7),
is (significantly?) character-driven. We hack around this by converting
-Tascii output into <pre>-wrapped documents. But that's not really HTML
and makes browsers cry.
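
To make the hack above concrete, here is a minimal sketch in Python. It assumes the -Tascii output uses the old overstrike convention (a character, a backspace, then the character again for bold; an underscore, a backspace, then the character for underline). The function names are mine, purely for illustration, and this is not how any particular man-to-HTML tool does it.

```python
import html
import re


def strip_overstrike(text: str) -> str:
    # Drop every "X<backspace>" pair, so "N\bN" (bold) and "_\bN"
    # (underline) both collapse to a plain "N".
    return re.sub(r".\x08", "", text)


def pre_wrap(ascii_output: str) -> str:
    # Escape what is left and wrap it in <pre>. The browser receives a
    # block of fixed-width characters and no structure at all.
    plain = strip_overstrike(ascii_output)
    return "<pre>\n" + html.escape(plain) + "</pre>\n"
```

The result renders, but every heading, flag, and argument is just more text inside the <pre>, which is the whole complaint.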
One solution is to disregard roff(7) and regard only man(7) and mdoc(7).
mandoc(1) does this. It gets away with it because it's built
specifically (and in a way, dumbly) just for man(7) and mdoc(7) and just
enough roff(7), tbl(7), etc. groff(1) is far broader in scope, and
consumes roff(7) as a whole. So it can't exploit this simple trick.
It was suggested that groff(1) be taught a subset of roff(7) that can
map into a tree structure, then compile that further into HTML. If this
is possible (it sounds hard and/or awesome), and if somebody pulls it
off and modifies the existing macros to use the "clean" roff(7), then
groff(1) would map beautifully into HTML and not care whether its input
is mom(7), mdoc(7), or man(7) so long as the underlying tmac file has
been properly treated. That's a lot of work: identifying the relevant
roff(7) macros, then teaching groff(1) to extract a syntax tree from
those macros, then doing something with that syntax tree, then modifying
the macro packages. But it sounds, to my uninformed ear, possible.
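
As a toy illustration of the "tree first, HTML second" idea, here is a sketch in Python. It is not groff(1) and not how mandoc(1) works internally: it understands only four mdoc(7) macros (Sh, Nm, Fl, Ar), one per line, and the mapping to HTML elements is my own assumption.

```python
import html
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "root", "text", or a macro name
    text: str = ""
    children: list = field(default_factory=list)


def parse(source: str) -> Node:
    root = Node("root")
    current = root                 # node currently receiving children
    for line in source.splitlines():
        if line.startswith("."):
            macro, _, args = line[1:].partition(" ")
            if macro == "Sh":      # a section header opens a new subtree
                current = Node("Sh", args)
                root.children.append(current)
            else:
                current.children.append(Node(macro, args))
        elif line:
            current.children.append(Node("text", line))
    return root


# Hypothetical macro-to-element mapping, for illustration only.
TAGS = {"Nm": "b", "Fl": "b", "Ar": "i"}


def render(node: Node) -> str:
    if node.kind == "root":
        return "".join(render(c) for c in node.children)
    if node.kind == "Sh":
        body = " ".join(render(c) for c in node.children)
        return (f"<section><h1>{html.escape(node.text)}</h1>"
                f"<p>{body}</p></section>\n")
    if node.kind == "text":
        return html.escape(node.text)
    text = "-" + node.text if node.kind == "Fl" else node.text
    tag = TAGS.get(node.kind, "span")
    kind = html.escape(node.kind)
    return f'<{tag} class="{kind}">{html.escape(text)}</{tag}>'
```

The point is only that once a tree exists, the HTML back end never sees a character cell; the hard part described above is getting groff(1) to produce such a tree from roff(7) in the first place.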
Unfortunately, that's only half of the confusion. The other half is
"semantics".
Even if groff(1) could do as above, and somehow carry over the original
macro language's "meaning", it'd be only as good as its input language.
To wit, Eric proposed extending man(7) with semantics to address
exactly that. And that would give us... another mdoc(7).
While I agree that mdoc(7) is no semantic saint--sometimes it goes too
far, sometimes not far enough--it exists right now, has considerable
support and inertia, has many eyes on its macros and renderings, and has
demonstrated its capability. mandocdb(8), via mandoc(3), dumps
manpages' semantic content into Berkeley or SQLite databases. (Ingo,
who's captaining mandoc, can speak better on its status, as well as
-Thtml and friends.)
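
To show what semantic content in a database buys you, here is a minimal sketch in Python using SQLite. The table layout, the function names, and the four-macro subset are assumptions of mine for illustration; this is not mandocdb(8)'s actual schema or code.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per semantic macro found in a page.
    db.execute("CREATE TABLE keys (page TEXT, macro TEXT, value TEXT)")
    return db


def index_page(db: sqlite3.Connection, page: str, source: str) -> None:
    for line in source.splitlines():
        if not line.startswith("."):
            continue               # plain text carries no semantic key
        macro, _, args = line[1:].partition(" ")
        if macro in ("Nm", "Fl", "Ar", "Xr"):
            db.execute("INSERT INTO keys VALUES (?, ?, ?)",
                       (page, macro, args))


def pages_with(db: sqlite3.Connection, macro: str, value: str) -> list:
    rows = db.execute(
        "SELECT DISTINCT page FROM keys "
        "WHERE macro = ? AND value = ? ORDER BY page",
        (macro, value))
    return [row[0] for row in rows]
```

With pages indexed this way, a query such as pages_with(db, "Fl", "n") finds pages that document an -n flag and ignores pages that merely contain the letter n, which a search over -Tascii output cannot do.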
And how exactly would groff(1) profit from a new macro language? At the
very least, it'd require a whole new macro package to maintain. And
groff(1) still wouldn't be able to understand semantics without "clean"
roff(7) and considerable work on internals.
And how would the community as a whole benefit? As a language, the new
man(7) wouldn't be much different from mdoc(7). And then there's
balkanisation: we already have two languages for manpages. You're
proposing another?
If semantics and browsers are the future of manpages, then we already
have real, working solutions. We have mdoc(7). And there's at least a
credible plan for modifying groff(1) to support a clean roff(7) which
could be used by both man(7) and mdoc(7). mandoc(1) can already do
this: you can hook into mandoc(3) today and see for yourself.
So in short, why not throw more weight behind mdoc(7) instead of
reinventing the wheel?
If groff(1) gains "clean" roff(7) capabilities, it could hook into
mdoc(7) and man(7) as they live today. There's no need for yet another
language--we already have one that works for many users. And if we find
issues, we can collectively consider how to grow it with the knowledge
of thousands of existing mdoc(7) pages, and the good folks on the BSD
systems who work with them, and their -Thtml output, on a daily basis.
Everybody wins.
Thoughts?
Best,
Kristaps