Why groff ms doesn't completely support historical documents

G. Branden Robinson Sat, 05 Oct 2024 22:53:43 -0700

Someone on the TUHS list mailed me privately, prompting me to
write this lengthy apology (in the classical sense) for why groff doesn't
make a certain application easier.  I have slightly revised my response.


This message also may serve as a summary of the challenges that need to
be overcome if someone else wants to tackle the job, and potentially
contribute it to groff.

[person creates PDFs of historical Unix documents (many of which are
written using the ms macros) and wishes groff ms made the task easier]

I sympathize.  I sometimes render historical documents, so I prescribed
in groff ms's documentation the approach that I take myself.  I decided
against trying to support a "-matt" or "-msatt" option in groff because
it's flatly impossible to know which definition of `UX` to use.  Even a
date declaration in the document sheds little light, as we then have to
consider the question of whether we want fidelity to the actual state of
the mark at the time of that declared date, or to what would have been
rendered in the author's environment--and they may have been using an ms
that wasn't "up to date" in the same respect.  That information, too, is
not recorded in the document.[1]

Providing all the macros _except_ `UX` didn't seem likely to satisfy
users since that's the most important one!  It shows up in body text
whereas all the others seldom do--if you can live without the cover page
then, often, you're golden.  Except for `UX`.

Finally there is the name collision problem with Berkeley.  4.2BSD and
later ms defined `CT` and `TM` macros (aspects of their "thesis mode")
and once again there's no declarator within the document to tell you
which dialect of ms is in use.  This one can be heuristically figured
out with pretty good odds, I suspect, but troff works as a filter--what
was I going to do, write a preprocessor just for this?

(Hmm, maybe grog(1) could do it, and that would be in its wheelhouse.
But there's no point until and unless we reimplement support for
Berkeley thesis mode in the first place [so that grog has an option
argument to report], and that is an undertaking at which I have
demurred.[2])
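
To make the detection idea concrete, here is a minimal sketch in Python
of what the first step of such a heuristic might look like.  It is not
anything groff or grog does today.  It only inventories calls to macro
names that collide across ms dialects and records their arguments; it
makes no attempt to decide which dialect is in use.  The names
`inventory` and `AMBIGUOUS` are hypothetical, and the set of macro names
is illustrative rather than exhaustive.

```python
import re
from collections import defaultdict

# Macro names that, per the discussion above, mean different things (or
# are absent) depending on the ms dialect.  Illustrative, not exhaustive.
AMBIGUOUS = {"CT", "TM", "UX"}

# A macro call: a control character at the start of the line, optional
# spaces or tabs, a one- or two-character name, then any arguments.
CALL = re.compile(r"^[.'][ \t]*([A-Za-z0-9]{1,2})(?![A-Za-z0-9])[ \t]*(.*)$")


def inventory(lines):
    """Map each ambiguous macro name to a list of (line number, arguments)."""
    found = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        match = CALL.match(line.rstrip("\n"))
        if match and match.group(1) in AMBIGUOUS:
            found[match.group(1)].append((number, match.group(2)))
    return dict(found)


if __name__ == "__main__":
    import sys

    # Report every collision-prone call found on standard input.
    for name, calls in sorted(inventory(sys.stdin).items()):
        for number, args in calls:
            print(f"{name}\tline {number}\t{args}")
```

A human (or a later heuristic) would still have to look at the reported
arguments and surrounding context to guess the dialect.  The sketch also
ignores documents that define their own versions of these macros with
`de`, which a real tool would need to account for.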

It seemed like a moderate amount of work for almost zero upside.  It's
also hard to validate/verify my work.  The only historical troffs to
which I have access are Seventh Edition Unix troff (1979, before
Kernighan) and DWB 3.3 (early 1990s).  It's a right pain in the butt to
inspect typesetter output on V7 because I have nothing that emulates a
C/A/T or translates it to device-independent troff output for a
"ditroff"-style device description that Kernighan troff, DWB/Heirloom
Doctools troff, or GNU troff could use.

And even if I had either of those, they'd have to be vetted to a _high_
degree of quality before they'd be fit for purpose; else I wouldn't know
whether I was chasing bugs in the groff ms macros or the C/A/T
emulator/translator.

So, to summarize, I confine my compatibility efforts to _nroff_ output,
and rule the Bell Labs "site" macros out of scope.  I feel there is not
much more I can do, with confidence in my results, without resources
that I'm lacking.

I hope this sheds some light on my reasoning.

Regards,
Branden

[1] Still, if someone wants to start, I'd start here.

    https://minnie.tuhs.org/cgi-bin/utree.pl?file=V10/vol2/ms/tmac.s

[2] One person, ever, has requested it, 20 years ago.  And I have no
    specimens of input or corresponding model output rendered by an
    "authentic" BSD troff [formatter executable PLUS support files]
    against which to develop a reconstruction.  (On the bright side, the
    Berkeley modifications to the once-encumbered AT&T "tmac.s" are, of
    themselves, presumably BSD-licensed.)

    https://savannah.gnu.org/bugs/?64455
