Re: Modernising UNIX manpages.

2021-04-22 Thread Marc Chantreux
hello,

> I would like to investigate the possibility of using Markdown as an alternate 
> format for UNIX man-pages.
> (Cf. https://github.com/marcastel/marcastel/discussions/7)

I used POD (perldoc) for decades and have been a big fan of pandoc for many
years now (i use it for many things, from bills to papers), so i was
in the same mood years ago and started to use sdoc. Someone told
me about the poor quality of its output, so i learned the basics of roff
to fully understand.

Then i realized that once you discover mandoc, you really have no need
of another tool. I mean: instructions like

 .Sh NAME
 .Nm progname

are:

* easy to learn and write
* semantic (they push meaning into the document)
* grepable (easy to index/search)
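
To illustrate the "grepable" point, here is a minimal sketch in Python. The sample page text and the helper name `macro_args` are made up for illustration; the only thing assumed about mdoc is that a macro call starts a line with a dot followed by the macro name.

```python
import re

# A tiny, made-up mdoc fragment used only for this illustration.
MDOC_SOURCE = """\
.Sh NAME
.Nm progname
.Nd do something useful
.Sh SYNOPSIS
.Nm
.Op Fl v
"""


def macro_args(source: str, macro: str) -> list[str]:
    """Return the argument text of every line that calls the given macro."""
    pattern = re.compile(
        rf"^\.{re.escape(macro)}(?:[ \t]+(.*))?$", re.MULTILINE
    )
    # A macro called without arguments yields an empty string.
    return [m.group(1) or "" for m in pattern.finditer(source)]


if __name__ == "__main__":
    print(macro_args(MDOC_SOURCE, "Sh"))  # section titles
    print(macro_args(MDOC_SOURCE, "Nm"))  # program name occurrences
```

Because each macro sits on its own line, a one-line regular expression is enough to build an index of section titles or program names.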

There is no way you'll manage macros like .In, .Fn, .Vt, .Ft, .Op
with commonmark. I would bet on pandoc (the only markdown dialect
acceptable to me) but you will need a custom filter there. There is
also an ms writer that could be a starting point for writing a mandoc one.

*But* if there were a tool that just reads code written in (name your preferred
language) and writes the mandoc output (possibly using comments for the
other parts), i really would give this tool a try.

> I would like to devote time to this in the second semester of 2021 and
> would appreciate sharing this.

thanks for that! if you do something around pandoc, i will definitely have a
look at it.

> I believe the first step is to provide a proof of concept that
> demonstrates the expected outcome and the desired command line
> interface.

if i were to start a project like yours nowadays, i would definitely
start with code2mdoc because it's the only part i would use
(as i said: i don't know why i would use an external tool for the rest,
because there are not many characters to save).

good luck for this project
marc



Re: Modernising UNIX manpages.

2021-04-22 Thread Marc Chantreux
On Thu, Apr 22, 2021 at 02:09:18PM +1000, John Gardner wrote:
> Markdown has one feature: readability. That's literally it.

IMHO, the mdoc dialect is easier to read when it comes to describing
functions, types and so on ... markdown on the other hand is quicker
to write.

regards
marc



Re: Modernising UNIX manpages.

2021-04-22 Thread Marc Chantreux
> Mhmm, what `pandoc` provides is quite nice, and I have successfully
> used it to produce technical manuals in various formats.

pandoc is vastly superior to all its competitors (except Rmarkdown,
which seems interesting but which i haven't tested myself) but the
problem is that many people use other implementations that implement
incompatible subsets of markdown itself.

commonmark is a specification for markdown but doesn't fit the needs
of this project (does it fit any need but readme.md?).

regards,
marc



Re: Modernising UNIX manpages.

2021-04-22 Thread JM Marcastel



> On 21 Apr 2021, at 18:56, Eric S. Raymond  wrote:
> 
> 1. Sorry, Markdown is a *terrible* choice.  Which dialect? It's simply not 
> standardized enough.
> It's also semantically rather weak, especially near tables.

I was expecting that one :-)

Let me put it differently. Markdown is the style. YAML is the structure. Combine 
them and you have a human-friendly and modern (structured) markup language 
(sorry Goldfarb).

What I call Markdown is Pandoc Markdown, and CommonMark, since John MacFarlane 
is behind both and the standard is maturing fast. (BTW Pandoc does a good job 
with tables.)

Rather than putting YAML into Markdown, as is commonly the case (typically for 
static site generators), put Markdown in YAML. That gives you structure (and 
since JSON is a subset of YAML, you already have an interchange format that can 
be easily manipulated in any program). All that remains to be done is to build 
a *schema* for manpages using that construct. This approach is coming of age in 
the API world (RESTful APIs in particular). Setting up a formal structure in 
YAML is easy. Beyond being manpage-compatible, I can easily imagine translators 
that generate the input for GNU’s `gengetopt`, or for KornShell’s built-in (and 
excellent, though a little obscure) `getopts`, which already auto-generates man 
pages in a variety of formats (try the options --nroff, --html, --api on any 
KornShell script using getopts, or on the ksh binary itself).

As mentioned in my original discussion, I have a clear idea of 
how to deliver a proof of concept (with Pandoc (Haskell) and Glow (Go)), though 
in my mind the target utility would be written in C (or Go). Pandoc exposes an 
abstract syntax tree of both YAML and Markdown. Glow saves us from having to 
write an ncurses interface by already providing on-terminal display of essential 
Markdown markup (including tables). What I am missing is exactly what triggered 
reactions against Markdown: the structure, the constraints… where we draw the 
line between free-form layout in Markdown (e.g. cross-referencing manpages in 
text) and structure imposed through our YAML skeleton.

A side-by-side comparison of the current manpage markup and the to-be markup. 
How to maintain backward compatibility while providing new features (tables, 
a modernised display with clickable hyperlinks, images, and other goodies on 
VTE-capable terminals). Etc.

IMHO Markdown and YAML are the way to go, but should another format be better 
suited I have no objection. Despite my humble participation in such initiatives 
as ISO 8613 (in the late 80s already!), I am convinced we should steer away 
from DocBook and the likes (even ASCIIdoc is too…. technical). Yesterday’s 
manpages were managed centrally by highly knowledgeable persons. Tomorrow's 
manpages should solve the MANPATH hell, one should be able to easily correct or 
amend them, and they should always be up to date, both on the terminal and 
online. People of all skills and origins should be able to contribute. For this 
we need the Tim Berners Lee HTML-attitude… simple (but not simplistic) markup.



Re: Release Candidate 1.23.0.rc1

2021-04-22 Thread Dave Kemper
On 4/10/21, G. Branden Robinson  wrote:
> I share Bjarni's inclinations on this point, but there's a nearly
> 50-year tradition of composing roff documents without sweating the small
> stuff.  According to our troff(1) page and as the saying goes, "most of
> it" (warnings) "is small stuff".

Agreed on both points.  It is similar with Perl (though its tradition
is a mere 30 years old): any serious Perl developer will tell you to
always turn on warnings.  But they remain off by default so as not to
clutter the stderr of users throwing together simple, disposable
scripts.

> True, although this is an in-tree-only tool; I don't view such a
> requirement as being a significant barrier in a developer-facing
> program.

True, but it does require:

1. ...the user to be aware that this is even a step that needs to be
done.  For instance, a clueless user named "Dave" recently wrote to
this list complaining about unexpected stderr output, unaware that
test-groff was setting these flags.  On the other hand, if test-groff
worked the same way as regular groff, a hypothetical clueless
test-groff user who WANTS this extra stderr output need only read the
existing groff documentation to learn how to get it.

2. ...users who periodically build and test new groffs to edit this
script every time a groff is built.  While not an insurmountable
burden, it's certainly more of one than that faced by the
alternate-universe user of a test-groff that does not set these
options but who wants these flags set all the time: this user need
only create an alias including them.  This latter action is also a
standard Unix way of quasi-permanently setting certain flags.  The
edit-the-script-every-time system is less standard in addition to
being more work.

So it's not so much that it's a significant barrier as it is that it's
a higher barrier than the alternate, which follows Unix tradition more
closely.

> If -w, -W, or -b affect the way the standard output stream is produced
> in any way[1], that's a fire tornado of a bug irrespective of
> test-groff's existence.

Yes -- and at present that can't even be tested for via test-groff,
since it makes -b always on and there's no command-line override.



Re: Modernising UNIX manpages.

2021-04-22 Thread JM Marcastel



> On 22 Apr 2021, at 09:35, Marc Chantreux  wrote:
> 
> There is no way you'll manage requests like .In, .Fn .Vt, .Ft, .Fn, .Op
> with commonmark.


Food for thought… though in my opinion the usage strings and the synopsis are 
like a table of contents, automatically generated.

How about:

#usage filter [-flag] <infile> <outfile>

In lieu of:

.Nm filter
.Op Fl flag
.Ao Ar infile Ac Ao Ar outfile Ac


Or:

#usage make [-eiknqrstv] [-D variable] [-d flags] [-f makefile] [-I directory] 
[-j max_jobs] [variable=value] [target ...]

In lieu of:

.Nm make
.Op Fl eiknqrstv
.Op Fl D Ar variable
.Op Fl d Ar flags
.Op Fl f Ar makefile
.Op Fl I Ar directory
.Op Fl j Ar max_jobs
.Op Ar variable Ns = Ns Ar value
.Bk
.Op Ar target ...
.Ek
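
As a rough illustration of how such a `#usage` line could be expanded mechanically, here is a minimal Python sketch. The function name `usage_to_mdoc` and the parsing rules are assumptions made for this example only: square brackets mean optional, a leading dash means a flag, angle brackets mean a bracketed argument, everything else is a plain argument. It does not try to reproduce finer points such as `.Bk`/`.Ek` or `Ns` spacing control.

```python
import re


def usage_to_mdoc(usage: str) -> list[str]:
    """Translate a one-line usage string into mdoc SYNOPSIS macro lines."""
    # Split into the command name, [optional groups], <bracketed args>, bare words.
    tokens = re.findall(r"\[[^\]]*\]|<[^>]*>|\S+", usage)
    if not tokens:
        raise ValueError("empty usage string")
    lines = [f".Nm {tokens[0]}"]
    for tok in tokens[1:]:
        optional = tok.startswith("[")
        parts: list[str] = []
        for word in tok.strip("[]").split():
            if word.startswith("-") and len(word) > 1:
                parts.append("Fl " + word[1:])            # flag
            elif word.startswith("<") and word.endswith(">"):
                parts.append("Ao Ar " + word[1:-1] + " Ac")  # <argument>
            elif parts and parts[-1].startswith("Ar "):
                parts[-1] += " " + word                   # e.g. "target ..."
            else:
                parts.append("Ar " + word)                # plain argument
        body = " ".join(parts)
        lines.append(f".Op {body}" if optional else f".{body}")
    return lines


if __name__ == "__main__":
    print("\n".join(usage_to_mdoc("make [-eiknqrstv] [-D variable] [target ...]")))
```

The sketch emits one macro line per usage token, so adjacent bracketed arguments land on separate lines rather than being chained on one line as in the hand-written version.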


And in pure Markdown with no customisation (and to be a little cheeky), how 
about

``` .c
int res_mkquery(int op, char *dname, int class, int type, char *data, int 
datalen, struct rrec *newrr, char *buf, int buflen)
```

In lieu of:

.Ft int
.Fo res_mkquery
.Fa "int op"
.Fa "char *dname"
.Fa "int class"
.Fa "int type"
.Fa "char *data"
.Fa "int datalen"
.Fa "struct rrec *newrr"
.Fa "char *buf"
.Fa "int buflen"
.Fc
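
The mapping between the two forms is regular enough to automate, which is roughly the code2mdoc idea raised earlier in the thread. Below is a minimal Python sketch; the function name `prototype_to_mdoc` is hypothetical, and the parser is deliberately naive (it splits parameters on commas, so function-pointer parameters with their own argument lists are not handled).

```python
import re


def prototype_to_mdoc(proto: str) -> list[str]:
    """Convert a simple C function prototype into .Ft/.Fo/.Fa/.Fc lines."""
    # Undo line wrapping and drop a trailing semicolon if present.
    proto = " ".join(proto.split()).rstrip(";").rstrip()
    match = re.fullmatch(r"(.+?)\s*\b(\w+)\s*\((.*)\)", proto)
    if match is None:
        raise ValueError(f"not a recognisable prototype: {proto!r}")
    return_type, name, params = match.groups()
    lines = [f".Ft {return_type}", f".Fo {name}"]
    for param in params.split(","):
        param = param.strip()
        if param:
            lines.append(f'.Fa "{param}"')
    lines.append(".Fc")
    return lines


if __name__ == "__main__":
    wrapped = (
        "int res_mkquery(int op, char *dname, int class, int type, char *data, int\n"
        "datalen, struct rrec *newrr, char *buf, int buflen)"
    )
    print("\n".join(prototype_to_mdoc(wrapped)))
```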


Re: Modernising UNIX manpages.

2021-04-22 Thread JM Marcastel



> On 22 Apr 2021, at 18:37, JM Marcastel  wrote:
> 
>> On 22 Apr 2021, at 09:35, Marc Chantreux wrote:
>> 
>> There is no way you'll manage requests like .In, .Fn .Vt, .Ft, .Fn, .Op
>> with commonmark.
> 
> 
> Food for thought… though in my opinion the usage strings and the synopsis are 
> like a table of contents, automatically generated.

FWIW playing around with the bare bones template in `groff_mdoc(7)` here are a 
few files to illustrate my thoughts.

skeleton.yaml: a first attempt at what the format could look like
skeleton.{hcl,json,kaml,toml,ast}: outputs as generated by a tool of mine

Here is a glimpse at the first file:

--- # The following structure is required for all man pages
revision  : 2021-04-22 (Thu) 19:22:58
name      : 
brief     : 
title     : 
volume    : 
system    : 
release   : 
status    : draft
sections  :
  - name  : NAME                    # This section is automatically generated
                                    # if you do not fill it in

  - name  : LIBRARY                 # For MAN sections 2 and 3 only

  - name  : SYNOPSIS                # This section is automatically generated
                                    # for command line interfaces

  - name  : DESCRIPTION             #
    body  : |

      A **man page** (short for **manual page**) is a form of [software
      documentation] usually found on a [UNIX] or [UNIX-like] [operating
      system][os].

      ![A diagram showing the key Unix and Unix-like operating systems][UNIX-chart]

      and a (simple) table for Eric :-)

      | Option | Description |
      | ------ | ----------- |
      | data   | path to data files to supply the data that will be passed into templates. |
      | engine | engine to be used for processing templates. Handlebars is the default. |
      | ext    | extension to be used for dest files. |

      [software documentation]: https://en.wikipedia.org/wiki/Software_documentation
      [UNIX]: https://en.wikipedia.org/wiki/Unix
      [UNIX-like]: https://en.wikipedia.org/wiki/Unix-like
      [os]: https://en.wikipedia.org/wiki/Operating_system
      [UNIX-chart]: https://en.wikipedia.org/wiki/Unix#/media/File:Unix_history-simple.svg

  - name  : DESCRIPTION             # Use free form (Pandoc) Markdown
  - name  : IMPLEMENTATION NOTES    # Where necessary
  - name  : RETURN VALUES           # For sections 2, 3 and 9 functions
  - name  : ENVIRONMENT             # For sections 1, 6, 7 and 8 only.
  - name  : FILES
  - name  : EXAMPLES
  - name  : DIAGNOSTICS             # For sections 1, 6, 7, 8 and 9 only
  - name  : COMPATIBILITY
  - name  : ERRORS                  # For sections 2, 3 and 9
  - name  : SEE ALSO
  - name  : STANDARDS
  - name  : HISTORY
  - name  : AUTHORS
  - name  : BUGS

---
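
To make the idea of a structured skeleton more concrete, here is a minimal Python sketch that walks such a structure and emits an mdoc prologue and section headers. Several things are assumptions for illustration only: the page is given in JSON (relying on JSON being usable wherever YAML is), the field values are invented, the mapping of `revision`, `title` and `volume` onto `.Dd` and `.Dt` is a guess at the intent, and the function name `skeleton_to_mdoc` is hypothetical. Markdown bodies are not converted; the sketch only marks where that step would go.

```python
import json

# Invented sample data following the shape of the skeleton above.
SKELETON_JSON = """
{
  "revision": "April 22, 2021",
  "name": "filter",
  "brief": "copy infile to outfile",
  "title": "FILTER",
  "volume": "1",
  "sections": [
    {"name": "NAME"},
    {"name": "DESCRIPTION", "body": "A **man page** is a form of documentation."},
    {"name": "SEE ALSO"}
  ]
}
"""


def skeleton_to_mdoc(page: dict) -> str:
    """Emit the mdoc prologue plus one .Sh per non-empty section."""
    out = [
        f".Dd {page['revision']}",
        f".Dt {page['title']} {page['volume']}",
        ".Os",
    ]
    for section in page["sections"]:
        if section["name"] == "NAME" and "body" not in section:
            # NAME is generated from the top-level fields when left empty.
            out += [".Sh NAME", f".Nm {page['name']}", f".Nd {page['brief']}"]
        elif "body" in section:
            out.append(f".Sh {section['name']}")
            # Placeholder: the Markdown body would be converted separately.
            out.append('.\\" Markdown body omitted; convert it with a separate tool')
        # Sections without a body are skipped in this sketch.
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(skeleton_to_mdoc(json.loads(SKELETON_JSON)), end="")
```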


The `skeleton.ast` is an abstract syntax tree modelled after the one Pandoc 
provides (and could be aligned to the C implementation of CommonMark).

Producing the AST with pandoc is simply a matter of : `pandoc -i skeleton.yaml 
-o skeleton.ast -f markdown -t json`

My utility is a very simple wrapper around `libyaml` which enables direct 
manipulation of YAML/JSON/TOML in KornShell scripts by converting them to KAML.

Generating, from a KornShell script, outputs in nroff, HTML, PDF (or any of 
Pandoc’s 59 supported output formats; see `pandoc --list-output-formats`) is 
simply a matter of:

print -v body | pandoc --template manpage. --to 




y2k -- convert a YAML stream or file into a KAML equivalent.
Usage: y2k [-a] [-c] [-d] [-f] [-k] [-m] [-p ] [-u] [-v] 

  -a output mappings as associative arrays
  -c output in the canonical YAML format
  -k output KornShell rather than KAML
  -d enable the debug mode
  -f only parse frontmatter YAML in Markdown documents
  -m minify output (disables verbosity) [experimental]
  -p   assign output to variable , appropriately prefixed
 with a `typeset` (implies -k)
  -u output unescaped non-ASCII characters
  -v increase the verbosity level (max: 3)

For illustration, I attach a couple of screenshots of terminal-rendered 
Markdown using glow (the file is Pandoc’s manpage, which is written in 
Markdown and automatically converted to nroff).

Note: Glow’s markdown syntax is currently too limited for our purposes. The 
output is currently crippled in places. But the utility’s design (Charm 
library) allows easy expansion and could be plugged directly to the AST.


Problem with mom's .PDF_WWW_LINK macro

2021-04-22 Thread T. Kurt Bond
Here's a mom source file I cut down from contrib/mom/examples/mom-pdf.mom
in the groff distribution, from a very recent git pull.

.PAPER Letter
.PRINTSTYLE TYPESET
.TITLE "Testing mom's PDF_WWW_LINK macro"
.char \[pdfmom]   \*[BD]pdfmom\*[PREV]
.char \[-P-p]  \*[BD]\-P\-p\*[PREV]
.char \[-Tpdf]\*[BD]\-Tpdf\*[PREV]
.START
.PP
One reason to prefer using the native PDF driver (via \[pdfmom] or
\[-Tpdf]) is that papersizes set within mom source files (see
.PDF_WWW_LINK
http://www.schaffter.ca/mom/momdoc/typesetting.html#page-setup-intro SUFFIX
) \
  "paper and page setup macros"
do not require a corresponding \[-P-p] flag on the
command line.

In the PDF file I generated from this (attached to this message) the URL in
the link is:

http://www.schaffter.ca/mom/momdoc/typesetting.html%23page-setup-intro

Notice the "%23" instead of the "#"?  Following this link results in a 404
error.
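
For readers wondering why "%23" leads to a 404, here is a small Python sketch using only the standard library's `urllib.parse`. It shows how the two forms of the URL are parsed; it says nothing about which piece of software did the escaping.

```python
from urllib.parse import quote, urlsplit

INTENDED = "http://www.schaffter.ca/mom/momdoc/typesetting.html#page-setup-intro"
# The form seen in the PDF link: the "#" delimiter has been percent-encoded.
ESCAPED = INTENDED.replace("#", quote("#", safe=""))

intended = urlsplit(INTENDED)
escaped = urlsplit(ESCAPED)

if __name__ == "__main__":
    # With a literal "#", the anchor is a fragment and is never sent to the server.
    print(intended.path, "| fragment:", repr(intended.fragment))
    # With "%23", the anchor becomes part of the path the server is asked for.
    print(escaped.path, "| fragment:", repr(escaped.fragment))
```

In the escaped form the server is asked for a file whose name literally ends in "#page-setup-intro", which does not exist.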

Why did this happen?

groff --version reports "GNU groff version 1.23.0.rc1.340-0dab6".
-- 
T. Kurt Bond, tkurtb...@gmail.com, https://tkurtbond.github.io


mom-link.pdf
Description: Adobe PDF document


Re: Modernising UNIX manpages.

2021-04-22 Thread JM Marcastel



> On 22 Apr 2021, at 09:35, Marc Chantreux  wrote:
> 
> *But* if there were a tool that just reads code written in (name your preferred
> language) and writes the mandoc output (possibly using comments for the
> other parts), i really would give this tool a try.


Perhaps Markdown is not the appropriate markup. But neither is mdoc (at least 
in my current understanding of its scope and capabilities).

Choosing an alternate syntax is not simply about enhancing the user’s on-screen 
experience with images, tables, and other goodies now possible on terminals.
Though the snapshot below would obviously bring new documentary capabilities to 
manpages.

If all that can be done in mdoc, then fine, I am willing to give it a try… even if 
a little frustrated by an antiquated markup syntax.

A great achievement, in my view, would be to bring into UNIX manpages some of 
the achievements of Texinfo. Unsorted:
- Content structure (not simply outlining)
- Efficient cross-referencing
- Lists and tables
- Special displays (floats, images, footnotes)

And of course, a fantastic info tool that knows how to take advantage at build 
time of all the contextual information we have about the document to help you 
navigate.
We are not simply piping a man page through a pager, we are navigating the 
documents.

Once we have that extra data available somewhere (mandb?), we can further 
enhance the relevance of our manpages.
Extra utilities could take further advantage of this (consolidation, aggregation, 
dependency graphs,...).

The start of all this, is the markup.

Don’t you think it is about time to do, if not a quantum leap, at least some 
modernisation ?
(Re)make UNIX manpages the “indispensable masterpiece for system users and 
developers” (to paraphrase Dash).

P.S. The snapshot below cost me approximately 10 minutes (Markdown abstract tree + 
script to base64-encode the image and wrap it in an ANSI escape code for the 
terminal).
And this is by no means a dedicated solution, since the same Markdown can be used to 
generate HTML, PDF, and more.
How long would it take to incorporate such functionality with mdoc?




Re: Problem with mom's .PDF_WWW_LINK macro

2021-04-22 Thread T. Kurt Bond
(Peter, I CCed this to the groff list since my original message went there,
and this way the resolution is in the mailing list archives.)

I tried your wonky-char.mom version with the groff that I built (GNU groff
version 1.23.0.rc1.340-0dab6, from a recent git pull) and got exactly the
same problem: when I hovered over the link, I saw the "%23" instead of the
"#", and when I copied the link from the PDF (using right-click > Copy
Link) I got the URL with the "%23", and when I clicked on the link I got a
404 error in my browser.

This was in Preview on macOS.  I tried it with Acrobat Reader DC on macOS,
and it worked fine.  I tried it in PDF Expert, and it worked fine.  I tried
it in mupdf on OpenBSD and it worked fine.

I'm convinced this is a bug in Apple's macOS Preview application.

Sorry for adding to the noise.
-- 
T. Kurt Bond, tkurtb...@gmail.com, https://tkurtbond.github.io