[adding groff@gnu to CC list]

At 2025-03-29T15:57:16+0000, Bjarni Ingi Gislason wrote:
> Package: twm
> Version: 1:1.0.10-1+b1
> Severity: minor
> Tags: patch

This report would be more appropriately sent upstream, to
freedesktop.org.

>    * What led up to the situation?
> 
>      Checking for defects with a new version
> 
> test-[g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z < "man page"
> 
>   [Use "grep -e ' $' -e '\\~$' <file>" to find obvious trailing spaces.]
> 
>   ["test-groff" is a script in the repository for "groff"; is not shipped]
> (local copy and "troff" slightly changed by me).

In the many bug reports like this one that you file, you should frankly
disclose that you are not using GNU groff but a private fork of it.
That fork carries many patches, only some of which are known to the
public, because you don't publish your source code.  (If you don't
distribute binaries, the GNU GPL doesn't require you to publish source.)

In the Savannah bug tracker for GNU groff, we term your fork
"bjarnigroff".

Overall, your suggestions should distinguish syntactical or validity
problems from style recommendations.  In this report, it looks like
_only_ stylistic recommendations are being made, a point I think you
should make more emphatically.

An exception exists where your recommendation will _produce_ invalid
input (or unwanted output) with some formatters.  See below.

>   [The fate of "test-nroff" was decided in groff bug #55941.]
> 
>    * What was the outcome of this action?
> 
> troff:<stdin>:377: warning: trailing space in the line
> troff:<stdin>:646: warning: trailing space in the line
> troff:<stdin>:731: warning: trailing space in the line
> troff:<stdin>:761: warning: trailing space in the line

This output arises from one of your patches.

A "style" warning category is contemplated for a future version of (GNU)
groff.

https://savannah.gnu.org/bugs/?62776

>   The amount of space between sentences in the output can then be
> controlled with the ".ss" request.

This advice applies only to groff's startup files, _not_ to man page
content.  Please update your boilerplate text to include that caveat.

groff_man_style(7)[0]:

Files
[...]
     /etc/groff/man.local
            Put site‐local changes and customizations into this file.

                   .\" Put only one space after the end of a sentence.
                   .ss 12 0 \" See groff(7).
                   .\" Keep pages narrow even on wide terminals.
                   .if n .if \n[LL]>80n .nr LL 80n

            On multi‐user systems, it is more considerate to users whose
            preferences may differ from the administrator’s to be less
            aggressive with such settings, or to permit their override
            with a user‐specific man.local file.  Place the requests
            below at the end of the site‐local file to manifest
            courtesy.
                   .soquiet \V[XDG_CONFIG_HOME]/man.local
                   .soquiet \V[HOME]/.man.local
            However, a security‐sandboxed man(1) program may lack
            permission to open such files.

> Split lines longer than 80 characters into two or more lines.
> Appropriate break points are the end of a sentence and a subordinate
> clause; after punctuation marks.
> Add "\:" to split the string for the output, "\<newline>" in the
> source.  

(Bjarni's bug report: warning: trailing space in the line)

This advice is not applicable to AT&T troff, nor to some of its
descendants, an important fact for projects that support deployment to
surviving System V Unix-based systems like Solaris 10 and (maybe) HP-UX.
On those systems, an unwanted literal ":" will appear in the output.

twm, to cite the subject of _this_ bug report, likely _does_ wish to
retain such portability, given its long history as part of X11.

groff_man_style(7):

   Portability
[...]
     \:        Insert a non‐printing break point.  A word can break at
               such a point, but a hyphen glyph is not written to the
               output if it does.  The remainder of the word is subject
               to hyphenation as normal.  You can use \: and \% in
               combination to control breaking of a file name or URI or
               to permit hyphenation only after certain explicit hyphens
               within a word.  See subsection “Hyperlink macros” above
               for an example.

               \: is a GNU extension also supported by Heirloom Doctools
               troff 050915 (September 2005), mandoc 1.13.1
               (2014‐08‐10), and neatroff (commit 399a4936, 2014‐02‐17),
               but not by Plan 9, Solaris, or Documenter’s Workbench
               troffs.
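
A maintainer who wants to know whether a page already relies on this
escape sequence can search its source for it.  The following is only a
sketch, assuming a POSIX shell and grep; the function name
`find_break_escapes` is made up for illustration and is not part of
groff.

```shell
# Hypothetical helper: list lines of a man page source that contain the
# GNU \: escape sequence, which AT&T-descended troffs do not recognize.
find_break_escapes() {
    # -n prefixes each match with its line number; -F matches the
    # two-character string \: literally instead of as a regex.
    grep -nF '\:' "$1"
}
```

For example, `find_break_escapes twm.1` prints each matching line
prefixed by its line number.  The search is purely textual, so it also
reports a colon that follows an escaped backslash (`\\:`).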

> Use \(en (en-dash) for a dash at the beginning (end) of a line,
> or between space characters,

`\(en` is not portable to AT&T troff.  We don't mention it in
groff_man_style(7).  See groff_char(7) for a comprehensive reference
annotating portable special character names.  They're marked with `+` in
the "Notes" field of the glyph tables.
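
To check a page against those tables, it can help to list the special
character escape sequences it uses.  A rough sketch, assuming a grep
with an `-o` option (GNU, BSD, and BusyBox greps have one);
`list_special_chars` is a hypothetical name, and only the two-character
`\(xx` form is handled, not groff's `\[name]` form.

```shell
# Hypothetical helper: print each distinct \(xx special character
# escape sequence found in a man page source, one per line.
list_special_chars() {
    # In a basic regular expression, \\ matches one backslash and an
    # unescaped ( is literal, so this extracts each \( plus the two
    # characters that follow it.
    grep -o '\\(..' "$1" | LC_ALL=C sort -u
}
```

Each name in the output of `list_special_chars twm.1` can then be looked
up in groff_char(7).  Like any textual scan, this can report false
positives, for instance when the backslash is itself escaped.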

> The name of a man page is typeset in bold

No, this is not a universal convention and it is historically
inaccurate.

https://lists.gnu.org/archive/html/groff/2023-08/msg00005.html

Projects get to make their own decisions in this area.  groff itself
uses italics, the choice most consistent with man page history.

> Add a zero (0) in front of a decimal fraction that begins with a period
> (.)
> 
> 30:.if t .sp .5
> 38:.if t .sp .5

This advice is spurious.  Every man page formatter used widely enough to
sustain public discussion correctly parses decimal numbers that start
with the decimal point.

Perhaps you offer this advice to work around limitations of the regexes
and/or sed commands you prescribe.  If that is so, the inadequacies of
your tools and advice are not the problems of man page maintainers.

> Remove quotes when there is a printable
> but no space character between them
> and the quotes are not for emphasis (markup),

Emphasis is not the same thing as markup.  Markup is a means of
instructing the formatter.  Emphasis is, in this context, a visible
alteration to the typeface that distinguishes the emphasized text from
adjacent words.

> for example as an argument to a macro.
> 
> twm.1:125:.B "$HOME/.twmrc.\fIscreennumber\fP"
> twm.1:132:.B "$HOME/.twmrc"
> twm.1:216:.IP "\fBAutoRelativeResize\fP" 8
[...]

This isn't sound advice anyway.  Using quotation marks consistently with
text arguments might be easier for a document maintainer, since if they
change (or copy and paste) a macro call, they don't need to remember to
add or delete bracketing quotation marks depending on the word count of
the argument.

Quoting multi-word macro arguments leads to slightly more efficient
processing by the formatter, but as far as I know no one has ever
quantitatively measured the difference in the modern era (say, since the
year 2000), when sub-second or even sub-tenth-second man page rendering
times are typical.

> Section headings (.SH and .SS) do not need quoting their arguments.
> 
> 1268:.SH "ENVIRONMENT VARIABLES"
> 1275:.SH "SEE ALSO"

See the previous item.

> Output from "test-groff  -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z ":
> 
> troff:<stdin>:377: warning: trailing space in the line
> troff:<stdin>:646: warning: trailing space in the line
> troff:<stdin>:731: warning: trailing space in the line
> troff:<stdin>:761: warning: trailing space in the line

Didn't you already report this in the same message?

> --- twm.1     2025-03-29 14:18:05.443744572 +0000
> +++ twm.1.new 2025-03-29 15:44:42.231882480 +0000
> @@ -1,3 +1,4 @@
> +'\" t

You didn't motivate this change; you should have.

>   Any program (person), that produces man pages, should check the output
> for defects by using (both groff and nroff)
> 
> [gn]roff -mandoc -t -ww -b -z -K utf8 <man page>

Advising the "-K utf8" option is inappropriate for pages encoded using
ISO Latin-1, which all groff releases support.  You could be helpful in
advising people to transition away from Latin-1 to US-ASCII, which would
in turn make a future (GNU) groff release that supports UTF-8 input by
default a less disruptive transition.[1]  But if you offer such advice,
you should tell people clearly _why_ you are doing so.

Use of non-US-ASCII code points inherently rules out a man page's
portability to the legacy AT&T troff systems described above.[2]  Any
printable non-US-ASCII code points are better represented as special
character escape sequences.  groff_char(7) offers a convenient table for
Latin-1 users, but all Latin scripts in common use are readily
supported.  See subsections "Accents" and "Accented characters" of
groff_char(7).[3]
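
One way to find out whether a page is affected is to look for bytes
outside the US-ASCII range.  A rough sketch, assuming a POSIX shell and
a grep that treats every byte as a character in the C locale (current
GNU grep does); `find_non_ascii` is a name invented for illustration.

```shell
# Hypothetical helper: report lines containing bytes that are neither
# printable US-ASCII characters nor ASCII white space.
find_non_ascii() {
    # In the C locale, [:print:] and [:space:] cover only ASCII
    # characters, so Latin-1 or UTF-8 encoded text (and stray control
    # characters) match the negated bracket expression.
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}
```

If `find_non_ascii twm.1` prints nothing, the page is already pure
US-ASCII apart from white space; otherwise each reported line is a
candidate for a special character escape sequence.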

Regards,
Branden

[0] https://man7.org/linux/man-pages/man7/groff_man_style.7.html

[1] It seems likely right now that a transition period involving at
    least one groff release would occur where UTF-8 input is supported,
    but _not_ as the default.  Implementing such support demands a major
    refactor of GNU troff internals, which presume an 8-bit integral
    representation of character codes (and hyphenation codes, internal
    elements of groff's bespoke `string` class, and so forth).

[2] See the last few paragraphs of
    <https://lists.gnu.org/archive/html/groff/2025-03/msg00073.html>.

[3] https://man7.org/linux/man-pages/man7/groff_char.7.html
