Package: icu-devtools
Version: 72.1-6
Severity: minor
Tags: patch

   * What led up to the situation?

     Checking for defects with a new version

test-[g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z < "man page"

  [Use "grep -e ' $' -e '\\~$' <file>" to find obvious trailing spaces.]

  ["test-groff" is a script in the repository for "groff"; it is not shipped]
(local copy and "troff" slightly changed by me).

  [The fate of "test-nroff" was decided in groff bug #55941.]

   * What was the outcome of this action?

an.tmac:<stdin>:16: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:18: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:22: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:25: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:28: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:31: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:36: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
an.tmac:<stdin>:37: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
troff:<stdin>:42: warning: trailing space in the line
an.tmac:<stdin>:50: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:53: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:58: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:62: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:75: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:77: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:79: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
troff:<stdin>:80: warning: trailing space in the line
an.tmac:<stdin>:81: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:83: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:85: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:87: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
troff:<stdin>:89: warning: trailing space in the line
troff:<stdin>:90: warning: trailing space in the line
troff:<stdin>:92: warning: trailing space in the line
troff:<stdin>:93: warning: trailing space in the line
an.tmac:<stdin>:94: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
an.tmac:<stdin>:97: misuse, warning: .BI is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:100: misuse, warning: .BI is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
troff:<stdin>:103: warning: trailing space in the line
an.tmac:<stdin>:104: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
troff:<stdin>:106: warning: trailing space in the line
an.tmac:<stdin>:107: misuse, warning: .IR is for at least 2 arguments, got 1
        Use macro '.I' for one argument or split argument.
troff:<stdin>:108: warning: trailing space in the line
troff:<stdin>:109: warning: trailing space in the line
an.tmac:<stdin>:112: misuse, warning: .BI is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
troff:<stdin>:113: warning: trailing space in the line
an.tmac:<stdin>:114: misuse, warning: .BI is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.
an.tmac:<stdin>:132: misuse, warning: .BR is for at least 2 arguments, got 1
        Use macro '.B' for one argument or split argument.


   * What outcome did you expect instead?

     No output (no warnings).

-.-

  General remarks and further material, if a diff-file exists, are in the
attachments.


-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.12.17-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages icu-devtools depends on:
ii  libc6       2.40-7
ii  libgcc-s1   14.2.0-17
ii  libicu72    72.1-6
ii  libstdc++6  14.2.0-17

icu-devtools recommends no packages.

icu-devtools suggests no packages.

-- no debconf information
Input file is gendict.1

Output from "mandoc -T lint  gendict.1": (shortened list)

      1 input text line longer than 80 bytes: Words begin at the b...
     12 whitespace at end of input line

Remove trailing space with: sed -e 's/  *$//'

-.-.

Output from "test-nroff -mandoc -t -ww -z gendict.1": (shortened list)

     22         Use macro '.B' for one argument or split argument.
      5         Use macro '.I' for one argument or split argument.
      4 .BI is for at least 2 arguments, got 1
     18 .BR is for at least 2 arguments, got 1
      5 .IR is for at least 2 arguments, got 1
     11 trailing space in the line

Remove trailing space with: sed -e 's/  *$//'

-.-.

Remove space characters (whitespace) at the end of lines.
Use "git apply ... --whitespace=fix" to fix extra space issues, or use
global configuration "core.whitespace".

Number of lines affected is

12
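
The count above can be reproduced, and the spaces removed, with a small
sketch like the following. The function names are made up for this
illustration; only standard "grep" and "sed" are assumed.

```shell
# Count the lines of a file that end in a space or a tab.
count_trailing_space() {
    grep -c '[[:space:]]$' "$1"
}

# Print the file with trailing spaces and tabs removed from every line.
strip_trailing_space() {
    sed -e 's/[[:space:]]*$//' "$1"
}
```

Usage: "count_trailing_space gendict.1" prints the number of affected
lines; "strip_trailing_space gendict.1 > gendict.1.new" writes a cleaned
copy.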

-.-.

Change two HYPHEN-MINUSES (code 0x2D) to an em-dash (\(em),
if one is intended.
  " \(em " creates too big a gap in the text (in "troff").

An en-dash is usually surrounded by spaces,
while an em-dash is used without spaces.
"man" (1 byte characters in input) transforms an en-dash (\(en) to one
HYPHEN-MINUS,
and an em-dash to two HYPHEN-MINUSES without considering the space
around it.
If "--" are two single "-"
(beginning of an option or end of options)
then use "\-\-".

gendict.1:77:.BR --bytes.
gendict.1:81:.BR --uchars.
gendict.1:85:.BR --bytes.
gendict.1:112:.BI --bytes
gendict.1:114:.BI --uchars
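
Lines like those listed above can be located with a pattern search.
This is only a sketch: the function name is hypothetical, and the
pattern assumes that an unescaped option starts with "--" followed by a
letter or digit.

```shell
# List lines that contain two unescaped HYPHEN-MINUS characters
# directly followed by a letter or digit (a probable option name).
find_raw_double_hyphen() {
    grep -n -E -e '(^|[^\\-])--[[:alnum:]]' "$1"
}
```

Each reported line is a candidate for "\-\-"; whether a dash was
intended instead still has to be decided by reading the text.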

-.-.

Use the correct macro for the font change of a single argument or
split the argument into two.

16:.BR "\fB\-\-uchars"
18:.BR "\fB\-\-bytes"
22:.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
25:.BR "\-V\fP, \fB\-\-version"
28:.BR "\-c\fP, \fB\-\-copyright"
31:.BR "\-v\fP, \fB\-\-verbose"
36:.IR " input-file"
37:.IR " output\-file"
50:.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
53:.BR "\-V\fP, \fB\-\-version"
58:.BR "\-c\fP, \fB\-\-copyright"
62:.BR "\-v\fP, \fB\-\-verbose"
75:.BR "\fB\-\-uchars"
79:.BR "\fB\-\-bytes"
83:.BR "\fB\-\-transform"
97:.BI " input\-file"
100:.BI " output\-file"
107:.IR input-file 
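
A list like the one above can be approximated without "groff".
The following is a rough sketch under the assumption that arguments are
separated by spaces and that a quoted string counts as one argument;
the function name is made up for this example.

```shell
# Report calls of two-font macros (.BR, .BI, .IR, ...) that have
# fewer than two arguments.
find_single_arg() {
    awk '
    /^\.(BR|BI|IR|IB|RB|RI)([ \t]|$)/ {
        args = $0
        sub(/^\.[A-Z][A-Z][ \t]*/, "", args)   # drop the macro name
        gsub(/"[^"]*"/, "Q", args)             # a quoted string is one argument
        if (split(args, parts, " ") <= 1)
            print FNR ":" $0
    }' "$1"
}
```

The output has the same "line:text" form as the list above.
It does not replace the check made by "groff" itself.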

-.-.

Wrong distance (not two spaces) between sentences in the input file.

  Separate the sentences and subordinate clauses; each begins on a new
line.  See man-pages(7) ("Conventions for source file layout") and
"info groff" ("Input Conventions").

  The best procedure is to always start a new sentence on a new line,
at least, if you are typing on a computer.

Remember coding: Only one command ("sentence") on each (logical) line.

E-mail: Easier to quote exactly the relevant lines.

Generally: Easier to edit the sentence.

Patches: Less unaffected text.

Searching for two adjacent words is easier when they are on the same line
and in the same phrase.

  The amount of space between sentences in the output can then be
controlled with the ".ss" request.

Mark a final abbreviation point as such by suffixing it with "\&".

Some sentences (etc.) do not begin on a new line.

Split (sometimes) lines after a punctuation mark; before a conjunction.

  Lines with only one (or two) space(s) between sentences could be split,
so that later sentences begin on a new line.

Use

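#!/bin/sh
# Split input lines after a sentence-ending period.
# "\n" in the replacement is a GNU sed extension.

# Lines that begin with a period (macro calls) are left untouched.
sed -e '/^\./!s/\([[:alpha:]]\)\.  */\1.\n/g' "$1"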

to split lines after a sentence period.
Check result with the difference between the formatted outputs.
See also the attachment "general.bugs".

[List of affected lines removed.]

42:and creates a string trie dictionary file. Normally this data file has the 
76:Set the output trie type to UChar. Mutually exclusive with
80:Set the output trie type to Bytes. Mutually exclusive with 
84:Set the transform type. Should only be specified with
108:that are used as values must be made up of ASCII digits. They 
119:Specifies the directory containing ICU data. Defaults to
121:Some tools in ICU depend on the presence of the trailing slash. It is thus

-.-.

Split lines longer than 80 characters into two or more lines.
Appropriate break points are the end of a sentence and a subordinate
clause; after punctuation marks.
Add "\:" to split the string for the output, "\<newline>" in the source.  

Line 46, length 82

Words begin at the beginning of a line and are terminated by the first whitespace.
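
Long lines can be listed in the same form with "awk".
This is a minimal sketch; the function name and the optional second
argument (the limit, 80 by default) are assumptions made for the
example. Depending on the "awk" implementation and locale, the length
is counted in characters or in bytes.

```shell
# Print "Line N, length M" for every line longer than the limit.
long_lines() {
    awk -v max="${2:-80}" \
        'length($0) > max { printf "Line %d, length %d\n", FNR, length($0) }' "$1"
}
```

Usage: "long_lines gendict.1" or, with another limit,
"long_lines gendict.1 72".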

-.-.

Split a punctuation mark from a single argument for a two-font macro

77:.BR --bytes.
81:.BR --uchars.
85:.BR --bytes.
87:.BR offset-<hex-number>,

-.-.

One space only after a possible end of sentence
(after a punctuation, that
can end a sentence).

gendict.1:42:and creates a string trie dictionary file. Normally this data file has the 
gendict.1:76:Set the output trie type to UChar. Mutually exclusive with
gendict.1:80:Set the output trie type to Bytes. Mutually exclusive with 
gendict.1:84:Set the transform type. Should only be specified with
gendict.1:108:that are used as values must be made up of ASCII digits. They 
gendict.1:119:Specifies the directory containing ICU data. Defaults to
gendict.1:121:Some tools in ICU depend on the presence of the trailing slash. It is thus

-.-.

Put a subordinate sentence (after a comma) on a new line.

gendict.1:68:For example, the file
gendict.1:90:to 0xFF and U+200C to 0xFE, in order to offer compatibility to 
gendict.1:92:A transform must be specified for a bytes trie, and when applied 
gendict.1:109:may be specified either in hex, by using a 0x prefix, or in 

-.-.

Remove quotes when there is a printable
but no space character between them
and the quotes are not for emphasis (markup),
for example as an argument to a macro.

gendict.1:16:.BR "\fB\-\-uchars"
gendict.1:18:.BR "\fB\-\-bytes"
gendict.1:19:.BI "\fB\-\-transform" " transform"
gendict.1:75:.BR "\fB\-\-uchars"
gendict.1:79:.BR "\fB\-\-bytes"
gendict.1:83:.BR "\fB\-\-transform"

-.-.

Trailing space in a macro call.
Remove with "sed -i -e 's/  *$//'"

107:.IR input-file 

-.-.

Generally:

Split (sometimes) lines after a punctuation mark; before a conjunction.
--- gendict.1   2025-03-12 17:17:08.479919304 +0000
+++ gendict.1.new       2025-03-12 18:00:04.258704179 +0000
@@ -13,113 +13,125 @@
 .SH SYNOPSIS
 .B gendict
 [
-.BR "\fB\-\-uchars"
+.B \-\-uchars
 |
-.BR "\fB\-\-bytes"
-.BI "\fB\-\-transform" " transform"
+.B \-\-bytes
+.BI \-\-transform " transform"
 ]
 [
-.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
+.BR \-h ", " \-? ", " \-\-help
 ]
 [
-.BR "\-V\fP, \fB\-\-version"
+.BR \-V ", " \-\-version
 ]
 [
-.BR "\-c\fP, \fB\-\-copyright"
+.BR \-c ", " \-\-copyright
 ]
 [
-.BR "\-v\fP, \fB\-\-verbose"
+.BR \-v ", " \-\-verbose
 ]
 [
-.BI "\-i\fP, \fB\-\-icudatadir" " directory"
+.BR \-i ", " \-\-icudatadir " \fIdirectory\fP"
 ]
-.IR " input-file"
-.IR " output\-file"
+.I input-file
+.I output\-file
 .SH DESCRIPTION
 .B gendict
 reads the word list from
 .I dictionary-file
-and creates a string trie dictionary file. Normally this data file has the 
+and creates a string trie dictionary file.
+Normally this data file has the
 .B .dict
 extension.
 .PP
-Words begin at the beginning of a line and are terminated by the first whitespace.
+Words begin at the beginning of a line
+and are terminated by the first whitespace.
 Lines that begin with whitespace are ignored.
 .SH OPTIONS
 .TP
-.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
+.BR \-h ", " \-? ", " \-\-help
 Print help about usage and exit.
 .TP
-.BR "\-V\fP, \fB\-\-version"
+.BR \-V ", " \-\-version
 Print the version of
 .B gendict
 and exit.
 .TP
-.BR "\-c\fP, \fB\-\-copyright"
+.BR \-c ", " \-\-copyright
 Embeds the standard ICU copyright into the
 .IR output-file .
 .TP
-.BR "\-v\fP, \fB\-\-verbose"
+.BR \-v ", " \-\-verbose
 Display extra informative messages during execution.
 .TP
-.BI "\-i\fP, \fB\-\-icudatadir" " directory"
+.BR \-i ", " \-\-icudatadir " \fIdirectory\fP"
 Look for any necessary ICU data files in
 .IR directory .
-For example, the file
+For example,
+the file
 .B pnames.icu
 must be located when ICU's data is not built as a shared library.
 The default ICU data directory is specified by the environment variable
 .BR ICU_DATA .
 Most configurations of ICU do not require this argument.
 .TP
-.BR "\fB\-\-uchars"
-Set the output trie type to UChar. Mutually exclusive with
-.BR --bytes.
-.TP
-.BR "\fB\-\-bytes"
-Set the output trie type to Bytes. Mutually exclusive with 
-.BR --uchars.
-.TP
-.BR "\fB\-\-transform"
-Set the transform type. Should only be specified with
-.BR --bytes.
+.B \-\-uchars
+Set the output trie type to UChar.
+Mutually exclusive with
+.BR \-\-bytes .
+.TP
+.B \-\-bytes
+Set the output trie type to Bytes.
+Mutually exclusive with
+.BR \-\-uchars .
+.TP
+.BI \-\-transform " transform"
+Set the transform type.
+Should only be specified with
+.BR \-\-bytes .
 Currently supported transforms are:
-.BR offset-<hex-number>,
+.BR offset\-<hex-number> ,
 which specifies an offset to subtract from all input characters.
-It should be noted that the offset transform also maps U+200D 
-to 0xFF and U+200C to 0xFE, in order to offer compatibility to 
-languages that require these characters.
-A transform must be specified for a bytes trie, and when applied 
-to the non-value characters in the 
-.IR input-file
+It should be noted
+that the offset transform also maps U+200D to 0xFF
+and U+200C to 0xFE,
+in order to offer compatibility to languages
+that require these characters.
+A transform must be specified for a bytes trie,
+and when applied to the non-value characters in the
+.I input-file
 must produce output between 0x00 and 0xFF.
 .TP
-.BI " input\-file"
+.I input\-file
 The source file to read.
 .TP
-.BI " output\-file"
+.I output\-file
 The file to write the output dictionary to.
 .SH CAVEATS
-The 
-.IR input-file
+The
+.I input-file
 is assumed to be encoded in UTF-8.
-The integers in the 
-.IR input-file 
-that are used as values must be made up of ASCII digits. They 
-may be specified either in hex, by using a 0x prefix, or in 
-decimal.
+The integers in the
+.I input-file
+that are used as values
+must be made up of ASCII digits.
+They may be specified either in hex,
+by using a 0x prefix,
+or in decimal.
 Either
-.BI --bytes
-or 
-.BI --uchars
+.B \-\-bytes
+or
+.B \-\-uchars
 must be specified.
 .SH ENVIRONMENT
 .TP 10
 .B ICU_DATA
-Specifies the directory containing ICU data. Defaults to
+Specifies the directory containing ICU data.
+Defaults to
 .BR ${prefix}/share/icu/72.1/ .
-Some tools in ICU depend on the presence of the trailing slash. It is thus
-important to make sure that it is present if
+Some tools in ICU depend on the presence of the trailing slash.
+It is thus important to make sure
+that it is present if
 .B ICU_DATA
 is set.
 .SH AUTHORS
@@ -129,5 +141,4 @@ Maxime Serrano
 .SH COPYRIGHT
 Copyright (C) 2012 International Business Machines Corporation and others
 .SH SEE ALSO
-.BR http://www.icu-project.org/userguide/boundaryAnalysis.html
-
+.B http://www.icu\-project.org/userguide/boundaryAnalysis.html
  Any program (or person) that produces man pages should check the output
for defects by using (both groff and nroff)

[gn]roff -mandoc -t -ww -b -z -K utf8 <man page>

  The same goes for man pages that are used as an input.

  For a style guide use

  mandoc -T lint

-.-

  Any "autogenerator" should check its products with the above mentioned
'groff', 'mandoc', and additionally with 'nroff ...'.

  It should also check its input files for too long (> 80) lines.

  This is just a simple quality control measure.

  The "autogenerator" may have to be corrected to get a better man page,
as may the source file and any additional files.

  Common defects:

  Not removing trailing spaces (in in- and output).
  The reason for these trailing spaces should be found and eliminated.

  "git" has a "tool" to point out whitespace,
see for example "git-apply(1)" and "git-config(1)".

  Not beginning each input sentence on a new line.
Line length and patch size should thus be reduced.

  The script "reportbug" uses 'quoted-printable' encoding when a line is
longer than 1024 characters in an 'ascii' file.

  See man-pages(7), item "semantic newline".

-.-

The difference between the formatted output of the original and patched file
can be seen with:

  nroff -mandoc <file1> > <out1>
  nroff -mandoc <file2> > <out2>
  diff -d -u <out1> <out2>

and for groff, using

"printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "

instead of 'nroff -mandoc'.

  Add the option '-t', if the file contains a table.

  Read the output from 'diff -d -u ...' with 'less -R' or similar.
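
  The steps above can be wrapped in a small function.
This is only a sketch: the function name and the variable
MAN_FORMATTER are invented for the example; the formatter defaults to
'nroff -mandoc' as in the commands above.

```shell
# Compare the formatted output of two man page sources.
# MAN_FORMATTER (hypothetical) selects the formatter command.
man_format_diff() {
    fmt=${MAN_FORMATTER:-nroff -mandoc}
    out1=$(mktemp) && out2=$(mktemp) || return 2
    $fmt "$1" > "$out1"        # $fmt is split into command and options
    $fmt "$2" > "$out2"
    diff -d -u "$out1" "$out2"
    status=$?
    rm -f "$out1" "$out2"
    return $status             # 0: no difference, 1: output differs
}
```

Usage: "man_format_diff gendict.1 gendict.1.new | less -R".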

-.-.

  If 'man' (man-db) is used to check the manual for warnings,
the following must be set:

  The option "--warnings=w"

  The environment variable:

export MAN_KEEP_STDERR=yes (or any non-empty value)

  or

  (produce only warnings):

export MANROFFOPT="-ww -b -z"

export MAN_KEEP_STDERR=yes (or any non-empty value)

-.-
