Re: Warn on semantic newlines

Bjarni Ingi Gislason Thu, 27 Apr 2023 17:25:58 -0700

On Thu, Apr 27, 2023 at 11:49:07AM +0200, Alejandro Colomar wrote:
> [CC -= Ingo, as requested]
> 
> Hi Bjarni,
> 
> On 4/27/23 03:40, Bjarni Ingi Gislason wrote:
> > 
> >   "groff" is not the right tool for such things, but "grep" is.
> 
> It could work for an initial implementation.  It would only have
> some false positives for things like defining your own macros at
> the top of a page, but that's not something I do, so I could live
> with it.
> 
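
  As a minimal sketch of the "grep" approach (not taken from the
attached script; the function name and the pattern are assumptions
made for illustration), the following reports lines that are not roff
control lines and in which sentence-ending punctuation is followed by
more text on the same line.  It also flags abbreviations such as
"e.g. foo", so false positives are to be expected.

```shell
#!/bin/sh
# Hypothetical helper, not part of the attached script.
# Report lines that do not start with a roff control character
# ("." or "'") and that contain ".", "?" or "!" followed by
# whitespace and more text, i.e. a sentence that ends mid-line.
find_midline_sentence_ends() {
    grep -nE "^[^.'].*[.?!][[:space:]]+[^[:space:]]" -- "$@"
}

# Usage (prints "line number:line" for every suspicious line):
#   find_midline_sentence_ends ./foo.man
```

  Like grep itself, the function returns a non-zero status when
nothing is found, so it can be used directly in a condition.
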
> > 
> >   The attachment contains a shell script that tests various cases of
> > defects in man pages.
> > 
> >   It can test for just one or few cases or all of them.
> >


  It runs either a single test case (checking all files of concern
for the same issue) or all tests on one or more files.

> >   For example create a file with
> > 
> > foo. bar
> > foo.  bar
> > foo.  Bar
> > foo. Bar
> > 
> >   or more examples
> > 
> > and run 
> > 
> > <name of script> all <file>
> 
> $ ./semantic_newlines all ./foo.man 
> checking test case nr 7
> ./semantic_newlines: line 803: 3*/4: syntax error: operand expected (error token is "/4")
> ./semantic_newlines: line 1450: mjd: command not found
> ./semantic_newlines: line 1470: /semantic_newlines.all.diff.new..112419: Permission denied
> sed: couldn't open file /home/alx/bin/groff.comment.sed: No such file or directory
> sed: couldn't open file /home/alx/bin/groff.comment.sed: No such file or directory
> Input file is ./foo.man, case 1
> 
> 
> I guess this script has dependencies that I don't know about?
> 

  I forgot to add the two missing files, "groff.comment.sed" and
"strings_gt".
They are in the attachment.

> $ apt-file find bin/mjd
> $ 
> 
> > 
> >   Later you can use the reported test numbers to just run those tests.
> > 
> >   The script can (still) produce a lot of wrong positive results.
> > 
> 
> Regarding the script itself (and its documentation), here goes some
> review:
> 
> 

  Thanks, the script has not been cleaned up, and the documentation
(if any) exists only in the script itself.

  This was an evolving task; tests were added (or corrected) (with the
prototype near the end of the script) as issues were recognised.

> > #!/bin/bash
> > # Input
> > # 1) one number, one or more files
> > # 2) "all", one or more files
> 
> What's the meaning of the number?  I'd appreciate a man-page-like
> documentation, hopefully available via -h.  See what I mean:
> 
> <https://github.com/nginx/unit/blob/1a485fed6a8353ecc09e6c0f050e44c0a2d30419/tools/setup-unit#L827>
> 

  The script was only for myself, and '-h' is missing, as only
one argument is needed, and if it is missing a reminder is shown.
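
  A sketch of how '-h' and the reminder could share one usage text.
The function names and the option handling are assumptions made for
illustration; only the two input forms (a number or "all", followed
by one or more files) come from the script's header comment.

```shell
#!/bin/sh
# Hypothetical sketch, not the script's actual code.
usage() {
    printf 'Usage: %s <test number> <file>...\n' "$1"
    printf '       %s all <file>...\n' "$1"
}

# check_args <command name> [arguments...]
# Print the usage text and exit for "-h" or a missing first argument;
# otherwise return so that the caller can go on.
check_args() {
    cmd=$1
    shift
    case "${1-}" in
    -h)
        usage "$cmd"
        exit 0
        ;;
    '')
        usage "$cmd" >&2
        exit 1
        ;;
    esac
}

# Usage near the top of a script:
#   check_args "${0##*/}" "$@"
```
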

> > 
> > # In $SEDLIB: "groff.comment.sed", "groff.TH.sed", "groff.hyphen-minus.sed",
> > # "check_manuals", "strings_gt"
> > #
> > # "chk_manuals" uses: "in_out_put.sh", "mandoc", "groff.lint", and
> > # "roff.singleword.sed" 
> > 
> > # Environmental variable: MANWIDTH (see man(1)) with 'm' unit
> > #
> > # Instead of "test-groff" (in the git repository)
> 
> I don't understand the above (except for MANWIDTH).
> 

  These are related to other scripts.

> > 
> > *) if test "$type" -gt $total; then
> >      echo "$Cmd_name: test number \"$type\" is greater than defined \($total\)" >&2
> >      exit 1
> >    else
> >      diff_file=${TMPDIR}/type.${type}.diff.new.$time
> >      prof_listi="$type"
> >    fi
> 
> else is unnecessary after exit.  Conditional code is harder to read,
> since the brain needs to remember one more thing, so I suggest making
> that code unconditional.
> 
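
  A sketch of that suggestion, assuming the check is moved into a
function ("select_test" is a hypothetical name, and "return" stands in
for the original "exit" so that the function can be called more than
once).  The variable names follow the quoted code.

```shell
#!/bin/sh
# Hypothetical rewrite of the quoted branch, not the script's code.
# select_test <test number> <total number of tests>
select_test() {
    type=$1
    total=$2
    if test "$type" -gt "$total"; then
        echo "select_test: test number \"$type\" is greater than defined ($total)" >&2
        return 1
    fi
    # Reached only when the check above passed, so no "else" is needed.
    prof_listi=$type
    printf '%s\n' "$prof_listi"
}

# Usage:
#   select_test 7 40    # prints 7
#   select_test 41 40   # prints an error on stderr, returns 1
```
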
> > 
> >     if eval eval \""\${command[${prof}]}"\" "$input" > "$tempfile"  ; then
> > #      eval printf "'%s\n\n'" \"Test nr. ${prof}: \${do_what[${prof}]}\"
> > #      if test  "$old_filename" != "$file" ; then
> >       if test "$filename_printed" = 'no'; then
> >         printf '%s\n' "Input file is $input, case 1"
> >         filename_printed=yes
> >       fi
> >       if test "$print_test_nr" = 'yes'; then
> >         printf '%s\n' "Test nr. ${prof}:"
> >       fi
> >       eval printf "'\n%s\n\n' \"\${patch_explain[${prof}]}\""
> >       eval printf "'%s\n\n' \"\${do_what[${prof}]}\""  >> "$TMPDIR/${file##*/}".summary
> >       case "$1" in
> >         file) :
> >       ;;
> >         test)
> >           if [ -s "$tempfile" ] ; then
> >         :
> > #            printf '####\n\n%s\n\n' "Input file is $file, case 2"
> > #             printf '%s\n' '#### evaluate: case "test"'
> >           fi
> >       ;;
> >         *) echo "$Cmd_name"': function evaluate: case "'"$1"'" is missing' >&2
> >            exit 1
> >       ;;
> >       esac
> > #      eval printf "'%s\n\n'" \"\${patch_explain[${prof}]}\"\
> > #        | tee -a  "$diff_file"
> >       cat "$tempfile"
> >       printf '\n#####\n\n'
> >       return 0
> >     else
> 
> At this point, I already forgot what the condition was.  Cache misses
> hurt in meatware too.  When writing 2-branch conditionals, prefer
> writing the short branch first.  It will help human readers avoid
> cache misses.  Hopefully, it will also help the CPU avoid similar
> problems.
> 
> Also, for when both branches are similarly long, putting the
> exceptional condition in the first branch usually results in faster
> and more readable code, since the else branch usually loads less
> instructions[1].
> 
> >       return 1
> >     fi
> 
> On top of that, this branch has a return, so when reversed, the
> other branch wouldn't even need to be within an else, which
> reduces the complexity even more.
> 
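
  A sketch of the reversed form, with the short failing branch first
and the long part no longer inside an "else".  The function
"report_matches" and its use of "grep" are assumptions made for
illustration; they stand in for the "eval" of a test command in the
quoted code.

```shell
#!/bin/sh
# Hypothetical stand-in for the quoted function, not the script's code.
# report_matches <pattern> <input file>
report_matches() {
    pattern=$1
    input=$2
    tempfile=$(mktemp) || return 2

    # Short, exceptional branch first: nothing found, leave early.
    if ! grep -n -e "$pattern" -- "$input" > "$tempfile"; then
        rm -f "$tempfile"
        return 1
    fi

    # The long part runs unconditionally from here on.
    printf '%s\n' "Input file is $input"
    cat "$tempfile"
    printf '\n#####\n\n'
    rm -f "$tempfile"
    return 0
}

# Usage:
#   report_matches 'foo' ./foo.man
```
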
> 
> 
> [1]:  
> 
> if (A == exceptional_value)
>       f();
> else
>       g();
> 
> 
> translates into
> 
>       load A;
>       if not exceptional_value, jump to else;
> 
>       f();
>       jump to done;
> else:
>       g();
> done:
> 
> 
> 
> This kind of micro-optimization is probably too much to be considered
> sane, but I find that it also optimizes for human brains reading the
> code (at least for mine).
> 
> Cheers!
> Alex
> 
> 
> -- 
> <http://www.alejandro-colomar.es/>
> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

s/^\([.'][\]["]\).*$/\1/

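#!/bin/sh
# Print every input line that contains a whitespace-separated field
# longer than a given limit.
# Arguments: <limit> [file]   (standard input is read if no file is given)
# Exit status: 0 if at least one such line was found, 1 otherwise.
set -efu

Cmd_name=${0##*/}
#echo $Cmd_name; exit 1
#LIB=$HOME/bin
#. $LIB/in_out_put.sh
status=0
limit=$1
shift

#temp_file=$(mktemp -t ${Cmd_name}.XXXXXX)
temp_out=$(mktemp -t "${Cmd_name}.XXXXXX")

#cat ${1:--} | tr -s '  ' '\n' > $temp_file

# The limit is passed with -v instead of being spliced into the program.
# A line is printed once for each field that exceeds the limit.
awk -v ref_length="$limit" \
'{for (i = 1; i <= NF; i++)
if (length($i) > ref_length) print NR, $0
}' "${1:--}" > "$temp_out"

if test -s "$temp_out" ; then
  cat "$temp_out"
  status=0
else
  status=1
fi

rm "$temp_out"

exit $status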
