[CC -= Ingo, as requested] Hi Bjarni,
On 4/27/23 03:40, Bjarni Ingi Gislason wrote: > > "groff" is not the right tool for such things, but "grep" is. It could work for an initial implementation. It would only have some false positives for things like defining your own macros at the top of a page, but that's not something I do, so I could live with it. > > The attachment contains a shell script that tests various cases of > defects in man pages. > > It can test for just one or few cases or all of them. > > For example create a file with > > foo. bar > foo. bar > foo. Bar > foo. Bar > > or more examples > > and run > > <name of script> all <file> $ ./semantic_newlines all ./foo.man checking test case nr 7 ./semantic_newlines: line 803: 3*/4: syntax error: operand expected (error token is "/4") ./semantic_newlines: line 1450: mjd: command not found ./semantic_newlines: line 1470: /semantic_newlines.all.diff.new..112419: Permission denied sed: couldn't open file /home/alx/bin/groff.comment.sed: No such file or directory sed: couldn't open file /home/alx/bin/groff.comment.sed: No such file or directory Input file is ./foo.man, case 1 I guess this script has dependencies that I don't know about? $ apt-file find bin/mjd $ > > Later you can use the reported test numbers to just run those tests. > > The script can (still) produce a lot of wrong positive results. > Regarding the script itself (and its documentation), here goes some review: > #!/bin/bash > # Input > # 1) one number, one or more files > # 2) "all", one or more files What's the meaning of the number? I'd appreciate a man-page-like documentation, hopefully available via -h. See what I mean: <https://github.com/nginx/unit/blob/1a485fed6a8353ecc09e6c0f050e44c0a2d30419/tools/setup-unit#L827> > > # In $SEDLIB: "groff.comment.sed", "groff.TH.sed", "groff.hyphen-minus.sed", > # "check_manuals", "strings_gt" > # > # "chk_manuals" uses: "in_out_put.sh", "mandoc", "groff.lint", and > # "roff.singleword.sed" > > # Environmental variable: MANWIDTH (see man(1)) with 'm' unit > # > # Instead of "test-groff" (in the git repository) I don't understand the above (expect for MANWIDTH). > > *) if test "$type" -gt $total; then > echo "$Cmd_name: test number \"$type\" is greater than defined > \($total\)" >&2 > exit 1 > else > diff_file=${TMPDIR}/type.${type}.diff.new.$time > prof_listi="$type" > fi else is unnecessary after exit. Conditional code is harder to read, since the brain needs to remember one more thing, so I suggest making that code unconditional. > > if eval eval \""\${command[${prof}]}"\" "$input" > "$tempfile" ; then > # eval printf "'%s\n\n'" \"Test nr. ${prof}: \${do_what[${prof}]}\" > # if test "$old_filename" != "$file" ; then > if test "$filename_printed" = 'no'; then > printf '%s\n' "Input file is $input, case 1" > filename_printed=yes > fi > if test "$print_test_nr" = 'yes'; then > printf '%s\n' "Test nr. ${prof}:" > fi > eval printf "'\n%s\n\n' \"\${patch_explain[${prof}]}\"" > eval printf "'%s\n\n' \"\${do_what[${prof}]}\"" >> > "$TMPDIR/${file##*/}".summary > case "$1" in > file) : > ;; > test) > if [ -s "$tempfile" ] ; then > : > # printf '####\n\n%s\n\n' "Input file is $file, case 2" > # printf '%s\n' '#### evaluate: case "test"' > fi > ;; > *) echo "$Cmd_name"': function evaluate: case "'"$1"'" is missing' >&2 > exit 1 > ;; > esac > # eval printf "'%s\n\n'" \"\${patch_explain[${prof}]}\"\ > # | tee -a "$diff_file" > cat "$tempfile" > printf '\n#####\n\n' > return 0 > else At this point, I already forgot what the condition was. Cache misses hurt in meatware too. When writing 2-branch conditionals, prefer writing the short branch first. It will help human readers avoid cache misses. Hopefully, it will also help the CPU avoid similar problems. Also, for when both branches are similarly long, putting the exceptional condition in the first branch usually results in faster and more readable code, since the else branch usually loads less instructions[1]. > return 1 > fi On top of that, this branch has a return, so when reversed, the other branch wouldn't even need to be within an else, which reduces the complexity even more. [1]: if (A == exceptional_value) f(); else g(); translates into load A; if not exceptional_value, jump to else; f(); jump to done; else: g(); done: This kind of micro-optimization is probably too much to be considered sane, but I find that it also optimizes for human brains reading the code (at least for mine). Cheers! Alex -- <http://www.alejandro-colomar.es/> GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
OpenPGP_signature
Description: OpenPGP digital signature