On Thu, Apr 27, 2023 at 11:49:07AM +0200, Alejandro Colomar wrote: > [CC -= Ingo, as requested] > > Hi Bjarni, > > On 4/27/23 03:40, Bjarni Ingi Gislason wrote: > > > > "groff" is not the right tool for such things, but "grep" is. > > It could work for an initial implementation. It would only have > some false positives for things like defining your own macros at > the top of a page, but that's not something I do, so I could live > with it. > > > > > The attachment contains a shell script that tests various cases of > > defects in man pages. > > > > It can test for just one or few cases or all of them. > >
It is just for one case (testing all files of concern for the same issue) or all tests for one file (or more). > > For example create a file with > > > > foo. bar > > foo. bar > > foo. Bar > > foo. Bar > > > > or more examples > > > > and run > > > > <name of script> all <file> > > $ ./semantic_newlines all ./foo.man > checking test case nr 7 > ./semantic_newlines: line 803: 3*/4: syntax error: operand expected (error > token is "/4") > ./semantic_newlines: line 1450: mjd: command not found > ./semantic_newlines: line 1470: /semantic_newlines.all.diff.new..112419: > Permission denied > sed: couldn't open file /home/alx/bin/groff.comment.sed: No such file or > directory > sed: couldn't open file /home/alx/bin/groff.comment.sed: No such file or > directory > Input file is ./foo.man, case 1 > > > I guess this script has dependencies that I don't know about? > I forgot to add the two missing files, "groff.comment.sed" and "stings_gt". They are in the attachment. > $ apt-file find bin/mjd > $ > > > > > Later you can use the reported test numbers to just run those tests. > > > > The script can (still) produce a lot of wrong positive results. > > > > Regarding the script itself (and its documentation), here goes some > review: > > Thanks, the script has not been "cleaned" and documentation (if any) is only in the script. This was an evolving task; tests added (or corrected) (with the prototype near the end of the script) as issues were recognised. > > #!/bin/bash > > # Input > > # 1) one number, one or more files > > # 2) "all", one or more files > > What's the meaning of the number? I'd appreciate a man-page-like > documentation, hopefully available via -h. See what I mean: > > <https://github.com/nginx/unit/blob/1a485fed6a8353ecc09e6c0f050e44c0a2d30419/tools/setup-unit#L827> > The script was only for myself, and '-h' is missing, as only one argument is needed, and if missing an reminder is shown. > > > > # In $SEDLIB: "groff.comment.sed", "groff.TH.sed", "groff.hyphen-minus.sed", > > # "check_manuals", "strings_gt" > > # > > # "chk_manuals" uses: "in_out_put.sh", "mandoc", "groff.lint", and > > # "roff.singleword.sed" > > > > # Environmental variable: MANWIDTH (see man(1)) with 'm' unit > > # > > # Instead of "test-groff" (in the git repository) > > I don't understand the above (expect for MANWIDTH). > These are related to another scripts. > > > > *) if test "$type" -gt $total; then > > echo "$Cmd_name: test number \"$type\" is greater than defined > > \($total\)" >&2 > > exit 1 > > else > > diff_file=${TMPDIR}/type.${type}.diff.new.$time > > prof_listi="$type" > > fi > > else is unnecessary after exit. Conditional code is harder to read, > since the brain needs to remember one more thing, so I suggest making > that code unconditional. > > > > > if eval eval \""\${command[${prof}]}"\" "$input" > "$tempfile" ; then > > # eval printf "'%s\n\n'" \"Test nr. ${prof}: \${do_what[${prof}]}\" > > # if test "$old_filename" != "$file" ; then > > if test "$filename_printed" = 'no'; then > > printf '%s\n' "Input file is $input, case 1" > > filename_printed=yes > > fi > > if test "$print_test_nr" = 'yes'; then > > printf '%s\n' "Test nr. ${prof}:" > > fi > > eval printf "'\n%s\n\n' \"\${patch_explain[${prof}]}\"" > > eval printf "'%s\n\n' \"\${do_what[${prof}]}\"" >> > > "$TMPDIR/${file##*/}".summary > > case "$1" in > > file) : > > ;; > > test) > > if [ -s "$tempfile" ] ; then > > : > > # printf '####\n\n%s\n\n' "Input file is $file, case 2" > > # printf '%s\n' '#### evaluate: case "test"' > > fi > > ;; > > *) echo "$Cmd_name"': function evaluate: case "'"$1"'" is missing' > > >&2 > > exit 1 > > ;; > > esac > > # eval printf "'%s\n\n'" \"\${patch_explain[${prof}]}\"\ > > # | tee -a "$diff_file" > > cat "$tempfile" > > printf '\n#####\n\n' > > return 0 > > else > > At this point, I already forgot what the condition was. Cache misses > hurt in meatware too. When writing 2-branch conditionals, prefer > writing the short branch first. It will help human readers avoid > cache misses. Hopefully, it will also help the CPU avoid similar > problems. > > Also, for when both branches are similarly long, putting the > exceptional condition in the first branch usually results in faster > and more readable code, since the else branch usually loads less > instructions[1]. > > > return 1 > > fi > > On top of that, this branch has a return, so when reversed, the > other branch wouldn't even need to be within an else, which > reduces the complexity even more. > > > > [1]: > > if (A == exceptional_value) > f(); > else > g(); > > > translates into > > load A; > if not exceptional_value, jump to else; > > f(); > jump to done; > else: > g(); > done: > > > > This kind of micro-optimization is probably too much to be considered > sane, but I find that it also optimizes for human brains reading the > code (at least for mine). > > Cheers! > Alex > > > -- > <http://www.alejandro-colomar.es/> > GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
s/^\([.'][\]["]\).*$/\1/
#!/bin/sh set -efu Cmd_name=${0##*/} #echo $Cmd_name; exit 1 #LIB=$HOME/bin #. $LIB/in_out_put.sh status=0 limit=$1 shift #temp_file=$(mktemp -t ${Cmd_name}.XXXXXX) temp_out=$(mktemp -t ${Cmd_name}.XXXXXX) #cat ${1:--} | tr -s ' ' '\n' > $temp_file awk \ 'BEGIN { ref_length = '"$limit"' } {for (i = 1; i <= NF; i++) if (length($i) > ref_length) print NR, $0 }' ${1:--} > $temp_out if test -s $temp_out ; then cat $temp_out status=0 else status=1 fi rm $temp_out exit $status