Paul Eggert wrote:
> > - POSIX violation or not? Is it valid to pass lines with missing fields
> >   to 'join', according to POSIX [1]?
>
> It should be valid, yes. POSIX 'join' defers to POSIX 'sort' for the
> definition of fields, and POSIX 'sort' says missing fields should be
> treated as empty.

Thanks for explaining. Also, POSIX [1] says:
  "Some historical implementations have been encountered where a blank line
   in one of the input files was considered to be the end of the file; the
   description in this volume of POSIX.1-2017 does not cite this as an
   allowable case."

> >> Then, would it make sense to document it in the GNU Autoconf manual? [2]
>
> Sure, I installed the attached patch to the Autoconf manual.

Thanks! I see that macOS 12.6, FreeBSD 14.0, and NetBSD 9.3 have the bug,
whereas OpenBSD does not have it (already at least since OpenBSD 3.8, which
was in 2005).

Now, back to gnulib-tool. I'm committing this patch below, that rejects a
broken 'join' program.

It would be possible to obey a variable named JOIN, via "${JOIN-join}" instead
of 'join'. But that adds complexity, and we don't have a variable named SED
in gnulib-tool either.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/join.html


2024-01-11  Bruno Haible  <br...@clisp.org>

	gnulib-tool: Reject broken 'join' program as seen in macOS, FreeBSD etc.
	Reported by Avinash Sonawane <root...@gmail.com> in
	<https://lists.gnu.org/archive/html/bug-gnulib/2024-01/msg00028.html>.
	* gnulib-tool: Move the func_gnulib_dir and func_tmpdir invocations
	ahead. If the 'join' program exists but does not handle missing fields,
	bail out.

diff --git a/gnulib-tool b/gnulib-tool
index b909a81f7a..9facfd2be7 100755
--- a/gnulib-tool
+++ b/gnulib-tool
@@ -894,15 +894,6 @@ func_hardlink ()
   }
 }
 
-# The 'join' program does not exist on all platforms. Where it exists,
-# we can use it. Where not, bail out.
-if (type join) >/dev/null 2>&1; then
-  :
-else
-  echo "$progname: 'join' program not found. Consider installing GNU coreutils." >&2
-  func_exit 1
-fi
-
 # Ensure an 'echo' command that
 #   1. does not interpret backslashes and
 #   2. does not print an error message "broken pipe" when writing into a pipe
@@ -1071,6 +1062,38 @@ if test "X$1" = "X--no-reexec"; then
   shift
 fi
 
+func_gnulib_dir
+func_tmpdir
+trap 'exit_status=$?
+      if test "$signal" != EXIT; then
+        echo "caught signal SIG$signal" >&2
+      fi
+      rm -rf "$tmp"
+      exit $exit_status' EXIT
+for signal in HUP INT QUIT PIPE TERM; do
+  trap '{ signal='$signal'; func_exit 1; }' $signal
+done
+signal=EXIT
+
+# The 'join' program does not exist on all platforms, and
+# on macOS 12.6, FreeBSD 14.0, NetBSD 9.3 it is buggy, see
+# <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232405>.
+# In these cases, bail out. Otherwise, we can use it.
+if (type join) >/dev/null 2>&1; then
+  echo a > "$tmp"/join-input-1
+  { echo; echo a; } > "$tmp"/join-input-2
+  if LC_ALL=C join "$tmp"/join-input-1 "$tmp"/join-input-2 | grep a >/dev/null \
+     && LC_ALL=C join "$tmp"/join-input-2 "$tmp"/join-input-1 | grep a >/dev/null; then
+    :
+  else
+    echo "$progname: 'join' program is buggy. Consider installing GNU coreutils." >&2
+    func_exit 1
+  fi
+else
+  echo "$progname: 'join' program not found. Consider installing GNU coreutils." >&2
+  func_exit 1
+fi
+
 # Unset CDPATH. Otherwise, output from 'cd dir' can surprise callers.
 (unset CDPATH) >/dev/null 2>&1 && unset CDPATH
 
@@ -1690,19 +1713,6 @@ func_determine_path_separator
   esac
 }
 
-func_gnulib_dir
-func_tmpdir
-trap 'exit_status=$?
-      if test "$signal" != EXIT; then
-        echo "caught signal SIG$signal" >&2
-      fi
-      rm -rf "$tmp"
-      exit $exit_status' EXIT
-for signal in HUP INT QUIT PIPE TERM; do
-  trap '{ signal='$signal'; func_exit 1; }' $signal
-done
-signal=EXIT
-
 # Note: The 'eval' silences stderr output in dash.
 if (declare -A x && { x[f/2]='foo'; x[f/3]='bar'; eval test '${x[f/2]}' = foo; }) 2>/dev/null; then
   # Zsh 4 and Bash 4 have associative arrays.
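
For anyone who wants to try the check outside of gnulib-tool, here is a rough standalone sketch of the same probe. It is not part of the patch: the function name join_handles_empty_lines is made up for illustration, it assumes that 'mktemp -d' is available (gnulib-tool uses func_tmpdir instead), and it takes the program name as an argument only to show what the "${JOIN-join}" variant mentioned above might look like.

```shell
#!/bin/sh
# Hypothetical helper, not part of gnulib-tool.
# Returns 0 if the given 'join' program (default: join) still produces the
# matching line "a" when one input file starts with an empty line,
# 1 if it does not, 2 if no temporary directory could be created.
join_handles_empty_lines ()
{
  join_prog=${1-join}
  # Assumption: 'mktemp -d' exists on this platform.
  dir=$(mktemp -d) || return 2
  echo a > "$dir/in1"
  { echo; echo a; } > "$dir/in2"
  # Same probe as in the patch: try both argument orders and look for "a".
  if LC_ALL=C "$join_prog" "$dir/in1" "$dir/in2" | grep a >/dev/null \
     && LC_ALL=C "$join_prog" "$dir/in2" "$dir/in1" | grep a >/dev/null; then
    status=0
  else
    status=1
  fi
  rm -rf "$dir"
  return $status
}
```

A caller could then write something like: join_handles_empty_lines "${JOIN-join}" || echo "'join' program is buggy or missing" >&2. Note that a missing program also ends up with status 1 here, whereas the patch distinguishes the two cases via '(type join)'.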