On 27/03/2024 12:03, Bruno Haible wrote:
The tests/od/od-multiple-t.sh test fails on
- Linux/ia64 (Gentoo, recent glibc, real hardware)
- Linux/m68k (Debian 12, QEMU emulated)
The first 'od' invocation that causes fail=1 to be set is:
$ seq 19 > input
$ ./od -An -taz -tfLz input
1 nl 2 nl 3 nl 4 nl 5 nl 6 nl >1.2.3.4.5.6.<
Fatal glibc error: printf_fp.c:501 (__printf_fp_buffer_1): assertion failed: cy == 1 || (p.frac[p.fracsize - 2] == 0 && p.frac[0] == 0)
Aborted
Debugging this on Linux/ia64:
The stack trace from this abort is:
snprintf () from /lib/libc.so.6.1
ldtoastr (buf=0x600ffffffffee9e0 "\340\t\004", bufsize=45, flags=0, width=0, x=<invalid float value>) at ../lib/ftoastr.c:145
print_long_double (fields=1, blank=0, block=0x20000008000409d0, fmt_string=0x2000000800040988 "O-8859-1\n#\001f\035", width=29, pad=35) at ../src/od.c:514
...
main
and it is this snprintf invocation that crashes.
AFAICS, the problem is that in the function print_long_double,
in line 465, a (const void *) gets cast to a 'const long double *',
then a 'long double' value gets fetched from this location.
gdb displays it as <invalid float value>.
And it is during this snprintf:
145 int n = snprintf (buf, bufsize, format, width, prec, promoted_x);
45 "%*.*Lg" 0 18 <invalid float value>
that snprintf crashes.
In other words, the test invokes undefined behaviour. And glibc
rightfully crashes. Cf. [1]
"Formatting noncanonical ‘long double’ numbers produces nonmeaningful
results on some platforms: glibc and others, on x86, x86_64, IA-64 CPUs."
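For illustration, here is a minimal sketch of how such a noncanonical value could be recognized from its raw bytes, without ever loading it as a 'long double'. It assumes the little-endian 80-bit x87 extended layout used on x86 and x86_64 (a 64-bit significand with an explicit integer bit, followed by a 15-bit exponent and a sign bit). The m68k extended format is laid out differently and would need a different check. The helper name x87_bytes_are_canonical is hypothetical; it is not part of coreutils or gnulib.

```c
#include <stdbool.h>

/* Hypothetical helper, for illustration only.
   B points to the 10 significant bytes of a little-endian x87
   extended-precision value:
     b[0..7]  64-bit significand, whose top bit is the explicit integer bit
     b[8..9]  15-bit biased exponent, plus the sign bit in the top bit of b[9]
   Return true if the encoding is canonical:
     - exponent == 0 (zero or denormal): the integer bit must be clear;
     - any other exponent (normal, infinity, NaN): the integer bit must be set.
   Everything else (pseudo-denormal, unnormal, pseudo-infinity, pseudo-NaN)
   is noncanonical.  */
static bool
x87_bytes_are_canonical (const unsigned char b[10])
{
  unsigned int exponent = ((unsigned int) (b[9] & 0x7f) << 8) | b[8];
  bool integer_bit = (b[7] & 0x80) != 0;

  return exponent == 0 ? !integer_bit : integer_bit;
}
```

Under these assumptions, the first ten bytes of the 'seq 19' output ("1\n2\n3\n4\n5\n") are classified as noncanonical: the exponent field is 0x0a35, which is nonzero, while the integer bit is clear.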
Please fix the unit test. It is time-consuming to debug because gdb
does not work well on these platforms.
Bruno
[1] https://www.gnu.org/software/gnulib/manual/html_node/snprintf.html
It's a bit surprising that it aborts here, but yes, we should avoid the
undefined behavior.
The original bug report was not specific to floats, so I'll just remove them
from the test.
For reference the test was added in 2008 with:
https://github.com/coreutils/coreutils/commit/46a811b9e
thanks!
Pádraig