Re: [PATCH] totalorder, totalorderf, totalorderl: new modules

Bruno Haible Wed, 04 Oct 2023 03:59:54 -0700

> * Solaris
>   x86_64: ABIs 32, 64 on Solaris 10..., Solaris 11.4
>           except ABI 64 with cc on Solaris 10: FAIL: test-totalorderl


The cause of this test failure is that the initialization of xu, yu, zu
in

  union { unsigned long long i[2]; long double f; } volatile
    xu = {0}, yu = {0}, zu = {0};

does not happen as expected.

The compiler is
  cc: Sun C 5.8 Patch 121016-08 2009/04/20
and I'm using
  CC="cc -D_STDC_C99= -xarch=generic64 -O"

I tracked this down
  1) by printing which (i,j) pairs in the loop fail the assertion,
     namely usually i=11, j=11,
  2) by adding an assertion before the nested loop:
       ASSERT (!!TOTALORDER (&x[11], &x[11]));
  3) by single-stepping this invocation at the instruction level.

This piece of code:

  union { unsigned long long i[2]; long double f; } volatile
    xu = {0}, yu = {0}, zu = {0};
  xu.f = *x;
  yu.f = *y;

produces this sequence of instructions:

  401132:       db 2d 88 02 00 00       fldt   0x288(%rip)        # 4013c0 
<Drodata.rodata+0x20>
  401138:       db 7c 24 40             fstpt  0x40(%rsp)
  40113c:       db 2d 6e 02 00 00       fldt   0x26e(%rip)        # 4013b0 
<Drodata.rodata+0x10>
  401142:       db 7c 24 50             fstpt  0x50(%rsp)
  401146:       db 2d 54 02 00 00       fldt   0x254(%rip)        # 4013a0 
<Drodata.rodata>
  40114c:       db 7c 24 60             fstpt  0x60(%rsp)
  401150:       db 7c 24 40             fstpt  0x40(%rsp)
  401154:       db 7c 24 50             fstpt  0x50(%rsp)

and after their execution, I see that the upper 48 bits of each of xu, yu, zu
are not the expected zero bits:

(gdb) print *(void*(*)[2])($rsp+0x40)
$10 = {0xc000000000000000, 0xfffffd7fff217fff}
(gdb) print *(void*(*)[2])($rsp+0x50)
$11 = {0xc000000000000000, 0x407fff}
(gdb) print *(void*(*)[2])($rsp+0x60)
$12 = {0x0, 0xbee4f8b588e30000}

This patch makes the test succeed, and shouldn't slow it down on the other
platforms.


2023-10-04  Bruno Haible  <br...@clisp.org>

        totalorderl: Work around Solaris cc bug.
        * lib/totalorderl.c (totalorderl): Initialize xu, yu, zu using a
        different syntax.

diff --git a/lib/totalorderl.c b/lib/totalorderl.c
index 03aba3adf7..84fed3c4d1 100644
--- a/lib/totalorderl.c
+++ b/lib/totalorderl.c
@@ -26,17 +26,28 @@ totalorderl (long double const *x, long double const *y)
   /* Although the following is not strictly portable, and won't work
      on some obsolete platforms (e.g., PA-RISC, MIPS before alignment
      to IEEE 754-2008), it should be good enough nowadays.  */
+
+  /* If the sign bits of *X and *Y differ, the one with sign bit == 1
+     is "smaller" than the one with sign bit == 0.  */
   int xs = signbit (*x);
   int ys = signbit (*y);
   if (!xs != !ys)
     return xs;
+
+  /* If one of *X, *Y is a NaN and the other isn't, the answer is easy
+     as well: the negative NaN is "smaller", the positive NaN is "greater"
+     than the other argument.  */
   int xn = isnanl (*x);
   int yn = isnanl (*y);
   if (!xn != !yn)
     return !xn == !xs;
+  /* If none of *X, *Y is a NaN, the '<=' operator does the job, including
+     for -Infinity and +Infinity.  */
   if (!xn)
     return *x <= *y;
 
+  /* At this point, *X and *Y are NaNs with the same sign bit.  */
+
   unsigned long long extended_sign = -!!xs;
 
   if (sizeof (long double) <= sizeof (unsigned long long))
@@ -48,12 +59,23 @@ totalorderl (long double const *x, long double const *y)
       return (xu.i ^ extended_sign) <= (yu.i ^ extended_sign);
     }
 
-  union { unsigned long long i[2]; long double f; } volatile
-    xu = {0}, yu = {0}, zu = {0};
+  union { unsigned long long i[2]; long double f; } volatile xu, yu, zu;
+  /* It would be tempting to initialize xu, yu, zu with {0}.  But
+     Solaris cc (Sun C 5.8) on x86_64 miscompiles it: It initializes
+     only the lower 80 bits, not the entire 128 bits, of each of the
+     three variables.  */
+  xu.i[0] = 0; xu.i[1] = 0;
+  yu.i[0] = 0; yu.i[1] = 0;
+  zu.i[0] = 0; zu.i[1] = 0;
+
   xu.f = *x;
   yu.f = *y;
+
+  /* Determine in which of the two 'unsigned long long' words the sign bit
+     is located.  */
   zu.f = -zu.f;
   bool bigendian = !!zu.i[0];
+
   unsigned long long
     xhi = xu.i[!bigendian] ^ extended_sign,
     yhi = yu.i[!bigendian] ^ extended_sign,

Re: [PATCH] totalorder, totalorderf, totalorderl: new modules

Reply via email to