https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806

--- Comment #45 from Alexander Cherepanov <ch3root at openwall dot com> ---
(In reply to Vincent Lefèvre from comment #44)
> (In reply to Alexander Cherepanov from comment #43)
> > GCC on x86-64 uses the binary encoding for the significand.
> 
> In general, yes. This includes the 32-bit ABI under Linux. But it seems to
> be different under MS-Windows, at least with MinGW using the 32-bit ABI:
> according to my tests of MPFR,
> 
> MPFR config.status 4.1.0-dev
> configured by ./configure, generated by GNU Autoconf 2.69,
>   with options "'--host=i686-w64-mingw32' '--disable-shared'
> '--with-gmp=/usr/local/gmp-6.1.2-mingw32' '--enable-assert=full'
> '--enable-thread-safe' 'host_alias=i686-w64-mingw32'"
> [...]
> CC='i686-w64-mingw32-gcc'
> [...]
> [tversion] Compiler: GCC 8.3-win32 20191201
> [...]
> [tversion] TLS = yes, float128 = yes, decimal = yes (DPD), GMP internals = no
> 
> i.e. GCC uses DPD instead of the usual BID.

Strange; I tried the MinGW cross-compilers from stable Debian on x86-64 and they behave the same way as the native gcc:

$ echo 'int main() { return (union { _Decimal32 d; int i; }){0.df}.i; }' > test.c

$ gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 847249408;
$ gcc --version | head -n 1
gcc (Debian 8.3.0-6) 8.3.0

$ x86_64-w64-mingw32-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 847249408;
$ x86_64-w64-mingw32-gcc --version | head -n 1
x86_64-w64-mingw32-gcc (GCC) 8.3-win32 20190406

$ i686-w64-mingw32-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 847249408;
$ i686-w64-mingw32-gcc --version | head -n 1
i686-w64-mingw32-gcc (GCC) 8.3-win32 20190406

Plus some other cross-compilers:

$ powerpc64-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 575668224;
$ powerpc64-linux-gnu-gcc --version | head -n 1
powerpc64-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

$ powerpc64le-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 575668224;
$ powerpc64le-linux-gnu-gcc --version | head -n 1
powerpc64le-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

$ s390x-linux-gnu-gcc -O3 -fdump-tree-optimized=test.out test.c && grep -h return test.out
  return 575668224;
$ s390x-linux-gnu-gcc --version | head -n 1
s390x-linux-gnu-gcc (Debian 8.3.0-2) 8.3.0

AIUI, the value 847249408 (= 0x32800000) is the correct encoding of 0.df with BID, and
575668224 (= 0x22500000) is the correct encoding with DPD.
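
For reference, here is a minimal sketch (assuming sizeof(_Decimal32) == 4) of the same check as the union trick above: it prints the object representation of 0.df, which should come out as 0x32800000 under BID and 0x22500000 under DPD.

----------------------------------------------------------------------
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    _Decimal32 d = 0.df;
    uint32_t bits;

    /* inspect the object representation of the decimal zero */
    memcpy(&bits, &d, sizeof bits);

    printf("0.df is encoded as 0x%08" PRIX32 " (%s)\n", bits,
           bits == 0x32800000u ? "BID" :
           bits == 0x22500000u ? "DPD" : "unknown");
    return 0;
}
----------------------------------------------------------------------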

> > So the first question: does any platform (that gcc supports) use the decimal
> > encoding for the significand (aka densely packed decimal encoding)?
> 
> DPD is also used on PowerPC (at least the 64-bit ABI), as these processors
> now have hardware decimal support.

Oh, this means that cohorts differ by platform.

> > Then, the rules about (non)propagation of some encodings blur the boundary
> > between values and representations in C. In particular this means that
> > different encodings are _not_ equivalent. Take for example the optimization
> > `x == C ? C + 0 : x` -> `x` for a constant C that is the unique member of
> > its cohort and that has non-canonical encodings (C is an infinity according
> > to the above analysis). Not sure about encoding of literals but the result
> > of addition `C + 0` is required to have canonical encoding. If `x` has
> > non-canonical encoding then the optimization is invalid.
> 
> In C, it is valid to choose any possible encoding. Concerning the IEEE 754
> conformance, this depends on the bindings. But IEEE 754 does not define the
> ternary operator. It depends whether C considers encodings before or
> possibly after optimizations (in the C specification, this does not matter,
> but when IEEE 754 is taken into account, there may be more restrictions).

The ternary operator is not important; let's replace it with `if`:

----------------------------------------------------------------------
#include <math.h>

_Decimal32 f(_Decimal32 x)
{
    _Decimal32 inf = (_Decimal32)INFINITY + 0;

    if (x == inf)
        return inf;
    else
        return x;
}
----------------------------------------------------------------------

This is optimized into just `return x;`.
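
Here is a hedged sketch of how the folding could be observed. The bit patterns are my assumption about the decimal32 layout: canonical +inf is 0x78000000, and setting a trailing-significand bit (0x78000001) gives a non-canonical encoding of the same value. If f() is folded to `return x;`, the stray bit survives, whereas the `return inf;` branch would have produced the canonical encoding.

----------------------------------------------------------------------
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

_Decimal32 f(_Decimal32 x)
{
    _Decimal32 inf = (_Decimal32)INFINITY + 0;

    if (x == inf)
        return inf;
    else
        return x;
}

int main(void)
{
    /* assumed: canonical +inf in decimal32 is 0x78000000; setting a
       trailing-significand bit gives a non-canonical encoding of +inf */
    uint32_t noncanon = 0x78000001u, out;
    _Decimal32 x, y;

    memcpy(&x, &noncanon, sizeof x);
    y = f(x);
    memcpy(&out, &y, sizeof out);

    /* if f() was folded to `return x;`, the stray bit survives here,
       whereas `return inf;` would have produced 0x78000000 */
    printf("f(0x%08" PRIX32 ") returned 0x%08" PRIX32 "\n", noncanon, out);
    return 0;
}
----------------------------------------------------------------------

One can also compile this with -O3 -fdump-tree-optimized, as in the transcripts above, to see whether the body of f() has been reduced to `return x;`.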

> > While at it, convertFormat is required to return canonical encodings, so
> > after `_Decimal32 x = ..., y = (_Decimal32)(_Decimal64)x;` `y` has to have
> > canonical encoding? But these casts are nop in gcc now.
> 
> A question is whether casts are regarded as explicit convertFormat
> operations 

N2478, a recent draft of C2x, lists the bindings in F.3, and "convertFormat -
different formats" corresponds to "cast and implicit conversions". Is this
enough?

BTW "convertFormat - same format" corresponds to "canonicalize", so I guess a
cast to the same type is not required to canonicalize.
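
To make the question concrete, here is a small sketch, again assuming the decimal32 bit patterns used above (0x78000001 as a non-canonical encoding of +inf): if the casts are real convertFormat operations, the round trip has to hand back a canonical encoding; if GCC folds the casts to a no-op, the stray bit survives.

----------------------------------------------------------------------
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* assumed non-canonical encoding of +inf in decimal32 */
    uint32_t noncanon = 0x78000001u, bits;
    _Decimal32 x, y;

    memcpy(&x, &noncanon, sizeof x);

    /* convertFormat up to _Decimal64 and back down to _Decimal32 */
    y = (_Decimal32)(_Decimal64)x;

    memcpy(&bits, &y, sizeof bits);
    printf("after the round trip: 0x%08" PRIX32 "\n", bits);
    return 0;
}
----------------------------------------------------------------------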

> or whether simplification is allowed as it does not affect the
> value, in which case the canonicalize() function would be needed here. 

Not sure what this means.

> And
> in any case, when FP contraction is enabled, I suppose that
> (_Decimal32)(_Decimal64)x can be regarded as x.

Perhaps this PR is not the ideal place for such discussions, but the DFP problems
we are seeing happen in a standards-compliant mode.
