The attached testcase (stripped-down Perl code), compiled with:
gcc -fPIC -fno-strict-aliasing -pipe \
-O2 \
-g -o tc-lossings-floats tc-lossings-floats.c -Wall -mno-isel
produces the following output when executed:
|RESET: 252.000000 | 0.000000
|RESET: 504.000000 | 252.000000
|RESET: 756.000000 | 504.000000
|RESET: 1008.000000 | 756.000000
|RESET: 1260.000000 | 1008.000000
|RESET: 1512.000000 | 1260.000000
|RESET: 1764.000000 | 1512.000000
|RESET: 2016.000000 | 1764.000000
|RESET: 2268.000000 | 2016.000000
=> 2268.000000
|RESET: 2520.000000 | 0.000000
|RESET: 2772.000000 | 2520.000000
With -O1 instead of -O2:
|RESET: 252.000000 | 0.000000
|RESET: 504.000000 | 252.000000
|RESET: 756.000000 | 504.000000
|RESET: 1008.000000 | 756.000000
|RESET: 1260.000000 | 1008.000000
|RESET: 1512.000000 | 1260.000000
|RESET: 1764.000000 | 1512.000000
|RESET: 2016.000000 | 1764.000000
|RESET: 2268.000000 | 2016.000000
=> 2268.000000
|RESET: 2520.000000 | 2268.000000
|RESET: 2772.000000 | 2520.000000
The "=>" line sets the second value in the following "|RESET" line. With -O1 it
stays as it is; with -O2 it gets overwritten. In the original Perl code,
totally random values show up here.
Now it gets more interesting. The source compiled with -S, with the resulting
assembly file then assembled by gcc-4.3, does not show this problem. After
diffing the two resulting binaries I saw a difference in __floatdidf(), which
is called from Kino_OutStream_tell(). The variant which is linked in by the 4.3
compiler does not use floating point; instead it uses integer code which calls
other functions (__floatsidf, __muldf3, __floatunsidf, __adddf3) which also use
integer code. The 4.3 compiler was not built with --enable-e500_double.
After looking at the code, I now see the following:
10000c24 <__floatdidf>:
10000c6c: 11 23 1a 2c evmergehi r9,r3,r3
This function touches the complete 64-bit r9 register.
The code which calls it:
10000a40: 91 21 00 20 stw r9,32(r1)
10000a44: 4e 80 04 21 bctrl # tell function which in turn calls floatdidf
10000a7c: 81 21 00 20 lwz r9,32(r1)
10000a80: 38 60 00 00 li r3,0
10000a84: 7e 33 8b 78 mr r19,r17
10000a88: 12 49 92 e1 efdsub r18,r9,r18
10000a8c: 10 80 92 fa efdctsiz r4,r18
10000a90: 12 49 4a 17 evmr r18,r9
So only the lower 32 bits of r9 are saved (stw) and restored (lwz) around the
call. The upper 32 bits are clobbered inside __floatdidf, and efdsub then reads
the full 64-bit r9 with a stale upper half.
--
Summary: Wrong code with e500 double floating point
Product: gcc
Version: 4.4.4
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: gcc at breakpoint dot cc
GCC build triplet: powerpc-linux-gnuspe
GCC host triplet: powerpc-linux-gnuspe
GCC target triplet: powerpc-linux-gnuspe
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44364