On aarch64 target the result are 1.332268e-17. On x86 with fma target the result are also 1.332268e-17.
so, I don't think the Loongson's madd.fmt/msub.fmt is incorrect. We should do something for usage of fused madd, the all things has been tested an fedora21 remix for loongson(1). 1, gcc:add loongson to the list of targets with a fused implementation. 2, glibc:__fma() function,we should use madd.fmt like aarch64. see patch: http://www.loongnix.org/cgit/glibc-2.20/commit/?id=14023742e6ef571b61439d0d7bb7939e663fe624 3, kernel: the emulation when a float exception taken. (1),http://www.loongnix.org/index.php/Loongnix-20161130%E7%89%88%E6%9C%AC%E5%9C%A8Fedora21_remix%E7%B3%BB%E7%BB%9F%E4%B8%8A%E5%8F%91%E5%B8%83 Thanks, Paul On Fri, Dec 23, 2016 at 12:38 AM, Yunqiang Su <yunqiang...@imgtec.com> wrote: > >> 在 2016年12月23日,00:18,Richard Sandiford <rdsandif...@googlemail.com> 写道: >> >> Yunqiang Su <yunqiang...@imgtec.com> writes: >>>> 在 2016年12月22日,23:48,Yunqiang Su <yunqiang...@imgtec.com> 写道: >>>> >>>>> >>>>> 在 2016年12月22日,23:31,Richard Sandiford >>>>> <rdsandif...@googlemail.com> 写道: >>>>> >>>>> Matthew Fortune <matthew.fort...@imgtec.com> writes: >>>>>> Sandra Loosemore <san...@codesourcery.com> writes: >>>>>>> On 12/21/2016 11:54 AM, Yunqiang Su wrote: >>>>>>>> By this patch, I add a build-time option ` >>>>>>>> --with-unfused-madd4=yes/no', >>>>>>>> and runtime option -m(no-)unfused-madd4, >>>>>>>> to disable generate madd.fmt instructions. >>>>>>> >>>>>>> Your patch also needs a documentation change so that the new >>>>>>> command-line option is listed in the GCC manual with other MIPS target >>>>>>> options. >>>>>> >>>>>> Any opinions on option names to control this? Is it best to target >>>>>> the specific >>>>>> feature that is non-compliant on loongson or apply a general >>>>>> -mfix-loongson >>>>>> type option? >>>>>> >>>>>> I'm not sure I have a strong opinion either way but there do seem to be >>>>>> multiple possible variants. >>>>> >>>>> Wasn't sure from this thread whether Loongson simply had a fused >>>>> implementation (without intermediate rounding) or whether the >>>>> instructions gave numerically incorrect results for some inputs. >>>> >>>> I test to define ISA_HAS_FUSED_MADD4 true and >>>> define ISA_HAS_UNFUSED_MADD4 false, and try to build a test case. >>>> With ISA_HAS_FUSED_MADD4, the result is about 1e-17, >>>> and with ISA_HAS_UNFUSED_MADD4, the result is about 1e-17, >>>> both of the are incorrect (the expect value is 0). >>>> >>>> The test case is >>>> >>>> #include <stdio.h> >>>> >>>> double a = 0.6; >>>> double b = 0.4; >>>> double c = 0.6; >>>> double d = 0.4; >>>> >>>> int main(void) >>>> { >>>> double x = a * b - c * d; >>>> printf("%le\n", x); >>>> return 0; >>>> } >>>> >>>> >>>>> It sounds from a later thread like it's generating incorrect results, >>>>> is that right? If so, then FWIW I agree an -mfix option would be more >>>>> consistent. E.g. one of the -mfix-vr4120 errata was an incorrect >>>>> integer division result and one of the -mfix-sb1 errata was an incorrect >>>>> single-precision float division result. The latter case could have been >>>>> handled by an option to disable DIV.S and DIV.PS, but the -mfix option >>>>> gave more control. >>>>> >>>>> If instead the problem is that the instructions are fused then that's >>>>> also what the original MIPS 4 parts did, so maybe an option to control >>>>> fusedness would make sense. >>>> >>>> The result to thread it fused or unfused, is different, while neither of >>>> them >>>> is correct. >>> >>> ohh, the result are same, and neither is correct. >>> both of them are 1.332268e-17. >> >> That's the expected result for an implementation in which the subtraction >> is fused with the first multiplication without intermediate rounding. >> The second 0.4 * 0.6 isn't exactly representable and rounds down, to a >> value slightly less than 0.24. Then the fused operation subtracts this >> value from the exact result of the first 0.4 * 0.6 (0.24), giving a >> value slightly greater than 0. > > I see. But I think 1.332268e-17 is too big than expected. > 1-e17 for double is totally unacceptable. > >> >> I see Paul Hua's patch does add Loongson to the list of targets >> with a fused implementation (should have checked earlier, sorry). >> So I think after that patch we would do the right thing. (In particular, >> -ffp-contract=off would then disable the fusing.) > > will -ffp-contract=off disable some other optimization? > If so, I don’t think that will be an ideal choice for distributions, like > Debian. > >> >> Thanks, >> Richard >