https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116891

--- Comment #6 from Walter Mascarenhas <walter.mascarenhas at gmail dot com> ---
Hi Andrew,

   The proper optimization in this case would be to use the
vfnmsub132pd instruction followed by a change of sign, so that the
negation is applied only after the FMA has rounded. It could be
something like:

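fma_ru:
# xmm0 = -(xmm0 * xmm1) - xmm2, rounded once
vfnmsub132pd xmm0, xmm2, xmm1
# load the sign-bit mask 0x8000000000000000 into both lanes
vmovddup xmm1, QWORD PTR .LC1[rip]
# flip the sign bit: exact, no second rounding
vxorpd xmm0, xmm0, xmm1
ret
.LC1:
# low word 0, high word 0x80000000
.long 0
.long -2147483648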



On Mon, Sep 30, 2024 at 3:30 PM pinskia at gcc dot gnu.org <
gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116891
>
> Andrew Pinski <pinskia at gcc dot gnu.org> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>             Summary|[12/13/14/15 Regression]    |[12/13/14/15 Regression]
>                    |invalid optimization of     |invalid optimization of
>                    |-fma(-x,y,-z) when -O3 and  |-fma(-x,x,-z) when -O3 and
>                    |-frounding-math are used    |-frounding-math are used
>
> --- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> Simplified testcase:
> ```
> double f(double ae, double ax)
> {
>   return -__builtin_fma( -ax, ax, -ae );
> }
> ```
>
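> That is, pushing the outer negate into the FMA is not valid: the
> multiply and add are done in infinite precision and rounded only once,
> so under a directed rounding mode negating the result is not the same
> as negating the operands.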
>
> --
> You are receiving this mail because:
> You reported the bug.
