https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583

--- Comment #15 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to rguent...@suse.de from comment #14)
> On Wed, 7 Feb 2024, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
> > 
> > --- Comment #13 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
> > Ok. I found the optimized tree:
> > 
> > 
> >   _5 = 3.33333333333333314829616256247390992939472198486328125e-1 - _4;
> >   _8 = .FMA (_5, 1.229999999999999982236431605997495353221893310546875e-1, _4);
> > 
> > Let CST0 = 3.33333333333333314829616256247390992939472198486328125e-1,
> > CST1 = 1.229999999999999982236431605997495353221893310546875e-1
> > 
> > The expression is equivalent to the following:
> > 
> > _5 = CST0 - _4;
> > _8 = _5 * CST1 + _4;
> > 
> > That is:
> > 
> > _8 = (CST0 - _4) * CST1 + _4;
> > 
> > So we should be able to re-associate it as Clang does:
> > 
> > _8 = CST0 * CST1 - _4 * CST1 + _4; ---> _8 = CST0 * CST1 + _4 * (1 - CST1);
> > 
> > Both CST0 * CST1 and 1 - CST1 can be pre-computed at compile time.
> > 
> > Let CST2 = CST0 * CST1 and CST3 = 1 - CST1; then we can re-associate as
> > Clang does:
> > 
> > _8 = FMA (_4, CST3, CST2).
> > 
> > Any suggestions for this re-association?  Is match.pd the right place to
> > do it?
> 
> You need to look at the IL before we do .FMA forming, specifically 
> before/after the late reassoc pass.  The pass applying match.pd
> patterns everywhere is forwprop.
> 
> I also wonder which compilation flags you are using (note clang
> has different defaults for example for -ftrapping-math)

Both GCC and Clang are using -Ofast -ffast-math.
