Re: [match.pd] PR83750 - CSE erf/erfc pair

Richard Biener via Gcc-patches Fri, 22 Oct 2021 02:26:36 -0700

On Fri, 22 Oct 2021, Prathamesh Kulkarni wrote:

> On Wed, 20 Oct 2021 at 18:21, Richard Biener <rguent...@suse.de> wrote:
> >
> > On Wed, 20 Oct 2021, Prathamesh Kulkarni wrote:
> >
> > > On Tue, 19 Oct 2021 at 16:55, Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Tue, 19 Oct 2021 at 13:02, Richard Biener 
> > > > > <richard.guent...@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> > > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > > > >
> > > > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener <rguent...@suse.de> 
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener 
> > > > > > > > > <rguent...@suse.de> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > > > > > <rguent...@suse.de> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > > As suggested in PR, I have attached WIP patch that 
> > > > > > > > > > > > > adds two patterns
> > > > > > > > > > > > > to match.pd:
> > > > > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > > > > >
> > > > > > > > > > > > > This works to remove call to erfc for the following 
> > > > > > > > > > > > > test:
> > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > {
> > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > >
> > > > > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > with .optimized dump shows:
> > > > > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, for the following test:
> > > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > > {
> > > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > > >
> > > > > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > > > > >   return t1;
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not 
> > > > > > > > > > > > > transform 1 -
> > > > > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > > > > post canonicalization.
> > > > > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) 
> > > > > > > > > > > > > gets applied,
> > > > > > > > > > > > > but then it tries to
> > > > > > > > > > > > > resimplify erfc(x), which fails post 
> > > > > > > > > > > > > canonicalization. So we end up
> > > > > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't 
> > > > > > > > > > > > > ideal.
> > > > > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > > > > >
> > > > > > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even 
> > > > > > > > > > > when there's
> > > > > > > > > > > no erf(x) in the source ?
> > > > > > > > > >
> > > > > > > > > > I do think it's reasonable to expect erfc to be available 
> > > > > > > > > > when erf
> > > > > > > > > > is and vice versa but note both are C99 specified functions 
> > > > > > > > > > (either
> > > > > > > > > > requires -lm).
> > > > > > > > > OK, thanks. Would it be OK to commit the patch after 
> > > > > > > > > bootstrap+test ?
> > > > > > > >
> > > > > > > > Yes, but I'm confused because you say the patch doesn't work 
> > > > > > > > for you?
> > > > > > > The patch works for me to CSE erf/erfc pair.
> > > > > > > However when there's only erfc in the source, it canonicalizes 
> > > > > > > erfc(x)
> > > > > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > > > > > > erfc(x)
> > > > > > > with -O3 -funsafe-math-optimizations.
> > > > > > >
> > > > > > > For,
> > > > > > > t1 = __builtin_erfc(x),
> > > > > > >
> > > > > > > .optimized dump shows:
> > > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > > >
> > > > > > > and for,
> > > > > > > double t1 = x + __builtin_erfc(x);
> > > > > > >
> > > > > > > .optimized dump shows:
> > > > > > >   _3 = __builtin_erf (x_2(D));
> > > > > > >   _7 = x_2(D) + 1.0e+0;
> > > > > > >   t1_4 = _7 - _3;
> > > > > > >
> > > > > > > I assume in both cases, we want erfc in the code-gen instead ?
> > > > > > > I think the reason uncaonicalization fails is because the pattern 
> > > > > > > 1 -
> > > > > > > erf(x) to erfc(x)
> > > > > > > gets applied, but then it fails in resimplifying erfc(x), and we 
> > > > > > > end
> > > > > > > up with 1 - erf(x) in code-gen.
> > > > > > >
> > > > > > > From gimple-match.c, it hits the simplification:
> > > > > > >
> > > > > > >                                 gimple_seq *lseq = seq;
> > > > > > >                                 if (__builtin_expect (!dbg_cnt
> > > > > > > (match), 0)) goto next_after_fail1172;
> > > > > > >                                 if (__builtin_expect (dump_file &&
> > > > > > > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying 
> > > > > > > pattern
> > > > > > > %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
> > > > > > >                                 {
> > > > > > >                                   res_op->set_op 
> > > > > > > (CFN_BUILT_IN_ERFC, type, 1);
> > > > > > >                                   res_op->ops[0] = captures[0];
> > > > > > >                                   res_op->resimplify (lseq, 
> > > > > > > valueize);
> > > > > > >                                   return true;
> > > > > > >                                 }
> > > > > > >
> > > > > > > But res_op->resimplify returns false, and doesn't end up adding 
> > > > > > > to lseq.
> > > > > >
> > > > > > There's nothing to add to lseq since there's also nothing to 
> > > > > > resimplify.
> > > > > > The only thing that could happen is that the replacement is not done
> > > > > > because replace_stmt_with_simplification via maybe_push_res_to_seq
> > > > > > doesn't pass the builtin_decl_implicit test:
> > > > > >
> > > > > >           /* Find the function we want to call.  */
> > > > > >           tree decl = builtin_decl_implicit (as_builtin_fn (fn));
> > > > > >           if (!decl)
> > > > > >             return NULL;
> > > > > >
> > > > > > btw, it did work for me since the call was present before and 
> > > > > > gimplification
> > > > > > should then mark the function eligible for implicit generation.
> > > > > >
> > > > > > > As you suggest, should we instead handle this in fre to transform
> > > > > > > erfc(x) to 1 - erf(x), only when
> > > > > > > there's a matching erf(x) in the source ?
> > > > > >
> > > > > > Note that's strictly less powerful and we'd have to handle erf(x) 
> > > > > > -> 1 +erfc (x)
> > > > > > to handle CSE in
> > > > > >
> > > > > >   tem = erfc (x);
> > > > > >   tem2 = erf (x);
> > > > > >
> > > > > > So no, I think the canonicalization is fine unless there's a 
> > > > > > compelling reason
> > > > > > for having both erfc and erf.
> > > > > >
> > > > > > Can you debug why the reverse transform doesn't work for you?
> > > > > It seems the issue was that erfc wasn't getting marked with const
> > > > > attribute, and failed the following test in
> > > > > maybe_push_res_to_seq:
> > > > >           /* We can't and should not emit calls to non-const 
> > > > > functions.  */
> > > > >           if (!(flags_from_decl_or_type (decl) & ECF_CONST))
> > > > >             return NULL;
> > > > >
> > > > > Passing -fno-math-errno seems to work for the reverse transform:
> > > > >
> > > > > double f(double x)
> > > > > {
> > > > >   double g(double, double);
> > > > >
> > > > >   double t1 = __builtin_erfc (x);
> > > > >   return t1;
> > > > > }
> > > > >
> > > > > Compiling with -O3 -funsafe-math-optimizations -fno-math-errno:
> > > > >
> > > > > vrp2 dump shows:
> > > > > Folding statement: _2 = __builtin_erf (x_1(D));
> > > > > Not folded
> > > > > Folding statement: t1_3 = 1.0e+0 - _2;
> > > > > Applying pattern match.pd:6162, gimple-match.c:68450
> > > > > gimple_simplified to t1_3 = __builtin_erfc (x_1(D));
> > > > > Folded into: t1_3 = __builtin_erfc (x_1(D));
> > > > >
> > > > > and .optimized dump shows:
> > > > > double f (double x)
> > > > > {
> > > > >   double t1;
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   t1_3 = __builtin_erfc (x_1(D)); [tail call]
> > > > >   return t1_3;
> > > > > }
> > > > >
> > > > > Unfortunately, for the test-case involving erf/erfc pair, the reverse
> > > > > transform seems to undo the CSE:
> > > > >
> > > > > double f(double x)
> > > > > {
> > > > >   double g(double, double);
> > > > >
> > > > >   double t1 = __builtin_erf (x);
> > > > >   double t2 = __builtin_erfc (x);
> > > > >   return g(t1, t2);
> > > > > }
> > > > >
> > > > > gimplification turns erfc to 1 - erf:
> > > > > Applying pattern match.pd:6158, gimple-match.c:44479
> > > > > gimple_simplified to D.1988 = __builtin_erf (x);
> > > > > t2 = 1.0e+0 - D.1988;
> > > > >
> > > > >   t1 = __builtin_erf (x);
> > > > >   D.1988 = __builtin_erf (x);
> > > > >   t2 = 1.0e+0 - D.1988;
> > > > >   D.1987 = g (t1, t2);
> > > > >
> > > > > fre1 does the CSE:
> > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > >   t2_4 = 1.0e+0 - t1_2;
> > > > >   _7 = g (t1_2, t2_4);
> > > > >
> > > > > and forwprop4 again converts 1 - erf(x) to erfc(x), "undoing" the CSE:
> > > > > Applying pattern match.pd:6162, gimple-match.c:68450
> > > > > gimple_simplified to t2_3 = __builtin_erfc (x_1(D));
> > > > >
> > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > >   t2_3 = __builtin_erfc (x_1(D));
> > > > >   _6 = g (t1_2, t2_3);
> > > > >
> > > > > and .optimized dump shows:
> > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > >   t2_3 = __builtin_erfc (x_1(D));
> > > > >   _6 = g (t1_2, t2_3); [tail call]
> > > >
> > > > You probably want an explicit && single_use () check on the
> > > > 1 - erf() -> erfc transform.
> > > single_use worked, thanks!
> > > As you pointed out, we reassociate x + erfc(x) to (x + 1) - erf(x) and
> > > don't uncanonicalize back.
> > > I added another pattern to reassociate (x + 1) - erf(x) to x + (1 -
> > > erf(x)), after which,
> > > it gets uncanonicalized back to x + erfc(x) in .optimized dump.
> >
> > I think that's not the way to "fix" this case, instead if we'd care
> > we should change reassoc to either associate a -erf() operand and
> > a 1 next to each other or recognize it there.  But I think that
> > can be done as followup.
> >
> > > Does the attached patch look OK after bootstrap+test ?
> >
> > It's OK with removing the extra (x+1) - erf pattern and the
> > associated testcase.
> Thanks. It regressed builtin-nonneg-1.c because for following case:
> 
> void test(double d)
> {
>   if (signbit (__builtin_erfc (d)))
>     link_failure_erfc ();
> }
> 
> before patch signbit (erfc (d)) was transformed to 0, since erfc
> returns a non-negative result.
> However after patch, erfc (d), transformed to 1 - erf(d), and in
> optimized dump we get:
> 
>   <bb 2> [local count: 1073741824]:
>   _3 = __builtin_erf (d_2(D));
>   if (_3 > 1.0e+0)
>     goto <bb 3>; [41.48%]
>   else
>     goto <bb 4>; [58.52%]
> 
>   <bb 3> [local count: 445388112]:
>   link_failure_erfc (); [tail call]
> 
>   <bb 4> [local count: 1073741824]:
>   return;
> 
> The attached patch adds transform erf(x) > 1 --> 0 which removes the
> redundant test and call to link_failure_erfc.
> I assume the transform is OK since erf's range is [-1, 1] ?


+ /* Simplify erf(x) > 1 to 0.  */
+ (simplify
+  (gt (ERF @0) real_onep)
+  { boolean_false_node; }))

most definitely this is too special - it doesn't work for erf(x) > 1.1
for example.  Ideally we'd have value-ranges on floats and that would
deal with such cases.  Note that erf can return NaN (for NaN input),
in which case gt would raise an exception but unge would not.
So the issue is that folding the gt to false would eventually remove
an exception.  Conditionalizing this on HONOR_NANS (@0) would work
around this.  And of course use sth like

 (gt (ERF @0) REAL_CST@1)
 (if (!HONOR_NANS (@0)
      && real_compare (GE_EXPR, @1, dconst1))
  { boolean_false_node; })

that said, I'm not convinced this is the very best idea - maybe
we should revisit the way do CSE erf and erfc (was there a
real-world usecase for this btw?)

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > > >
> > > > > > Richard.
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > > >
> > > > > > > > Btw, please add the testcase from the PR and also a testcase 
> > > > > > > > that shows
> > > > > > > > the canonicalization is undone.  Maybe you can also 
> > > > > > > > double-check that
> > > > > > > > we handle x + erfc (x) because I see we associate that as
> > > > > > > > (x + 1) - erf (x) which is then not recognized back to erfc.
> > > > > > > >
> > > > > > > > The less surprising (as to preserve the function called in the 
> > > > > > > > source)
> > > > > > > > variant for the PR would be to teach CSE to lookup erf(x) when
> > > > > > > > visiting erfc(x) and when found synthesize 1 - erf(x).
> > > > > > > >
> > > > > > > > That said, a mathematician should chime in on how important it 
> > > > > > > > is
> > > > > > > > to preserve erfc vs. erf (precision or even speed).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Richard.
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Prathamesh
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Richard.
> > > > > > > > > >
> > > > > > > > > > > So for the following test:
> > > > > > > > > > > double f(double x)
> > > > > > > > > > > {
> > > > > > > > > > >   t1 = __builtin_erfc(x)
> > > > > > > > > > >   return t1;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > .optimized dump shows:
> > > > > > > > > > > double f (double x)
> > > > > > > > > > > {
> > > > > > > > > > >   double t1;
> > > > > > > > > > >   double _2;
> > > > > > > > > > >
> > > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > > > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > > > > > > >   return t1_3;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > while before patch, it has:
> > > > > > > > > > >   t1_4 = __builtin_erfc (x_2(D)); [tail call]
> > > > > > > > > > >   return t1_4;
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Prathamesh
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Richard.
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Prathamesh
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Richard Biener <rguent...@suse.de>
> > > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 
> > > > > > > > > > > > 90409 Nuernberg,
> > > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Richard Biener <rguent...@suse.de>
> > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 
> > > > > > > > > > 90409 Nuernberg,
> > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Richard Biener <rguent...@suse.de>
> > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 
> > > > > > > > Nuernberg,
> > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguent...@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [match.pd] PR83750 - CSE erf/erfc pair

Reply via email to