On Wed, 20 Oct 2021 at 18:21, Richard Biener <rguent...@suse.de> wrote:
>
> On Wed, 20 Oct 2021, Prathamesh Kulkarni wrote:
>
> > On Tue, 19 Oct 2021 at 16:55, Richard Biener <rguent...@suse.de> wrote:
> > >
> > > On Tue, 19 Oct 2021, Prathamesh Kulkarni wrote:
> > >
> > > > On Tue, 19 Oct 2021 at 13:02, Richard Biener 
> > > > <richard.guent...@gmail.com> wrote:
> > > > >
> > > > > On Tue, Oct 19, 2021 at 9:03 AM Prathamesh Kulkarni via Gcc-patches
> > > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > > >
> > > > > > On Mon, 18 Oct 2021 at 17:23, Richard Biener <rguent...@suse.de> 
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Mon, 18 Oct 2021 at 17:10, Richard Biener 
> > > > > > > > <rguent...@suse.de> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > On Mon, 18 Oct 2021 at 16:18, Richard Biener 
> > > > > > > > > > <rguent...@suse.de> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 18 Oct 2021, Prathamesh Kulkarni wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Richard,
> > > > > > > > > > > > As suggested in PR, I have attached WIP patch that adds 
> > > > > > > > > > > > two patterns
> > > > > > > > > > > > to match.pd:
> > > > > > > > > > > > erfc(x) --> 1 - erf(x) if canonicalize_math_p() and,
> > > > > > > > > > > > 1 - erf(x) --> erfc(x) if !canonicalize_math_p().
> > > > > > > > > > > >
> > > > > > > > > > > > This works to remove call to erfc for the following 
> > > > > > > > > > > > test:
> > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > {
> > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > >
> > > > > > > > > > > >   double t1 = __builtin_erf (x);
> > > > > > > > > > > >   double t2 = __builtin_erfc (x);
> > > > > > > > > > > >   return g(t1, t2);
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > with .optimized dump shows:
> > > > > > > > > > > >   t1_2 = __builtin_erf (x_1(D));
> > > > > > > > > > > >   t2_3 = 1.0e+0 - t1_2;
> > > > > > > > > > > >
> > > > > > > > > > > > However, for the following test:
> > > > > > > > > > > > double f(double x)
> > > > > > > > > > > > {
> > > > > > > > > > > >   double g(double, double);
> > > > > > > > > > > >
> > > > > > > > > > > >   double t1 = __builtin_erfc (x);
> > > > > > > > > > > >   return t1;
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > It canonicalizes erfc(x) to 1 - erf(x), but does not 
> > > > > > > > > > > > transform 1 -
> > > > > > > > > > > > erf(x) to erfc(x) again
> > > > > > > > > > > > post canonicalization.
> > > > > > > > > > > > -fdump-tree-folding shows that 1 - erf(x) --> erfc(x) 
> > > > > > > > > > > > gets applied,
> > > > > > > > > > > > but then it tries to
> > > > > > > > > > > > resimplify erfc(x), which fails post canonicalization. 
> > > > > > > > > > > > So we end up
> > > > > > > > > > > > with erfc(x) transformed to
> > > > > > > > > > > > 1 - erf(x) in .optimized dump, which I suppose isn't 
> > > > > > > > > > > > ideal.
> > > > > > > > > > > > Could you suggest how to proceed ?
> > > > > > > > > > >
> > > > > > > > > > > I applied your patch manually and it does the intended
> > > > > > > > > > > simplifications so I wonder what I am missing?
> > > > > > > > > > Would it be OK to always fold erfc(x) -> 1 - erf(x) even 
> > > > > > > > > > when there's
> > > > > > > > > > no erf(x) in the source ?
> > > > > > > > >
> > > > > > > > > I do think it's reasonable to expect erfc to be available 
> > > > > > > > > when erf
> > > > > > > > > is and vice versa but note both are C99 specified functions 
> > > > > > > > > (either
> > > > > > > > > requires -lm).
> > > > > > > > OK, thanks. Would it be OK to commit the patch after 
> > > > > > > > bootstrap+test ?
> > > > > > >
> > > > > > > Yes, but I'm confused because you say the patch doesn't work for 
> > > > > > > you?
> > > > > > The patch works for me to CSE erf/erfc pair.
> > > > > > However when there's only erfc in the source, it canonicalizes 
> > > > > > erfc(x)
> > > > > > to 1 - erf(x) but later fails to uncanonicalize 1 - erf(x) back to
> > > > > > erfc(x)
> > > > > > with -O3 -funsafe-math-optimizations.
> > > > > >
> > > > > > For,
> > > > > > t1 = __builtin_erfc(x),
> > > > > >
> > > > > > .optimized dump shows:
> > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > >
> > > > > > and for,
> > > > > > double t1 = x + __builtin_erfc(x);
> > > > > >
> > > > > > .optimized dump shows:
> > > > > >   _3 = __builtin_erf (x_2(D));
> > > > > >   _7 = x_2(D) + 1.0e+0;
> > > > > >   t1_4 = _7 - _3;
> > > > > >
> > > > > > I assume in both cases, we want erfc in the code-gen instead ?
> > > > > > I think the reason uncaonicalization fails is because the pattern 1 
> > > > > > -
> > > > > > erf(x) to erfc(x)
> > > > > > gets applied, but then it fails in resimplifying erfc(x), and we end
> > > > > > up with 1 - erf(x) in code-gen.
> > > > > >
> > > > > > From gimple-match.c, it hits the simplification:
> > > > > >
> > > > > >                                 gimple_seq *lseq = seq;
> > > > > >                                 if (__builtin_expect (!dbg_cnt
> > > > > > (match), 0)) goto next_after_fail1172;
> > > > > >                                 if (__builtin_expect (dump_file &&
> > > > > > (dump_flags & TDF_FOLDING), 0)) fprintf (dump_file, "Applying 
> > > > > > pattern
> > > > > > %s:%d, %s:%d\n", "match.pd", 6162, __FILE__, __LINE__);
> > > > > >                                 {
> > > > > >                                   res_op->set_op 
> > > > > > (CFN_BUILT_IN_ERFC, type, 1);
> > > > > >                                   res_op->ops[0] = captures[0];
> > > > > >                                   res_op->resimplify (lseq, 
> > > > > > valueize);
> > > > > >                                   return true;
> > > > > >                                 }
> > > > > >
> > > > > > But res_op->resimplify returns false, and doesn't end up adding to 
> > > > > > lseq.
> > > > >
> > > > > There's nothing to add to lseq since there's also nothing to 
> > > > > resimplify.
> > > > > The only thing that could happen is that the replacement is not done
> > > > > because replace_stmt_with_simplification via maybe_push_res_to_seq
> > > > > doesn't pass the builtin_decl_implicit test:
> > > > >
> > > > >           /* Find the function we want to call.  */
> > > > >           tree decl = builtin_decl_implicit (as_builtin_fn (fn));
> > > > >           if (!decl)
> > > > >             return NULL;
> > > > >
> > > > > btw, it did work for me since the call was present before and 
> > > > > gimplification
> > > > > should then mark the function eligible for implicit generation.
> > > > >
> > > > > > As you suggest, should we instead handle this in fre to transform
> > > > > > erfc(x) to 1 - erf(x), only when
> > > > > > there's a matching erf(x) in the source ?
> > > > >
> > > > > Note that's strictly less powerful and we'd have to handle erf(x) -> 
> > > > > 1 +erfc (x)
> > > > > to handle CSE in
> > > > >
> > > > >   tem = erfc (x);
> > > > >   tem2 = erf (x);
> > > > >
> > > > > So no, I think the canonicalization is fine unless there's a 
> > > > > compelling reason
> > > > > for having both erfc and erf.
> > > > >
> > > > > Can you debug why the reverse transform doesn't work for you?
> > > > It seems the issue was that erfc wasn't getting marked with const
> > > > attribute, and failed the following test in
> > > > maybe_push_res_to_seq:
> > > >           /* We can't and should not emit calls to non-const functions. 
> > > >  */
> > > >           if (!(flags_from_decl_or_type (decl) & ECF_CONST))
> > > >             return NULL;
> > > >
> > > > Passing -fno-math-errno seems to work for the reverse transform:
> > > >
> > > > double f(double x)
> > > > {
> > > >   double g(double, double);
> > > >
> > > >   double t1 = __builtin_erfc (x);
> > > >   return t1;
> > > > }
> > > >
> > > > Compiling with -O3 -funsafe-math-optimizations -fno-math-errno:
> > > >
> > > > vrp2 dump shows:
> > > > Folding statement: _2 = __builtin_erf (x_1(D));
> > > > Not folded
> > > > Folding statement: t1_3 = 1.0e+0 - _2;
> > > > Applying pattern match.pd:6162, gimple-match.c:68450
> > > > gimple_simplified to t1_3 = __builtin_erfc (x_1(D));
> > > > Folded into: t1_3 = __builtin_erfc (x_1(D));
> > > >
> > > > and .optimized dump shows:
> > > > double f (double x)
> > > > {
> > > >   double t1;
> > > >
> > > >   <bb 2> [local count: 1073741824]:
> > > >   t1_3 = __builtin_erfc (x_1(D)); [tail call]
> > > >   return t1_3;
> > > > }
> > > >
> > > > Unfortunately, for the test-case involving erf/erfc pair, the reverse
> > > > transform seems to undo the CSE:
> > > >
> > > > double f(double x)
> > > > {
> > > >   double g(double, double);
> > > >
> > > >   double t1 = __builtin_erf (x);
> > > >   double t2 = __builtin_erfc (x);
> > > >   return g(t1, t2);
> > > > }
> > > >
> > > > gimplification turns erfc to 1 - erf:
> > > > Applying pattern match.pd:6158, gimple-match.c:44479
> > > > gimple_simplified to D.1988 = __builtin_erf (x);
> > > > t2 = 1.0e+0 - D.1988;
> > > >
> > > >   t1 = __builtin_erf (x);
> > > >   D.1988 = __builtin_erf (x);
> > > >   t2 = 1.0e+0 - D.1988;
> > > >   D.1987 = g (t1, t2);
> > > >
> > > > fre1 does the CSE:
> > > >   t1_2 = __builtin_erf (x_1(D));
> > > >   t2_4 = 1.0e+0 - t1_2;
> > > >   _7 = g (t1_2, t2_4);
> > > >
> > > > and forwprop4 again converts 1 - erf(x) to erfc(x), "undoing" the CSE:
> > > > Applying pattern match.pd:6162, gimple-match.c:68450
> > > > gimple_simplified to t2_3 = __builtin_erfc (x_1(D));
> > > >
> > > >   t1_2 = __builtin_erf (x_1(D));
> > > >   t2_3 = __builtin_erfc (x_1(D));
> > > >   _6 = g (t1_2, t2_3);
> > > >
> > > > and .optimized dump shows:
> > > >   t1_2 = __builtin_erf (x_1(D));
> > > >   t2_3 = __builtin_erfc (x_1(D));
> > > >   _6 = g (t1_2, t2_3); [tail call]
> > >
> > > You probably want an explicit && single_use () check on the
> > > 1 - erf() -> erfc transform.
> > single_use worked, thanks!
> > As you pointed out, we reassociate x + erfc(x) to (x + 1) - erf(x) and
> > don't uncanonicalize back.
> > I added another pattern to reassociate (x + 1) - erf(x) to x + (1 -
> > erf(x)), after which,
> > it gets uncanonicalized back to x + erfc(x) in .optimized dump.
>
> I think that's not the way to "fix" this case, instead if we'd care
> we should change reassoc to either associate a -erf() operand and
> a 1 next to each other or recognize it there.  But I think that
> can be done as followup.
>
> > Does the attached patch look OK after bootstrap+test ?
>
> It's OK with removing the extra (x+1) - erf pattern and the
> associated testcase.
Thanks. It regressed builtin-nonneg-1.c because for following case:

void test(double d)
{
  if (signbit (__builtin_erfc (d)))
    link_failure_erfc ();
}

before patch signbit (erfc (d)) was transformed to 0, since erfc
returns a non-negative result.
However after patch, erfc (d), transformed to 1 - erf(d), and in
optimized dump we get:

  <bb 2> [local count: 1073741824]:
  _3 = __builtin_erf (d_2(D));
  if (_3 > 1.0e+0)
    goto <bb 3>; [41.48%]
  else
    goto <bb 4>; [58.52%]

  <bb 3> [local count: 445388112]:
  link_failure_erfc (); [tail call]

  <bb 4> [local count: 1073741824]:
  return;

The attached patch adds transform erf(x) > 1 --> 0 which removes the
redundant test and call to link_failure_erfc.
I assume the transform is OK since erf's range is [-1, 1] ?

Thanks,
Prathamesh
>
> Richard.
>
> > Thanks,
> > Prathamesh
> > >
> > > > Thanks,
> > > > Prathamesh
> > > > >
> > > > > Richard.
> > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > > >
> > > > > > > Btw, please add the testcase from the PR and also a testcase that 
> > > > > > > shows
> > > > > > > the canonicalization is undone.  Maybe you can also double-check 
> > > > > > > that
> > > > > > > we handle x + erfc (x) because I see we associate that as
> > > > > > > (x + 1) - erf (x) which is then not recognized back to erfc.
> > > > > > >
> > > > > > > The less surprising (as to preserve the function called in the 
> > > > > > > source)
> > > > > > > variant for the PR would be to teach CSE to lookup erf(x) when
> > > > > > > visiting erfc(x) and when found synthesize 1 - erf(x).
> > > > > > >
> > > > > > > That said, a mathematician should chime in on how important it is
> > > > > > > to preserve erfc vs. erf (precision or even speed).
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Richard.
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Prathamesh
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Richard.
> > > > > > > > >
> > > > > > > > > > So for the following test:
> > > > > > > > > > double f(double x)
> > > > > > > > > > {
> > > > > > > > > >   t1 = __builtin_erfc(x)
> > > > > > > > > >   return t1;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > .optimized dump shows:
> > > > > > > > > > double f (double x)
> > > > > > > > > > {
> > > > > > > > > >   double t1;
> > > > > > > > > >   double _2;
> > > > > > > > > >
> > > > > > > > > >   <bb 2> [local count: 1073741824]:
> > > > > > > > > >   _2 = __builtin_erf (x_1(D));
> > > > > > > > > >   t1_3 = 1.0e+0 - _2;
> > > > > > > > > >   return t1_3;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > while before patch, it has:
> > > > > > > > > >   t1_4 = __builtin_erfc (x_2(D)); [tail call]
> > > > > > > > > >   return t1_4;
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Prathamesh
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Richard.
> > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Prathamesh
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Richard Biener <rguent...@suse.de>
> > > > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 
> > > > > > > > > > > 90409 Nuernberg,
> > > > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Richard Biener <rguent...@suse.de>
> > > > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 
> > > > > > > > > Nuernberg,
> > > > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Richard Biener <rguent...@suse.de>
> > > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 
> > > > > > > Nuernberg,
> > > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguent...@suse.de>
> > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> >
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
diff --git a/gcc/match.pd b/gcc/match.pd
index a9791ceb74a..80472631e00 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6147,6 +6147,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (floors tree_expr_nonnegative_p@0)
    (truncs @0))))
 
+/* Simplify,
+   erfc(x) -> 1 - erf(x) if canonicalize_math_p().
+   1 - erf(x) -> erfc(x) if !canonicalize_math_p().  */
+
+(if (flag_unsafe_math_optimizations)
+ (simplify
+  (ERFC @0)
+   (if (canonicalize_math_p ())
+    (minus { build_one_cst (TREE_TYPE (@0)); } (ERF @0))))
+ (simplify
+  (minus real_onep (ERF@1 @0))
+  (if (!canonicalize_math_p () && single_use (@1))
+   (ERFC @0)))
+ /* Simplify erf(x) > 1 to 0.  */
+ (simplify
+  (gt (ERF @0) real_onep)
+  { boolean_false_node; }))
+
 (match double_value_p
  @0
  (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == double_type_node)))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c
new file mode 100644
index 00000000000..a1d456c0119
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized -funsafe-math-optimizations 
-fno-math-errno" } */
+
+double f(double x)
+{
+  double g (double, double);
+  double t1 = __builtin_erf (x);
+  double t2 = __builtin_erfc (x);
+  return g (t1, t2);
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin_erfc" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c
new file mode 100644
index 00000000000..a55825bca47
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83750-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized -funsafe-math-optimizations 
-fno-math-errno" } */
+
+double f(double x)
+{
+  return __builtin_erfc (x); 
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin_erf \\(" "optimized"} } */

Reply via email to