On Fri, Sep 20, 2024 at 06:39:52PM -0400, Jason Merrill wrote:
> On 9/20/24 12:18 AM, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > This PR reports a missed optimization.  When we have:
> > 
> >    Str str{"Test"};
> >    callback(str);
> > 
> > as in the test, we're able to evaluate the Str::Str() call at compile
> > time.  But when we have:
> > 
> >    callback(Str{"Test"});
> > 
> > we are not.  With this patch (in fact, it's Patrick's patch with a little
> > tweak), we turn
> > 
> >    callback (TARGET_EXPR <D.2890, <<< Unknown tree: aggr_init_expr
> >      5
> >      __ct_comp
> >      D.2890
> >      (struct Str *) <<< Unknown tree: void_cst >>>
> >      (const char *) "Test" >>>>)
> > 
> > into
> > 
> >    callback (TARGET_EXPR <D.2890, {.str=(const char *) "Test", .length=4}>)
> > 
> > I explored the idea of calling maybe_constant_value for the whole
> > TARGET_EXPR in cp_fold.  That has three problems:
> > - we can't always elide a TARGET_EXPR, so we'd have to make sure the
> >    result is also a TARGET_EXPR;
> 
> I'd think that the result should always be a TARGET_EXPR for a class, and
> that's the case we want to fold; a TARGET_EXPR for a scalar is always the
> initialize-temp-and-use-it pattern you mention below.

Checking CLASS_TYPE_P would solve some of the problems, yes.  But...
 
> > - the resulting TARGET_EXPR must have the same flags, otherwise Bad
> >    Things happen;
> 
> I guess maybe_constant_value should be fixed to preserve flags regardless of
> this change.

Yeah, cxx_eval_outermost_constant_expr already preserves TARGET_EXPR flags,
but here we go into the break_out_target_exprs block in maybe_constant_value
and that doesn't necessarily preserve them.

> > - getting a new slot is also problematic.  I've seen a test where we
> >    had "TARGET_EXPR<D.2680, ...>, D.2680", and folding the whole TARGET_EXPR
> >    would get us "TARGET_EXPR<D.2681, ...>", but since we don't see the outer
> >    D.2680, we can't replace it with D.2681, and things break.
> 
> Hmm, yeah.  Maybe only if TARGET_EXPR_IMPLICIT_P?

...unfortunately that doesn't always help.  I've reduced an example into:

  struct optional {
    constexpr optional(int) {}
  };
  optional foo() { return 2; }

where check_return_expr creates a COMPOUND_EXPR:

        retval = build2 (COMPOUND_EXPR, TREE_TYPE (retval), retval,
                         TREE_OPERAND (retval, 0));

where the TARGET_EXPR comes from build_cplus_new so it is _IMPLICIT_P.
 
> > With this patch, two tree-ssa tests regressed: pr78687.C and pr90883.C.
> > 
> > FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted redundant 
> > store: .*.a = {}"
> > is easy.  Previously, we would call C::C, so .gimple has:
> > 
> >    D.2590 = {};
> >    C::C (&D.2590);
> >    D.2597 = D.2590;
> >    return D.2597;
> > 
> > Then .einline inlines the C::C call:
> > 
> >    D.2590 = {};
> >    D.2590.a = {}; // #1
> >    D.2590.b = 0;  // #2
> >    D.2597 = D.2590;
> >    D.2590 ={v} {CLOBBER(eos)};
> >    return D.2597;
> > 
> > then #2 is removed in .fre1, and #1 is removed in .dse1.  So the test
> > passes.  But with the patch, .gimple won't have that C::C call, so the
> > IL is of course going to look different.
> 
> Maybe -fno-inline instead of the --param?

Then that C::C call isn't inlined and the test fails :/.
 
> > Unfortunately, pr78687.C is much more complicated and I can't explain
> > precisely what happens there.  But it seems like a good idea to have
> > a way to avoid this optimization.  So I've added the "noinline" check.
> 
> Hmm, I'm surprised make_object_1 would be affected, since the ref_proxy
> constructors are not constexpr.  And I wouldn't expect the optimization to
> affect the value-initialization option_2().

In pr78687.C we do this new optimization only once for
"constexpr eggs::variants::variant<Ts>::variant(U&&) noexcept 
(std::is_nothrow_constructible<T, U&&>::value)".
 
> >     PR c++/116416
> > 
> > gcc/cp/ChangeLog:
> > 
> >     * cp-gimplify.cc (cp_fold_r) <case TARGET_EXPR>: Try to fold
> >     TARGET_EXPR_INITIAL and replace it with the folded result if
> >     it's TREE_CONSTANT.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >     * g++.dg/analyzer/pr97116.C: Adjust dg-message.
> >     * g++.dg/cpp2a/consteval-prop2.C: Adjust dg-bogus.
> >     * g++.dg/tree-ssa/pr78687.C: Add __attribute__((noinline)).
> >     * g++.dg/tree-ssa/pr90883.C: Likewise.
> >     * g++.dg/cpp1y/constexpr-prvalue1.C: New test.
> > 
> > Co-authored-by: Patrick Palka <ppa...@redhat.com>
> > ---
> >   gcc/cp/cp-gimplify.cc                         | 14 +++++++++
> >   gcc/testsuite/g++.dg/analyzer/pr97116.C       |  2 +-
> >   .../g++.dg/cpp1y/constexpr-prvalue1.C         | 29 +++++++++++++++++++
> >   gcc/testsuite/g++.dg/cpp2a/consteval-prop2.C  |  2 +-
> >   gcc/testsuite/g++.dg/tree-ssa/pr78687.C       |  5 +++-
> >   gcc/testsuite/g++.dg/tree-ssa/pr90883.C       |  1 +
> >   6 files changed, 50 insertions(+), 3 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-prvalue1.C
> > 
> > diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
> > index 003e68f1ea7..41d6333f650 100644
> > --- a/gcc/cp/cp-gimplify.cc
> > +++ b/gcc/cp/cp-gimplify.cc
> > @@ -1473,6 +1473,20 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void 
> > *data_)
> >      that case, strip it in favor of this one.  */
> >         if (tree &init = TARGET_EXPR_INITIAL (stmt))
> >     {
> > +     tree fn;
> > +     if ((data->flags & ff_genericize)
> > +         /* Give the user an option to opt out.  */
> > +         && !((fn = current_function_decl)
> > +              && lookup_attribute ("noinline",
> > +                                   DECL_ATTRIBUTES (fn))))
> 
> This looks backwards from the usual sense of noinline, which suppresses
> inlining of the function so marked, rather than inlining of other functions
> called.  If you want to check noinline, you need to dig out the called
> function (if any), perhaps with extract_call_expr.

Now I wish I hadn't done that.  But for implicit functions, like in
pr90883.C where we have

    class C
    {
        char a[7]{};
        int b{};
    };
 
there's nothing to make noinline anyway.  So I'm stuck.  I suppose
it wouldn't make sense to use -fno-fold-simple-inlines to disable
this optimization.  Can I abuse -fearly-inlining?  I don't want a new
flag for this :(.

Or should we just adjust the tests?

> This should also share conditions with the code that calls
> maybe_constant_value for CALL_EXPR (flag_no_inline, in particular).

Done.  So I have 

--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1473,6 +1473,16 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_)
     that case, strip it in favor of this one.  */
       if (tree &init = TARGET_EXPR_INITIAL (stmt))
    {
+     if ((data->flags & ff_genericize)
+         && !flag_no_inline)
+       {
+         tree folded = maybe_constant_init (init, TARGET_EXPR_SLOT (stmt));
+         if (folded != init && TREE_CONSTANT (folded))
+       {
+         init = folded;
+         break;
+       }
+       }
      cp_walk_tree (&init, cp_fold_r, data, NULL);
      cp_walk_tree (&TARGET_EXPR_CLEANUP (stmt), cp_fold_r, data, NULL);
      *walk_subtrees = 0;

But there are still these two fails as mentioned above:

FAIL: g++.dg/tree-ssa/pr78687.C   scan-tree-dump sra "Removing load:.*ptr;"
FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted redundant store: 
.*.a = {}"

Reply via email to