On 9/20/24 12:18 AM, Marek Polacek wrote:
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This PR reports a missed optimization.  When we have:

   Str str{"Test"};
   callback(str);

as in the test, we're able to evaluate the Str::Str() call at compile
time.  But when we have:

   callback(Str{"Test"});

we are not.  With this patch (in fact, it's Patrick's patch with a little
tweak), we turn

   callback (TARGET_EXPR <D.2890, <<< Unknown tree: aggr_init_expr
     5
     __ct_comp
     D.2890
     (struct Str *) <<< Unknown tree: void_cst >>>
     (const char *) "Test" >>>>)

into

   callback (TARGET_EXPR <D.2890, {.str=(const char *) "Test", .length=4}>)

I explored the idea of calling maybe_constant_value for the whole
TARGET_EXPR in cp_fold.  That has three problems:
- we can't always elide a TARGET_EXPR, so we'd have to make sure the
   result is also a TARGET_EXPR;

I'd think that the result should always be a TARGET_EXPR for a class, and that's the case we want to fold; a TARGET_EXPR for a scalar is always the initialize-temp-and-use-it pattern you mention below.

- the resulting TARGET_EXPR must have the same flags, otherwise Bad
   Things happen;

I guess maybe_constant_value should be fixed to preserve flags regardless of this change.

- getting a new slot is also problematic.  I've seen a test where we
   had "TARGET_EXPR<D.2680, ...>, D.2680", and folding the whole TARGET_EXPR
   would get us "TARGET_EXPR<D.2681, ...>", but since we don't see the outer
   D.2680, we can't replace it with D.2681, and things break.

Hmm, yeah.  Maybe only if TARGET_EXPR_IMPLICIT_P?

With this patch, two tree-ssa tests regressed: pr78687.C and pr90883.C.

FAIL: g++.dg/tree-ssa/pr90883.C   scan-tree-dump dse1 "Deleted redundant store: .*.a 
= {}"
is easy.  Previously, we would call C::C, so .gimple has:

   D.2590 = {};
   C::C (&D.2590);
   D.2597 = D.2590;
   return D.2597;

Then .einline inlines the C::C call:

   D.2590 = {};
   D.2590.a = {}; // #1
   D.2590.b = 0;  // #2
   D.2597 = D.2590;
   D.2590 ={v} {CLOBBER(eos)};
   return D.2597;

then #2 is removed in .fre1, and #1 is removed in .dse1.  So the test
passes.  But with the patch, .gimple won't have that C::C call, so the
IL is of course going to look different.

Maybe -fno-inline instead of the --param?

Unfortunately, pr78687.C is much more complicated and I can't explain
precisely what happens there.  But it seems like a good idea to have
a way to avoid this optimization.  So I've added the "noinline" check.

Hmm, I'm surprised make_object_1 would be affected, since the ref_proxy constructors are not constexpr. And I wouldn't expect the optimization to affect the value-initialization option_2().

        PR c++/116416

gcc/cp/ChangeLog:

        * cp-gimplify.cc (cp_fold_r) <case TARGET_EXPR>: Try to fold
        TARGET_EXPR_INITIAL and replace it with the folded result if
        it's TREE_CONSTANT.

gcc/testsuite/ChangeLog:

        * g++.dg/analyzer/pr97116.C: Adjust dg-message.
        * g++.dg/cpp2a/consteval-prop2.C: Adjust dg-bogus.
        * g++.dg/tree-ssa/pr78687.C: Add __attribute__((noinline)).
        * g++.dg/tree-ssa/pr90883.C: Likewise.
        * g++.dg/cpp1y/constexpr-prvalue1.C: New test.

Co-authored-by: Patrick Palka <ppa...@redhat.com>
---
  gcc/cp/cp-gimplify.cc                         | 14 +++++++++
  gcc/testsuite/g++.dg/analyzer/pr97116.C       |  2 +-
  .../g++.dg/cpp1y/constexpr-prvalue1.C         | 29 +++++++++++++++++++
  gcc/testsuite/g++.dg/cpp2a/consteval-prop2.C  |  2 +-
  gcc/testsuite/g++.dg/tree-ssa/pr78687.C       |  5 +++-
  gcc/testsuite/g++.dg/tree-ssa/pr90883.C       |  1 +
  6 files changed, 50 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-prvalue1.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index 003e68f1ea7..41d6333f650 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1473,6 +1473,20 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_)
         that case, strip it in favor of this one.  */
        if (tree &init = TARGET_EXPR_INITIAL (stmt))
        {
+         tree fn;
+         if ((data->flags & ff_genericize)
+             /* Give the user an option to opt out.  */
+             && !((fn = current_function_decl)
+                  && lookup_attribute ("noinline",
+                                       DECL_ATTRIBUTES (fn))))

This looks backwards from the usual sense of noinline, which suppresses inlining of the function so marked, rather than inlining of other functions called. If you want to check noinline, you need to dig out the called function (if any), perhaps with extract_call_expr.

This should also share conditions with the code that calls maybe_constant_value for CALL_EXPR (flag_no_inline, in particular).

Jason

Reply via email to