On 9/20/24 12:18 AM, Marek Polacek wrote:
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
-- >8 --
This PR reports a missed optimization. When we have:
Str str{"Test"};
callback(str);
as in the test, we're able to evaluate the Str::Str() call at compile
time. But when we have:
callback(Str{"Test"});
we are not. With this patch (in fact, it's Patrick's patch with a little
tweak), we turn
callback (TARGET_EXPR <D.2890, <<< Unknown tree: aggr_init_expr
5
__ct_comp
D.2890
(struct Str *) <<< Unknown tree: void_cst >>>
(const char *) "Test" >>>>)
into
callback (TARGET_EXPR <D.2890, {.str=(const char *) "Test", .length=4}>)
I explored the idea of calling maybe_constant_value for the whole
TARGET_EXPR in cp_fold. That has three problems:
- we can't always elide a TARGET_EXPR, so we'd have to make sure the
result is also a TARGET_EXPR;
I'd think that the result should always be a TARGET_EXPR for a class,
and that's the case we want to fold; a TARGET_EXPR for a scalar is
always the initialize-temp-and-use-it pattern you mention below.
- the resulting TARGET_EXPR must have the same flags, otherwise Bad
Things happen;
I guess maybe_constant_value should be fixed to preserve flags
regardless of this change.
- getting a new slot is also problematic. I've seen a test where we
had "TARGET_EXPR<D.2680, ...>, D.2680", and folding the whole TARGET_EXPR
would get us "TARGET_EXPR<D.2681, ...>", but since we don't see the outer
D.2680, we can't replace it with D.2681, and things break.
Hmm, yeah. Maybe only if TARGET_EXPR_IMPLICIT_P?
With this patch, two tree-ssa tests regressed: pr78687.C and pr90883.C.
FAIL: g++.dg/tree-ssa/pr90883.C scan-tree-dump dse1 "Deleted redundant store: .*.a
= {}"
is easy. Previously, we would call C::C, so .gimple has:
D.2590 = {};
C::C (&D.2590);
D.2597 = D.2590;
return D.2597;
Then .einline inlines the C::C call:
D.2590 = {};
D.2590.a = {}; // #1
D.2590.b = 0; // #2
D.2597 = D.2590;
D.2590 ={v} {CLOBBER(eos)};
return D.2597;
then #2 is removed in .fre1, and #1 is removed in .dse1. So the test
passes. But with the patch, .gimple won't have that C::C call, so the
IL is of course going to look different.
Maybe -fno-inline instead of the --param?
Unfortunately, pr78687.C is much more complicated and I can't explain
precisely what happens there. But it seems like a good idea to have
a way to avoid this optimization. So I've added the "noinline" check.
Hmm, I'm surprised make_object_1 would be affected, since the ref_proxy
constructors are not constexpr. And I wouldn't expect the optimization
to affect the value-initialization option_2().
PR c++/116416
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold_r) <case TARGET_EXPR>: Try to fold
TARGET_EXPR_INITIAL and replace it with the folded result if
it's TREE_CONSTANT.
gcc/testsuite/ChangeLog:
* g++.dg/analyzer/pr97116.C: Adjust dg-message.
* g++.dg/cpp2a/consteval-prop2.C: Adjust dg-bogus.
* g++.dg/tree-ssa/pr78687.C: Add __attribute__((noinline)).
* g++.dg/tree-ssa/pr90883.C: Likewise.
* g++.dg/cpp1y/constexpr-prvalue1.C: New test.
Co-authored-by: Patrick Palka <ppa...@redhat.com>
---
gcc/cp/cp-gimplify.cc | 14 +++++++++
gcc/testsuite/g++.dg/analyzer/pr97116.C | 2 +-
.../g++.dg/cpp1y/constexpr-prvalue1.C | 29 +++++++++++++++++++
gcc/testsuite/g++.dg/cpp2a/consteval-prop2.C | 2 +-
gcc/testsuite/g++.dg/tree-ssa/pr78687.C | 5 +++-
gcc/testsuite/g++.dg/tree-ssa/pr90883.C | 1 +
6 files changed, 50 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-prvalue1.C
diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index 003e68f1ea7..41d6333f650 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1473,6 +1473,20 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_)
that case, strip it in favor of this one. */
if (tree &init = TARGET_EXPR_INITIAL (stmt))
{
+ tree fn;
+ if ((data->flags & ff_genericize)
+ /* Give the user an option to opt out. */
+ && !((fn = current_function_decl)
+ && lookup_attribute ("noinline",
+ DECL_ATTRIBUTES (fn))))
This looks backwards from the usual sense of noinline, which suppresses
inlining of the function so marked, rather than inlining of other
functions called. If you want to check noinline, you need to dig out
the called function (if any), perhaps with extract_call_expr.
This should also share conditions with the code that calls
maybe_constant_value for CALL_EXPR (flag_no_inline, in particular).
Jason