On 7/7/22 11:16, Patrick Palka wrote:
On Thu, 7 Jul 2022, Jason Merrill wrote:
On 7/6/22 15:26, Patrick Palka wrote:
On Tue, 5 Jul 2022, Jason Merrill wrote:
On 7/5/22 10:06, Patrick Palka wrote:
On Fri, 1 Jul 2022, Jason Merrill wrote:
On 6/29/22 13:42, Patrick Palka wrote:
In r13-1045-gcb7fd1ea85feea I assumed that substitution into generic
DECL_TI_ARGS corresponds to an identity mapping of the given
arguments,
and hence its safe to always elide such substitution. But this PR
demonstrates that such a substitution isn't always the identity
mapping,
in particular when there's an ARGUMENT_PACK_SELECT argument, which
gets
handled specially during substitution:
* when substituting an APS into a template parameter, we strip
the
APS to its underlying argument;
* and when substituting an APS into a pack expansion, we strip
the
APS to its underlying argument pack.
Ah, right. For instance, in variadic96.C we have
10 template < typename... T >
11 struct derived
12 : public base< T, derived< T... > >...
so when substituting into the base-specifier, we're approaching it
from
the
outside in, so when we get to the inner T... we need some way to find
the
T
pack again. It might be possible to remove the need for APS by
substituting
inner pack expansions before outer ones, which could improve
worst-case
complexity, but I don't know how relevant that is in real code; I
imagine
most
inner pack expansions are as simple as this one.
Aha, that makes sense.
In this testcase, when expanding the pack expansion pattern (idx +
Ns)...
with Ns={0,1}, we specialize idx twice, first with Ns=APS<0,{0,1}>
and
then Ns=APS<1,{0,1}>. The DECL_TI_ARGS of idx are the generic
template
arguments of the enclosing class template impl, so before r13-1045,
we'd substitute into its DECL_TI_ARGS which gave Ns={0,1} as
desired.
But after r13-1045, we elide this substitution and end up attempting
to
hash the original Ns argument, an APS, which ICEs.
So this patch partially reverts this part of r13-1045. I considered
using preserve_args in this case instead, but that'd break the
static_assert in the testcase because preserve_args always strips
APS to
its underlying argument, but here we want to strip it to its
underlying
argument pack, so we'd incorrectly end up forming the
specializations
impl<0>::idx and impl<1>::idx instead of impl<0,1>::idx.
Although we can't elide the substitution into DECL_TI_ARGS in light
of
ARGUMENT_PACK_SELECT, it should still be safe to elide template
argument
coercion in the case of a non-template decl, which this patch
preserves.
It's unfortunate that we need to remove this optimization just
because
it doesn't hold for one special tree code. So this patch implements
a
heuristic in tsubst_template_args to avoid allocating a new TREE_VEC
if
the substituted elements are identical to those of a level from
ARGS.
It turns out that about 30% of all calls to tsubst_template_args
benefit
from this optimization, and it reduces memory usage by about 1.5%
for
e.g. stdc++.h (relative to r13-1045). (This is the maybe_reuse
stuff,
the rest of the changes to tsubst_template_args are just drive-by
cleanups.)
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for
trunk? Patch generated with -w to ignore noisy whitespace changes.
PR c++/105956
gcc/cp/ChangeLog:
* pt.cc (tsubst_template_args): Move variable declarations
closer to their first use. Replace 'orig_t' with 'r'. Rename
'need_new' to 'const_subst_p'. Heuristically detect if the
substituted elements are identical to that of a level from
'args' and avoid allocating a new TREE_VEC if so.
(tsubst_decl) <case TYPE_DECL, case VAR_DECL>: Revert
r13-1045-gcb7fd1ea85feea change for avoiding substitution into
DECL_TI_ARGS, but still avoid coercion in this case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/variadic183.C: New test.
---
gcc/cp/pt.cc | 113
++++++++++++++---------
gcc/testsuite/g++.dg/cpp0x/variadic183.C | 14 +++
2 files changed, 85 insertions(+), 42 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic183.C
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8672da123f4..7898834faa6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3. If not see
Fixed by: C++20 modules. */
#include "config.h"
+#define INCLUDE_ALGORITHM // for std::equal
#include "system.h"
#include "coretypes.h"
#include "cp-tree.h"
@@ -13544,17 +13545,22 @@ tsubst_argument_pack (tree orig_arg, tree
args,
tsubst_flags_t complain,
tree
tsubst_template_args (tree t, tree args, tsubst_flags_t
complain,
tree
in_decl)
{
- tree orig_t = t;
- int len, need_new = 0, i, expanded_len_adjust = 0, out;
- tree *elts;
-
if (t == error_mark_node)
return error_mark_node;
- len = TREE_VEC_LENGTH (t);
- elts = XALLOCAVEC (tree, len);
+ const int len = TREE_VEC_LENGTH (t);
+ tree *elts = XALLOCAVEC (tree, len);
+ int expanded_len_adjust = 0;
- for (i = 0; i < len; i++)
+ /* True iff no element of T was changed by the substitution. */
+ bool const_subst_p = true;
+
+ /* If MAYBE_REUSE is non-NULL, as an optimization we'll try to
reuse
and
+ return this TREE_VEC instead of allocating a new one if
possible.
This
+ will either be ARGS or a level from ARGS. */
+ tree maybe_reuse = NULL_TREE;
+
+ for (int i = 0; i < len; i++)
{
tree orig_arg = TREE_VEC_ELT (t, i);
tree new_arg;
@@ -13580,56 +13586,90 @@ tsubst_template_args (tree t, tree args,
tsubst_flags_t complain, tree in_decl)
else if (ARGUMENT_PACK_P (orig_arg))
new_arg = tsubst_argument_pack (orig_arg, args, complain,
in_decl);
else
+ {
new_arg = tsubst_template_arg (orig_arg, args, complain,
in_decl);
+ /* If T heuristically appears to be a set of generic
template
+ arguments, set MAYBE_REUSE to the corresponding level
from
+ ARGS. */
+ if (maybe_reuse == NULL_TREE && orig_arg != NULL_TREE)
+ {
+ if (TEMPLATE_PARM_P (orig_arg))
+ {
This doesn't handle the case of variadic template parameters, which
are
represented in the generic args with a pack expansion. If this is a
choice,
there should be a comment about it.
AFAICT substituting such a pack expansion will never be an identity
transform -- the relevant targ during tsubst_pack_expansion will be an
ARGUMENT_PACK, and we'll return the TREE_VEC from that ARGUMENT_PACK
instead of the ARGUMENT_PACK itself, so there's no way we can reuse (a
level from) 'args' in this case.
Hmm, when replacing a level of T... we start with a TREE_VEC containing an
ARGUMENT_PACK around a TREE_VEC containing a PACK_EXPANSION, and
substitution
should replace that with the argument level, a TREE_VEC containing an
ARGUMENT_PACK around a TREE_VEC containing an actual set of args, so reuse
ought to be possible?
Ah, I think see what you mean. For that, I think it suffices to just
handle in tsubst_template_args the case where we're substituting into a
single-elt TREE_VEC such as {T...}. That'll allow us to reuse the inner
TREE_VEC containing the actual set of args at least.
Sounds good.
And AFAICT that's
the best we can do, since we'll always be creating a new ARGUMENT_PACK
which doesn't appear in any other TREE_VEC we have at our disposal.
Why always? Why can't tsubst_argument_pack recognize the {T...} case and
return the corresponding ARGUMENT_PACK from args?
template <class... Ts> struct A
{
A<Ts...>* a;
};
A<int,char> a;
With your patch, the substitution into A<Ts...> reuses the inner TREE_VEC, but
why not reuse the ARGUMENT_PACK as well? And why not set maybe_reuse for the
enclosing arg level?
D'oh, you're totally right, I didn't notice this opportunity. Here's a
patch that makes tsubst_argument_pack reuse the ARGUMENT_PACK from
'args' when possible, and in turn makes tsubst_template_args reuse the
enclosing arg level. Relative to the previous patch, this increases
the % of calls to tsubst_template_args that benefit from this
optimization a further 10% (30% -> 40%), and reduces memory usage for
e.g. zip.cpp from range-v3 a further 2.5% (1.5% -> 4%).
-- >8 --
Subject: [PATCH] c++: generic targs and identity substitution [PR105956]
In r13-1045-gcb7fd1ea85feea I assumed that substitution into generic
DECL_TI_ARGS corresponds to an identity mapping of the given arguments,
and hence its safe to always elide such substitution. But this PR
demonstrates that such a substitution isn't always the identity mapping,
in particular when there's an ARGUMENT_PACK_SELECT argument, which gets
handled specially during substitution:
* when substituting an APS into a template parameter, we strip the
APS to its underlying argument;
* and when substituting an APS into a pack expansion, we strip the
APS to its underlying argument pack.
In this testcase, when expanding the pack expansion pattern (idx + Ns)...
with Ns={0,1}, we specialize idx twice, first with Ns=APS<0,{0,1}> and
then Ns=APS<1,{0,1}>. The DECL_TI_ARGS of idx are the generic template
arguments of the enclosing class template impl, so before r13-1045,
we'd substitute into its DECL_TI_ARGS which gave Ns={0,1} as desired.
But after r13-1045, we elide this substitution and end up attempting to
hash the original Ns argument, an APS, which ICEs.
So this patch reverts that part of r13-1045. I considered using
preserve_args in this case instead, but that'd break the static_assert
in the testcase because preserve_args always strips APS to its
underlying argument, but here we want to strip it to its underlying
argument pack, so we'd incorrectly end up forming the specializations
impl<0>::idx and impl<1>::idx instead of impl<0,1>::idx.
Although we can't elide the substitution into DECL_TI_ARGS in light of
ARGUMENT_PACK_SELECT, it should still be safe to elide template argument
coercion in the case of a non-template decl, which this patch preserves.
It's unfortunate that we need to remove this optimization just because
it doesn't hold for one special tree code. So this patch implements a
heuristic in tsubst_template_args to avoid allocating a new TREE_VEC if
the substituted elements are identical to those of a level from ARGS,
and a similar heuristic for tsubst_argument_pack. It turns out that
about 40% of all calls to tsubst_template_args benefit from this, and it
reduces memory usage by about 4% for e.g. zip.cpp from range-v3 (relative
to r13-1045) which more than makes up for the reversion.
PR c++/105956
gcc/cp/ChangeLog:
* pt.cc (template_arg_to_parm): Define.
(tsubst_argument_pack): Try to reuse the corresponding
ARGUMENT_PACK from 'args' when substituting into an
ARGUMENT_PACK for a generic template argument.
(tsubst_template_args): Move variable declarations
closer to their first use. Replace 'orig_t' with 'r'. Rename
'need_new' to 'const_subst_p'. Heuristically detect if the
substituted elements are identical to that of a level from
'args' and avoid allocating a new TREE_VEC if so. Assert that
'out' never exceeds the length of the new TREE_VEC, and remove
dead ARGUMENT_PACK_P test.
(tsubst_decl) <case TYPE_DECL, case VAR_DECL>: Revert
r13-1045-gcb7fd1ea85feea change for avoiding substitution into
DECL_TI_ARGS, but still avoid coercion in this case.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/variadic183.C: New test.
---
gcc/cp/pt.cc | 172 ++++++++++++++++-------
gcc/testsuite/g++.dg/cpp0x/variadic183.C | 14 ++
2 files changed, 138 insertions(+), 48 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic183.C
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8672da123f4..85f3462a0ac 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3. If not see
Fixed by: C++20 modules. */
#include "config.h"
+#define INCLUDE_ALGORITHM // for std::equal
#include "system.h"
#include "coretypes.h"
#include "cp-tree.h"
@@ -4916,6 +4917,33 @@ template_parm_to_arg (tree t)
return t;
}
+/* If T looks like a generic template argument produced by template_parm_to_arg,
+ return the corresponding template parameter, otherwise return NULL_TREE. */
+
+static tree
+template_arg_to_parm (tree t)
+{
+ if (t == NULL_TREE)
+ return NULL_TREE;
+
+ if (ARGUMENT_PACK_P (t))
+ {
+ tree args = ARGUMENT_PACK_ARGS (t);
+ if (TREE_VEC_LENGTH (args) == 1
+ && PACK_EXPANSION_P (TREE_VEC_ELT (args, 0)))
+ {
+ t = PACK_EXPANSION_PATTERN (TREE_VEC_ELT (args, 0));
+ if (REFERENCE_REF_P (t))
+ t = TREE_OPERAND (t, 0);
+ }
+ }
+
+ if (TEMPLATE_PARM_P (t))
+ return t;
+
+ return NULL_TREE;
+}
+
/* Given a single level of template parameters (a TREE_VEC), return it
as a set of template arguments. */
@@ -13516,12 +13544,38 @@ tree
tsubst_argument_pack (tree orig_arg, tree args, tsubst_flags_t complain,
tree in_decl)
{
+ /* This flag is used only during deduction, and we don't expect to
+ see such ARGUMENT_PACKs during substitution. */
+ gcc_assert (!ARGUMENT_PACK_INCOMPLETE_P (orig_arg));
+
/* Substitute into each of the arguments. */
tree pack_args = tsubst_template_args (ARGUMENT_PACK_ARGS (orig_arg),
args, complain, in_decl);
- tree new_arg = error_mark_node;
- if (pack_args != error_mark_node)
+ if (pack_args == error_mark_node)
+ return error_mark_node;
+
+ if (pack_args == ARGUMENT_PACK_ARGS (orig_arg))
+ return orig_arg;
+
+ /* If we're substituting into an ARGUMENT_PACK for a generic template
+ argument, we might be able to avoid allocating a new ARGUMENT_PACK
+ by reusing the corresponding ARGUMENT_PACK from ARGS if the
+ substituted result is identical to it. */
+ if (tree parm = template_arg_to_parm (orig_arg))
{
+ int level, index;
+ template_parm_level_and_index (parm, &level, &index);
+ if (TMPL_ARGS_DEPTH (args) >= level)
+ if (tree arg = TMPL_ARG (args, level, index))
+ if (TREE_CODE (arg) == TREE_CODE (orig_arg)
+ && ARGUMENT_PACK_ARGS (arg) == pack_args)
+ {
+ gcc_assert (!ARGUMENT_PACK_INCOMPLETE_P (arg));
+ return arg;
+ }
+ }
+
+ tree new_arg;
if (TYPE_P (orig_arg))
{
new_arg = cxx_make_type (TREE_CODE (orig_arg));
@@ -13532,10 +13586,7 @@ tsubst_argument_pack (tree orig_arg, tree args,
tsubst_flags_t complain,
new_arg = make_node (TREE_CODE (orig_arg));
TREE_CONSTANT (new_arg) = TREE_CONSTANT (orig_arg);
}
-
ARGUMENT_PACK_ARGS (new_arg) = pack_args;
- }
-
return new_arg;
}
@@ -13544,17 +13595,22 @@ tsubst_argument_pack (tree orig_arg, tree args, tsubst_flags_t complain,
tree
tsubst_template_args (tree t, tree args, tsubst_flags_t complain, tree
in_decl)
{
- tree orig_t = t;
- int len, need_new = 0, i, expanded_len_adjust = 0, out;
- tree *elts;
-
if (t == error_mark_node)
return error_mark_node;
- len = TREE_VEC_LENGTH (t);
- elts = XALLOCAVEC (tree, len);
+ const int len = TREE_VEC_LENGTH (t);
+ tree *elts = XALLOCAVEC (tree, len);
+ int expanded_len_adjust = 0;
- for (i = 0; i < len; i++)
+ /* True iff the substituted result is identical to T. */
+ bool const_subst_p = true;
+
+ /* If MAYBE_REUSE is a TREE_VEC, as an optimization we'll try to reuse and
+ return this TREE_VEC instead of allocating a new one if possible. This
+ will either be ARGS or a level from ARGS. */
+ tree maybe_reuse = NULL_TREE;
Looks like we can move this declaration further down now.
+ for (int i = 0; i < len; i++)
{
tree orig_arg = TREE_VEC_ELT (t, i);
tree new_arg;
@@ -13587,49 +13643,80 @@ tsubst_template_args (tree t, tree args,
tsubst_flags_t complain, tree in_decl)
elts[i] = new_arg;
if (new_arg != orig_arg)
- need_new = 1;
+ const_subst_p = false;
}
- if (!need_new)
+ if (const_subst_p)
return t;
+ /* If T appears to be a set of generic template arguments, set
+ MAYBE_REUSE to the corresponding level from ARGS. */
+ if (tree parm = template_arg_to_parm (TREE_VEC_ELT (t, 0)))
+ {
+ int level, index;
+ template_parm_level_and_index (parm, &level, &index);
+ if (index == 0 && TMPL_ARGS_DEPTH (args) >= level)
+ maybe_reuse = TMPL_ARGS_LEVEL (args, level);
+ }
+ /* If ARGS and T are both multi-level, the substituted result may be
+ identical to ARGS (if each level was pairwise identical). */
+ else if (TMPL_ARGS_HAVE_MULTIPLE_LEVELS (t)
+ && TMPL_ARGS_HAVE_MULTIPLE_LEVELS (args))
Maybe compare TMPL_ARGS_DEPTH?
OK with or without these tweaks.
+ maybe_reuse = args;
+
+ /* Return MAYBE_REUSE and avoid allocating a new TREE_VEC if the substituted
+ result is identical to it. */
+ if (maybe_reuse != NULL_TREE
+ && TREE_VEC_LENGTH (maybe_reuse) == len
+ && std::equal (elts, elts+len, TREE_VEC_BEGIN (maybe_reuse)))
+ return maybe_reuse;
+ /* If T consists of only a pack expansion for which substitution yielded
+ a TREE_VEC of the expanded elements, then reuse that TREE_VEC instead
+ of effectively making a copy. */
+ if (len == 1
+ && PACK_EXPANSION_P (TREE_VEC_ELT (t, 0))
+ && TREE_CODE (elts[0]) == TREE_VEC)
+ return elts[0];
+
/* Make space for the expanded arguments coming from template
argument packs. */
- t = make_tree_vec (len + expanded_len_adjust);
- /* ORIG_T can contain TREE_VECs. That happens if ORIG_T contains the
+ tree r = make_tree_vec (len + expanded_len_adjust);
+ /* T can contain TREE_VECs. That happens if T contains the
arguments for a member template.
- In that case each TREE_VEC in ORIG_T represents a level of template
- arguments, and ORIG_T won't carry any non defaulted argument count.
+ In that case each TREE_VEC in T represents a level of template
+ arguments, and T won't carry any non defaulted argument count.
It will rather be the nested TREE_VECs that will carry one.
- In other words, ORIG_T carries a non defaulted argument count only
+ In other words, T carries a non defaulted argument count only
if it doesn't contain any nested TREE_VEC. */
- if (NON_DEFAULT_TEMPLATE_ARGS_COUNT (orig_t))
+ if (NON_DEFAULT_TEMPLATE_ARGS_COUNT (t))
{
- int count = GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (orig_t);
+ int count = GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (t);
count += expanded_len_adjust;
- SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (t, count);
+ SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (r, count);
}
- for (i = 0, out = 0; i < len; i++)
+
+ int out = 0;
+ for (int i = 0; i < len; i++)
{
- tree orig_arg = TREE_VEC_ELT (orig_t, i);
+ tree orig_arg = TREE_VEC_ELT (t, i);
if (orig_arg
- && (PACK_EXPANSION_P (orig_arg) || ARGUMENT_PACK_P (orig_arg))
+ && PACK_EXPANSION_P (orig_arg)
&& TREE_CODE (elts[i]) == TREE_VEC)
{
- int idx;
-
/* Now expand the template argument pack "in place". */
- for (idx = 0; idx < TREE_VEC_LENGTH (elts[i]); idx++, out++)
- TREE_VEC_ELT (t, out) = TREE_VEC_ELT (elts[i], idx);
+ for (int idx = 0; idx < TREE_VEC_LENGTH (elts[i]); idx++, out++)
+ TREE_VEC_ELT (r, out) = TREE_VEC_ELT (elts[i], idx);
}
else
{
- TREE_VEC_ELT (t, out) = elts[i];
+ TREE_VEC_ELT (r, out) = elts[i];
out++;
}
}
+ gcc_assert (out <= TREE_VEC_LENGTH (r));
- return t;
+ return r;
}
/* Substitute ARGS into one level PARMS of template parameters. */
@@ -14965,32 +15052,21 @@ tsubst_decl (tree t, tree args, tsubst_flags_t
complain)
if (!spec)
{
- int args_depth = TMPL_ARGS_DEPTH (args);
- int parms_depth = TMPL_ARGS_DEPTH (DECL_TI_ARGS (t));
tmpl = DECL_TI_TEMPLATE (t);
gen_tmpl = most_general_template (tmpl);
- if (args_depth == parms_depth
- && !PRIMARY_TEMPLATE_P (gen_tmpl))
- /* The DECL_TI_ARGS in this case are the generic template
- arguments for the enclosing class template, so we can
- shortcut substitution (which would just be the identity
- mapping). */
- argvec = args;
- else
- {
argvec = tsubst (DECL_TI_ARGS (t), args, complain, in_decl);
- /* Coerce the innermost arguments again if necessary. If
- there's fewer levels of args than of parms, then the
- substitution could not have changed the innermost args
- (modulo level lowering). */
- if (args_depth >= parms_depth && argvec != error_mark_node)
+ if (argvec != error_mark_node
+ && PRIMARY_TEMPLATE_P (gen_tmpl)
+ && TMPL_ARGS_DEPTH (args) >= TMPL_ARGS_DEPTH (argvec))
+ /* We're fully specializing a template declaration, so
+ we need to coerce the innermost arguments corresponding to
+ the template. */
argvec = (coerce_innermost_template_parms
(DECL_TEMPLATE_PARMS (gen_tmpl),
argvec, t, complain,
/*all*/true, /*defarg*/true));
if (argvec == error_mark_node)
RETURN (error_mark_node);
- }
hash = spec_hasher::hash (gen_tmpl, argvec);
spec = retrieve_specialization (gen_tmpl, argvec, hash);
}
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic183.C
b/gcc/testsuite/g++.dg/cpp0x/variadic183.C
new file mode 100644
index 00000000000..27444ebb594
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic183.C
@@ -0,0 +1,14 @@
+// PR c++/105956
+// { dg-do compile { target c++11 } }
+
+template<int...> struct list;
+
+template<int... Ns> struct impl {
+ static const int idx = 0;
+ using type = list<(idx + Ns)...>;
+
+ static constexpr const int* a[2] = {(Ns, &idx)...};
+ static_assert(a[0] == &idx && a[1] == &idx, "");
+};
+
+template struct impl<0, 1>;