On Fri, 1 Jul 2022, Jason Merrill wrote: > On 6/29/22 13:42, Patrick Palka wrote: > > In r13-1045-gcb7fd1ea85feea I assumed that substitution into generic > > DECL_TI_ARGS corresponds to an identity mapping of the given arguments, > > and hence its safe to always elide such substitution. But this PR > > demonstrates that such a substitution isn't always the identity mapping, > > in particular when there's an ARGUMENT_PACK_SELECT argument, which gets > > handled specially during substitution: > > > > * when substituting an APS into a template parameter, we strip the > > APS to its underlying argument; > > * and when substituting an APS into a pack expansion, we strip the > > APS to its underlying argument pack. > > Ah, right. For instance, in variadic96.C we have > > 10 template < typename... T > > 11 struct derived > 12 : public base< T, derived< T... > >... > > so when substituting into the base-specifier, we're approaching it from the > outside in, so when we get to the inner T... we need some way to find the T > pack again. It might be possible to remove the need for APS by substituting > inner pack expansions before outer ones, which could improve worst-case > complexity, but I don't know how relevant that is in real code; I imagine most > inner pack expansions are as simple as this one.
Aha, that makes sense. > > > In this testcase, when expanding the pack expansion pattern (idx + Ns)... > > with Ns={0,1}, we specialize idx twice, first with Ns=APS<0,{0,1}> and > > then Ns=APS<1,{0,1}>. The DECL_TI_ARGS of idx are the generic template > > arguments of the enclosing class template impl, so before r13-1045, > > we'd substitute into its DECL_TI_ARGS which gave Ns={0,1} as desired. > > But after r13-1045, we elide this substitution and end up attempting to > > hash the original Ns argument, an APS, which ICEs. > > > > So this patch partially reverts this part of r13-1045. I considered > > using preserve_args in this case instead, but that'd break the > > static_assert in the testcase because preserve_args always strips APS to > > its underlying argument, but here we want to strip it to its underlying > > argument pack, so we'd incorrectly end up forming the specializations > > impl<0>::idx and impl<1>::idx instead of impl<0,1>::idx. > > > > Although we can't elide the substitution into DECL_TI_ARGS in light of > > ARGUMENT_PACK_SELECT, it should still be safe to elide template argument > > coercion in the case of a non-template decl, which this patch preserves. > > > > It's unfortunate that we need to remove this optimization just because > > it doesn't hold for one special tree code. So this patch implements a > > heuristic in tsubst_template_args to avoid allocating a new TREE_VEC if > > the substituted elements are identical to those of a level from ARGS. > > It turns out that about 30% of all calls to tsubst_template_args benefit > > from this optimization, and it reduces memory usage by about 1.5% for > > e.g. stdc++.h (relative to r13-1045). (This is the maybe_reuse stuff, > > the rest of the changes to tsubst_template_args are just drive-by > > cleanups.) > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for > > trunk? Patch generated with -w to ignore noisy whitespace changes. > > > > PR c++/105956 > > > > gcc/cp/ChangeLog: > > > > * pt.cc (tsubst_template_args): Move variable declarations > > closer to their first use. Replace 'orig_t' with 'r'. Rename > > 'need_new' to 'const_subst_p'. Heuristically detect if the > > substituted elements are identical to that of a level from > > 'args' and avoid allocating a new TREE_VEC if so. > > (tsubst_decl) <case TYPE_DECL, case VAR_DECL>: Revert > > r13-1045-gcb7fd1ea85feea change for avoiding substitution into > > DECL_TI_ARGS, but still avoid coercion in this case. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/cpp0x/variadic183.C: New test. > > --- > > gcc/cp/pt.cc | 113 ++++++++++++++--------- > > gcc/testsuite/g++.dg/cpp0x/variadic183.C | 14 +++ > > 2 files changed, 85 insertions(+), 42 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic183.C > > > > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc > > index 8672da123f4..7898834faa6 100644 > > --- a/gcc/cp/pt.cc > > +++ b/gcc/cp/pt.cc > > @@ -27,6 +27,7 @@ along with GCC; see the file COPYING3. If not see > > Fixed by: C++20 modules. */ > > #include "config.h" > > +#define INCLUDE_ALGORITHM // for std::equal > > #include "system.h" > > #include "coretypes.h" > > #include "cp-tree.h" > > @@ -13544,17 +13545,22 @@ tsubst_argument_pack (tree orig_arg, tree args, > > tsubst_flags_t complain, > > tree > > tsubst_template_args (tree t, tree args, tsubst_flags_t complain, tree > > in_decl) > > { > > - tree orig_t = t; > > - int len, need_new = 0, i, expanded_len_adjust = 0, out; > > - tree *elts; > > - > > if (t == error_mark_node) > > return error_mark_node; > > - len = TREE_VEC_LENGTH (t); > > - elts = XALLOCAVEC (tree, len); > > + const int len = TREE_VEC_LENGTH (t); > > + tree *elts = XALLOCAVEC (tree, len); > > + int expanded_len_adjust = 0; > > - for (i = 0; i < len; i++) > > + /* True iff no element of T was changed by the substitution. */ > > + bool const_subst_p = true; > > + > > + /* If MAYBE_REUSE is non-NULL, as an optimization we'll try to reuse and > > + return this TREE_VEC instead of allocating a new one if possible. > > This > > + will either be ARGS or a level from ARGS. */ > > + tree maybe_reuse = NULL_TREE; > > + > > + for (int i = 0; i < len; i++) > > { > > tree orig_arg = TREE_VEC_ELT (t, i); > > tree new_arg; > > @@ -13580,56 +13586,90 @@ tsubst_template_args (tree t, tree args, > > tsubst_flags_t complain, tree in_decl) > > else if (ARGUMENT_PACK_P (orig_arg)) > > new_arg = tsubst_argument_pack (orig_arg, args, complain, in_decl); > > else > > + { > > new_arg = tsubst_template_arg (orig_arg, args, complain, in_decl); > > + /* If T heuristically appears to be a set of generic template > > + arguments, set MAYBE_REUSE to the corresponding level from > > + ARGS. */ > > + if (maybe_reuse == NULL_TREE && orig_arg != NULL_TREE) > > + { > > + if (TEMPLATE_PARM_P (orig_arg)) > > + { > > This doesn't handle the case of variadic template parameters, which are > represented in the generic args with a pack expansion. If this is a choice, > there should be a comment about it. AFAICT substituting such a pack expansion will never be an identity transform -- the relevant targ during tsubst_pack_expansion will be an ARGUMENT_PACK, and we'll return the TREE_VEC from that ARGUMENT_PACK instead of the ARGUMENT_PACK itself, so there's no way we can reuse (a level from) 'args' in this case. So variadic template parameters act as an optimization barrier here unfortunately. I can add a comment to that effect, and perhaps explicitly give up in this case by also setting maybe_reuse to error_mark_node in the PACK_EXPANSION_P branch of tsubst_template_args? > > > + int level, index; > > + template_parm_level_and_index (orig_arg, &level, &index); > > + if (index == i && TMPL_ARGS_DEPTH (args) >= level) > > + maybe_reuse = TMPL_ARGS_LEVEL (args, level); > > + else > > + maybe_reuse = error_mark_node; > > + } > > + else > > + /* T is not a set of generic template arguments; use > > + error_mark_node to denote this. */ > > + maybe_reuse = error_mark_node; > > + } > > + } > > + > > if (new_arg == error_mark_node) > > return error_mark_node; > > elts[i] = new_arg; > > if (new_arg != orig_arg) > > - need_new = 1; > > + const_subst_p = false; > > } > > - if (!need_new) > > + if (const_subst_p) > > return t; > > + /* If ARGS and T are both multi-level, then substituted T may be > > + identical to ARGS (if each level was pairwise identical). */ > > + if (maybe_reuse == NULL_TREE > > + && TMPL_ARGS_HAVE_MULTIPLE_LEVELS (t) > > + && TMPL_ARGS_HAVE_MULTIPLE_LEVELS (args)) > > + maybe_reuse = args; > > + > > + /* Return MAYBE_REUSE and avoid allocating a new TREE_VEC if the > > substituted > > + result is identical to it. */ > > + if (NON_ERROR (maybe_reuse) != NULL_TREE > > + && TREE_VEC_LENGTH (maybe_reuse) == len > > + && std::equal (elts, elts+len, TREE_VEC_BEGIN (maybe_reuse))) > > + return maybe_reuse; > > + > > /* Make space for the expanded arguments coming from template > > argument packs. */ > > - t = make_tree_vec (len + expanded_len_adjust); > > - /* ORIG_T can contain TREE_VECs. That happens if ORIG_T contains the > > + tree r = make_tree_vec (len + expanded_len_adjust); > > + /* T can contain TREE_VECs. That happens if T contains the > > arguments for a member template. > > - In that case each TREE_VEC in ORIG_T represents a level of template > > - arguments, and ORIG_T won't carry any non defaulted argument count. > > + In that case each TREE_VEC in T represents a level of template > > + arguments, and T won't carry any non defaulted argument count. > > It will rather be the nested TREE_VECs that will carry one. > > - In other words, ORIG_T carries a non defaulted argument count only > > + In other words, T carries a non defaulted argument count only > > if it doesn't contain any nested TREE_VEC. */ > > - if (NON_DEFAULT_TEMPLATE_ARGS_COUNT (orig_t)) > > + if (NON_DEFAULT_TEMPLATE_ARGS_COUNT (t)) > > { > > - int count = GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (orig_t); > > + int count = GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (t); > > count += expanded_len_adjust; > > - SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (t, count); > > + SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (r, count); > > } > > - for (i = 0, out = 0; i < len; i++) > > + for (int i = 0, out = 0; i < len; i++) > > { > > - tree orig_arg = TREE_VEC_ELT (orig_t, i); > > + tree orig_arg = TREE_VEC_ELT (t, i); > > if (orig_arg > > && (PACK_EXPANSION_P (orig_arg) || ARGUMENT_PACK_P (orig_arg)) > > && TREE_CODE (elts[i]) == TREE_VEC) > > { > > - int idx; > > - > > /* Now expand the template argument pack "in place". */ > > - for (idx = 0; idx < TREE_VEC_LENGTH (elts[i]); idx++, out++) > > - TREE_VEC_ELT (t, out) = TREE_VEC_ELT (elts[i], idx); > > + for (int idx = 0; idx < TREE_VEC_LENGTH (elts[i]); idx++, out++) > > + TREE_VEC_ELT (r, out) = TREE_VEC_ELT (elts[i], idx); > > } > > else > > { > > - TREE_VEC_ELT (t, out) = elts[i]; > > + TREE_VEC_ELT (r, out) = elts[i]; > > out++; > > } > > } > > - return t; > > + return r; > > } > > /* Substitute ARGS into one level PARMS of template parameters. */ > > @@ -14965,32 +15005,21 @@ tsubst_decl (tree t, tree args, tsubst_flags_t > > complain) > > if (!spec) > > { > > - int args_depth = TMPL_ARGS_DEPTH (args); > > - int parms_depth = TMPL_ARGS_DEPTH (DECL_TI_ARGS (t)); > > tmpl = DECL_TI_TEMPLATE (t); > > gen_tmpl = most_general_template (tmpl); > > - if (args_depth == parms_depth > > - && !PRIMARY_TEMPLATE_P (gen_tmpl)) > > - /* The DECL_TI_ARGS in this case are the generic template > > - arguments for the enclosing class template, so we can > > - shortcut substitution (which would just be the identity > > - mapping). */ > > - argvec = args; > > - else > > - { > > argvec = tsubst (DECL_TI_ARGS (t), args, complain, in_decl); > > - /* Coerce the innermost arguments again if necessary. If > > - there's fewer levels of args than of parms, then the > > - substitution could not have changed the innermost args > > - (modulo level lowering). */ > > - if (args_depth >= parms_depth && argvec != > > error_mark_node) > > + if (argvec != error_mark_node > > + && PRIMARY_TEMPLATE_P (gen_tmpl) > > + && TMPL_ARGS_DEPTH (args) >= TMPL_ARGS_DEPTH (argvec)) > > + /* We're fully specializing a template declaration, so > > + we need to coerce the innermost arguments corresponding > > + to the template. */ > > argvec = (coerce_innermost_template_parms > > (DECL_TEMPLATE_PARMS (gen_tmpl), > > argvec, t, complain, > > /*all*/true, /*defarg*/true)); > > if (argvec == error_mark_node) > > RETURN (error_mark_node); > > - } > > hash = spec_hasher::hash (gen_tmpl, argvec); > > spec = retrieve_specialization (gen_tmpl, argvec, hash); > > } > > diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic183.C > > b/gcc/testsuite/g++.dg/cpp0x/variadic183.C > > new file mode 100644 > > index 00000000000..3938e52e0e7 > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/cpp0x/variadic183.C > > @@ -0,0 +1,14 @@ > > +// PR c++/105956 > > +// { dg-do compile { target c++11 } } > > + > > +template<int... Ns> struct list; > > + > > +template<int... Ns> struct impl { > > + static const int idx = 0; > > + using type = list<(idx + Ns)...>; > > + > > + static constexpr const int* a[2] = {(Ns, &idx)...}; > > + static_assert(a[0] == &idx && a[1] == &idx, ""); > > +}; > > + > > +template struct impl<0, 1>; > >