Hi,
On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote:
> Hello gentlemen. I'm CCing all of you, because each of you can
> provide valuable feedback to various parts of the compiler which I
> touch. I have sprinkled love notes with your names throughout the
> post :).
sorry it took me so long, for various reasons I out of my control I've
accumulated quite a backlog of email and tasks last week and it took
me a lot of time to chew through it all.
...
> [Martin] I have adapted the ipa_parm_adjustment infrastructure to
> allow adding new arguments out of the blue like you mentioned was
> missing in ipa-prop.h. I have also added support for creating
> vectors of arguments. Could you take a look at my changes to
> ipa-prop.[ch]?
Sure, though I have only looked at ipa-* and tree-sra.c stuff. I do
not have any real objections but would suggest a few amendments.
I am glad this is becoming a useful infrastructure rather than just a
part of IPA-SRA. Note that while ipa_combine_adjustments is not used
from anywhere and thus probably buggy anyway, it should in theory be
able to process new_param adjustments too. Can you please at least
put a "not implemented" assert there? (The reason is that the plan
still is to replace args_to_skip bitmaps in cgraphclones.c by
adjustments one day and we do need to combine clones.)
>
> [Martin] I need to add new arguments in the case of inbranch clones,
> which add an additional vector with a mask as the last argument:
> For the following:
>
> #pragma omp declare simd simdlen(4) inbranch
> int foo (int a)
> {
> return a + 1234;
> }
>
> ...we would generate a clone with:
>
> vector(4) int
> foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5)
>
> I thought it best to enhance ipa_modify_formal_parameters() and
> associated machinery than to add the new argument ad-hoc. We
> already have enough ways of doing tree and cgraph versioning in the
> compiler ;-).
>
...
> gcc/ChangeLog.elementals
>
> * Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H.
> * tree-vect-stmts.c (vectorizable_call): Allow > 3 arguments when
> a SIMD clone may be available.
> (vectorizable_function): Use SIMD clone if available.
> * ipa-cp.c (determine_versionability): Nodes with SIMD clones are
> not versionable.
> * ggc.h (ggc_alloc_cleared_simd_clone_stat): New.
> * cgraph.h (enum linear_stride_type): New.
> (struct simd_clone_arg): New.
> (struct simd_clone): New.
> (struct cgraph_node): Add simdclone and simdclone_of fields.
> (get_simd_clone): Protoize.
> * cgraph.c (get_simd_clone): New.
> Add `has_simd_clones' field.
> * ipa-cp.c (determine_versionability): Disallow functions with
> simd clones.
(This looks like a repeated entry.)
> * ipa-prop.h (ipa_sra_modify_function_body): Protoize.
> (sra_ipa_modify_expr): Same.
> (struct ipa_parm_adjustment): Add new_arg_prefix and new_param
> fields. Document their use.
> * ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand
> new parameters and minor cleanups.
> * omp-low.c: Add new pass_omp_simd_clone support code.
> (make_pass_omp_simd_clone): New.
> (pass_data_omp_simd_clone): Declare.
> (class pass_omp_simd_clone): Declare.
> (vecsize_mangle): New.
> (ipa_omp_simd_clone): New.
> (simd_clone_clauses_extract): New.
> (simd_clone_compute_base_data_type): New.
> (simd_clone_compute_vecsize_and_simdlen): New.
> (simd_clone_create): New.
> (simd_clone_adjust_return_type): New.
> (simd_clone_adjust_return_types): New.
> (simd_clone_adjust): New.
> (simd_clone_init_simd_arrays): New.
> (ipa_simd_modify_function_body): New.
> (simd_clone_mangle): New.
> (simd_clone_struct_alloc): New.
> (simd_clone_struct_copy): New.
> (class argno_map): New.
> (argno_map::argno_map(tree)): New.
> (argno_map::~argno_map): New.
> (argno_map::operator []): New.
> (argno_map::length): New.
> (expand_simd_clones): New.
> (create_tmp_simd_array): New.
> * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New.
> * tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document.
> * tree-pass.h (make_pass_omp_simd_clone): New.
> * passes.def (pass_omp_simd_clone): New.
> * target.def: Define new hook prefix "TARGET_CILKPLUS_".
> (default_vecsize_mangle): New.
> (vecsize_for_mangle): New.
> * doc/tm.texi.in: Add placeholder for
> TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and
> TARGET_CILKPLUS_VECSIZE_FOR_MANGLE.
> * tree-sra.c (sra_ipa_modify_expr): Remove static modifier.
> (ipa_sra_modify_function_body): Same.
> * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define.
> * doc/tm.texi: Regenerate.
> * config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New.
> (ix86_cilkplus_vecsize_for_mangle): New.
> (TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE): New.
> (TARGET_CILKPLUS_VECSIZE_FOR_MANGLE): New.
>
...
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index c38ba82..faae080 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -446,6 +446,13 @@ determine_versionability (struct cgraph_node *node)
> reason = "not a tree_versionable_function";
> else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
> reason = "insufficient body availability";
> + else if (node->has_simd_clones)
> + {
> + /* Ideally we should clone the SIMD clones themselves and create
> + vector copies of them, so IPA-cp and SIMD clones can happily
> + coexist, but that may not be worth the effort. */
> + reason = "function has SIMD clones";
> + }
Lets hope we will eventually fix this in some followup :-)
>
> if (reason && dump_file && !node->symbol.alias && !node->thunk.thunk_p)
> fprintf (dump_file, "Function %s/%i is not versionable, reason: %s.\n",
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 2fbc9d4..0c20dc6 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -3361,24 +3361,18 @@ void
> ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec
> adjustments,
> const char *synth_parm_prefix)
> {
> - vec<tree> oparms, otypes;
> - tree orig_type, new_type = NULL;
> - tree old_arg_types, t, new_arg_types = NULL;
> - tree parm, *link = &DECL_ARGUMENTS (fndecl);
> - int i, len = adjustments.length ();
> - tree new_reversed = NULL;
> - bool care_for_types, last_parm_void;
> -
> if (!synth_parm_prefix)
> synth_parm_prefix = "SYNTH";
>
> - oparms = ipa_get_vector_of_formal_parms (fndecl);
> - orig_type = TREE_TYPE (fndecl);
> - old_arg_types = TYPE_ARG_TYPES (orig_type);
> + vec<tree> oparms = ipa_get_vector_of_formal_parms (fndecl);
> + tree orig_type = TREE_TYPE (fndecl);
> + tree old_arg_types = TYPE_ARG_TYPES (orig_type);
>
> /* The following test is an ugly hack, some functions simply don't have any
> arguments in their type. This is probably a bug but well... */
> - care_for_types = (old_arg_types != NULL_TREE);
> + bool care_for_types = (old_arg_types != NULL_TREE);
> + bool last_parm_void;
> + vec<tree> otypes;
> if (care_for_types)
> {
> last_parm_void = (TREE_VALUE (tree_last (old_arg_types))
> @@ -3395,13 +3389,20 @@ ipa_modify_formal_parameters (tree fndecl,
> ipa_parm_adjustment_vec adjustments,
> otypes.create (0);
> }
>
> - for (i = 0; i < len; i++)
> + int len = adjustments.length ();
> + tree *link = &DECL_ARGUMENTS (fndecl);
> + tree new_arg_types = NULL;
> + for (int i = 0; i < len; i++)
> {
> struct ipa_parm_adjustment *adj;
> gcc_assert (link);
>
> adj = &adjustments[i];
> - parm = oparms[adj->base_index];
> + tree parm;
> + if (adj->new_param)
I don't know what I was thinking when I invented copy_param and
remove_param as multiple flags rather than a single enum, I probably
wasn't thinking at all. I can change it myself as a followup if you
have more pressing tasks now. Meanwhile, can you gcc_checking_assert
that at most one flag is set at appropriate places?
> + parm = NULL;
> + else
> + parm = oparms[adj->base_index];
> adj->base = parm;
I do not think it makes sense for new parameters to have a base which
is basically the old decl. Do you have any reasons for not setting it
to NULL?
>
> if (adj->copy_param)
> @@ -3417,8 +3418,18 @@ ipa_modify_formal_parameters (tree fndecl,
> ipa_parm_adjustment_vec adjustments,
> tree new_parm;
> tree ptype;
>
> - if (adj->by_ref)
> - ptype = build_pointer_type (adj->type);
Please add gcc_checking_assert (!adj->by_ref || adj->simdlen == 0)
here...
> + if (adj->simdlen)
> + {
> + /* If we have a non-null simdlen but by_ref is true, we
> + want a vector of pointers. Build the vector of
> + pointers here, not a pointer to a vector in the
> + adj->by_ref case below. */
> + ptype = build_vector_type (adj->type, adj->simdlen);
> + }
> + else if (adj->by_ref)
...or remove this else and be able to build a pointer to the vector
if by_ref is true.
> + {
> + ptype = build_pointer_type (adj->type);
> + }
> else
> ptype = adj->type;
>
> @@ -3427,8 +3438,9 @@ ipa_modify_formal_parameters (tree fndecl,
> ipa_parm_adjustment_vec adjustments,
>
> new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE,
> ptype);
> - DECL_NAME (new_parm) = create_tmp_var_name (synth_parm_prefix);
> -
> + const char *prefix
> + = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix;
Can we perhaps get rid of synth_parm_prefix then and just have
adj->new_arg_prefix? It's not particularly important but this is
weird.
> + DECL_NAME (new_parm) = create_tmp_var_name (prefix);
> DECL_ARTIFICIAL (new_parm) = 1;
> DECL_ARG_TYPE (new_parm) = ptype;
> DECL_CONTEXT (new_parm) = fndecl;
> @@ -3436,17 +3448,20 @@ ipa_modify_formal_parameters (tree fndecl,
> ipa_parm_adjustment_vec adjustments,
> DECL_IGNORED_P (new_parm) = 1;
> layout_decl (new_parm, 0);
>
> - adj->base = parm;
> + if (adj->new_param)
> + adj->base = new_parm;
Again, shouldn't this be NULL?
> + else
> + adj->base = parm;
> adj->reduction = new_parm;
>
> *link = new_parm;
> -
> link = &DECL_CHAIN (new_parm);
> }
> }
>
> *link = NULL_TREE;
>
> + tree new_reversed = NULL;
> if (care_for_types)
> {
> new_reversed = nreverse (new_arg_types);
> @@ -3464,6 +3479,7 @@ ipa_modify_formal_parameters (tree fndecl,
> ipa_parm_adjustment_vec adjustments,
> Exception is METHOD_TYPEs must have THIS argument.
> When we are asked to remove it, we need to build new FUNCTION_TYPE
> instead. */
> + tree new_type = NULL;
> if (TREE_CODE (orig_type) != METHOD_TYPE
> || (adjustments[0].copy_param
> && adjustments[0].base_index == 0))
> @@ -3489,7 +3505,7 @@ ipa_modify_formal_parameters (tree fndecl,
> ipa_parm_adjustment_vec adjustments,
>
> /* This is a new type, not a copy of an old type. Need to reassociate
> variants. We can handle everything except the main variant lazily. */
> - t = TYPE_MAIN_VARIANT (orig_type);
> + tree t = TYPE_MAIN_VARIANT (orig_type);
> if (orig_type != t)
> {
> TYPE_MAIN_VARIANT (new_type) = t;
> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> index 48634d2..8d7d9b9 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -634,9 +634,10 @@ struct ipa_parm_adjustment
> arguments. */
> tree alias_ptr_type;
>
> - /* The new declaration when creating/replacing a parameter. Created by
> - ipa_modify_formal_parameters, useful for functions modifying the body
> - accordingly. */
> + /* The new declaration when creating/replacing a parameter. Created
> + by ipa_modify_formal_parameters, useful for functions modifying
> + the body accordingly. For brand new arguments, this is the newly
> + created argument. */
> tree reduction;
We should eventually rename this to new_decl or something, given that
this is not an SRA thing any more. But that can be done later.
>
> /* New declaration of a substitute variable that we may use to replace all
> @@ -647,15 +648,36 @@ struct ipa_parm_adjustment
> is NULL), this is going to be its nonlocalized vars value. */
> tree nonlocal_value;
>
> + /* If this is a brand new argument, this holds the prefix to be used
> + for the DECL_NAME. */
> + const char *new_arg_prefix;
> +
> /* Offset into the original parameter (for the cases when the new parameter
> is a component of an original one). */
> HOST_WIDE_INT offset;
>
> - /* Zero based index of the original parameter this one is based on. (ATM
> - there is no way to insert a new parameter out of the blue because there
> is
> - no need but if it arises the code can be easily exteded to do so.) */
> + /* Zero based index of the original parameter this one is based on. */
> int base_index;
>
> + /* If non-null, the parameter is a vector of `type' with this many
> + elements. */
> + int simdlen;
> +
> + /* This is a brand new parameter.
> +
> + For new parameters, base_index must be >= the number of
> + DECL_ARGUMENTS in the function. That is, new arguments will be
> + the last arguments in the adjusted function.
> +
> + ?? Perhaps we could redesign ipa_modify_formal_parameters() to
> + reorganize argument position, thus allowing inserting of brand
> + new arguments anywhere, but there is no use for this now.
Where does this requirement come from? At least at the moment I
cannot see why ipa_modify_formal_parameters wouldn't be able to
reorder parameters as it is? What breaks if base_index of adjustments
for new parameters has zero or a nonsensical value?
> +
> + Also, `type' should be set to the new type, `new_arg_prefix'
> + should be set to the string prefix for the new DECL_NAME, and
> + `reduction' will ultimately hold the newly created argument. */
> + unsigned new_param : 1;
> +
> /* This new parameter is an unmodified parameter at index base_index. */
> unsigned copy_param : 1;
>
> @@ -697,5 +719,7 @@ void ipa_dump_param (FILE *, struct ipa_node_params
> *info, int i);
> /* From tree-sra.c: */
> tree build_ref_for_offset (location_t, tree, HOST_WIDE_INT, tree,
> gimple_stmt_iterator *, bool);
> +bool ipa_sra_modify_function_body (ipa_parm_adjustment_vec);
> +bool sra_ipa_modify_expr (tree *, bool, ipa_parm_adjustment_vec);
>
Hm, if you can directly use these, I really think you should rename
them somehow so that their names do not contain SRA and move them to
ipa-prop.c.
Thanks for reviving this slightly moribund infrastructure and sorry
again for the delay,
Martin