Hi, On Thu, Oct 31, 2013 at 10:04:45PM -0500, Aldy Hernandez wrote: > Hello gentlemen. I'm CCing all of you, because each of you can > provide valuable feedback to various parts of the compiler which I > touch. I have sprinkled love notes with your names throughout the > post :).
sorry it took me so long, for various reasons I out of my control I've accumulated quite a backlog of email and tasks last week and it took me a lot of time to chew through it all. ... > [Martin] I have adapted the ipa_parm_adjustment infrastructure to > allow adding new arguments out of the blue like you mentioned was > missing in ipa-prop.h. I have also added support for creating > vectors of arguments. Could you take a look at my changes to > ipa-prop.[ch]? Sure, though I have only looked at ipa-* and tree-sra.c stuff. I do not have any real objections but would suggest a few amendments. I am glad this is becoming a useful infrastructure rather than just a part of IPA-SRA. Note that while ipa_combine_adjustments is not used from anywhere and thus probably buggy anyway, it should in theory be able to process new_param adjustments too. Can you please at least put a "not implemented" assert there? (The reason is that the plan still is to replace args_to_skip bitmaps in cgraphclones.c by adjustments one day and we do need to combine clones.) > > [Martin] I need to add new arguments in the case of inbranch clones, > which add an additional vector with a mask as the last argument: > For the following: > > #pragma omp declare simd simdlen(4) inbranch > int foo (int a) > { > return a + 1234; > } > > ...we would generate a clone with: > > vector(4) int > foo.simdclone.0 (vector(4) int simd.4, vector(4) int mask.5) > > I thought it best to enhance ipa_modify_formal_parameters() and > associated machinery than to add the new argument ad-hoc. We > already have enough ways of doing tree and cgraph versioning in the > compiler ;-). > ... > gcc/ChangeLog.elementals > > * Makefile.in (omp-low.o): Depend on PRETTY_PRINT_H and IPA_PROP_H. > * tree-vect-stmts.c (vectorizable_call): Allow > 3 arguments when > a SIMD clone may be available. > (vectorizable_function): Use SIMD clone if available. > * ipa-cp.c (determine_versionability): Nodes with SIMD clones are > not versionable. > * ggc.h (ggc_alloc_cleared_simd_clone_stat): New. > * cgraph.h (enum linear_stride_type): New. > (struct simd_clone_arg): New. > (struct simd_clone): New. > (struct cgraph_node): Add simdclone and simdclone_of fields. > (get_simd_clone): Protoize. > * cgraph.c (get_simd_clone): New. > Add `has_simd_clones' field. > * ipa-cp.c (determine_versionability): Disallow functions with > simd clones. (This looks like a repeated entry.) > * ipa-prop.h (ipa_sra_modify_function_body): Protoize. > (sra_ipa_modify_expr): Same. > (struct ipa_parm_adjustment): Add new_arg_prefix and new_param > fields. Document their use. > * ipa-prop.c (ipa_modify_formal_parameters): Handle creating brand > new parameters and minor cleanups. > * omp-low.c: Add new pass_omp_simd_clone support code. > (make_pass_omp_simd_clone): New. > (pass_data_omp_simd_clone): Declare. > (class pass_omp_simd_clone): Declare. > (vecsize_mangle): New. > (ipa_omp_simd_clone): New. > (simd_clone_clauses_extract): New. > (simd_clone_compute_base_data_type): New. > (simd_clone_compute_vecsize_and_simdlen): New. > (simd_clone_create): New. > (simd_clone_adjust_return_type): New. > (simd_clone_adjust_return_types): New. > (simd_clone_adjust): New. > (simd_clone_init_simd_arrays): New. > (ipa_simd_modify_function_body): New. > (simd_clone_mangle): New. > (simd_clone_struct_alloc): New. > (simd_clone_struct_copy): New. > (class argno_map): New. > (argno_map::argno_map(tree)): New. > (argno_map::~argno_map): New. > (argno_map::operator []): New. > (argno_map::length): New. > (expand_simd_clones): New. > (create_tmp_simd_array): New. > * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): New. > * tree-core.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Document. > * tree-pass.h (make_pass_omp_simd_clone): New. > * passes.def (pass_omp_simd_clone): New. > * target.def: Define new hook prefix "TARGET_CILKPLUS_". > (default_vecsize_mangle): New. > (vecsize_for_mangle): New. > * doc/tm.texi.in: Add placeholder for > TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE and > TARGET_CILKPLUS_VECSIZE_FOR_MANGLE. > * tree-sra.c (sra_ipa_modify_expr): Remove static modifier. > (ipa_sra_modify_function_body): Same. > * tree.h (OMP_CLAUSE_LINEAR_VARIABLE_STRIDE): Define. > * doc/tm.texi: Regenerate. > * config/i386/i386.c (ix86_cilkplus_default_vecsize_mangle): New. > (ix86_cilkplus_vecsize_for_mangle): New. > (TARGET_CILKPLUS_DEFAULT_VECSIZE_MANGLE): New. > (TARGET_CILKPLUS_VECSIZE_FOR_MANGLE): New. > ... > diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c > index c38ba82..faae080 100644 > --- a/gcc/ipa-cp.c > +++ b/gcc/ipa-cp.c > @@ -446,6 +446,13 @@ determine_versionability (struct cgraph_node *node) > reason = "not a tree_versionable_function"; > else if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE) > reason = "insufficient body availability"; > + else if (node->has_simd_clones) > + { > + /* Ideally we should clone the SIMD clones themselves and create > + vector copies of them, so IPA-cp and SIMD clones can happily > + coexist, but that may not be worth the effort. */ > + reason = "function has SIMD clones"; > + } Lets hope we will eventually fix this in some followup :-) > > if (reason && dump_file && !node->symbol.alias && !node->thunk.thunk_p) > fprintf (dump_file, "Function %s/%i is not versionable, reason: %s.\n", > diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c > index 2fbc9d4..0c20dc6 100644 > --- a/gcc/ipa-prop.c > +++ b/gcc/ipa-prop.c > @@ -3361,24 +3361,18 @@ void > ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec > adjustments, > const char *synth_parm_prefix) > { > - vec<tree> oparms, otypes; > - tree orig_type, new_type = NULL; > - tree old_arg_types, t, new_arg_types = NULL; > - tree parm, *link = &DECL_ARGUMENTS (fndecl); > - int i, len = adjustments.length (); > - tree new_reversed = NULL; > - bool care_for_types, last_parm_void; > - > if (!synth_parm_prefix) > synth_parm_prefix = "SYNTH"; > > - oparms = ipa_get_vector_of_formal_parms (fndecl); > - orig_type = TREE_TYPE (fndecl); > - old_arg_types = TYPE_ARG_TYPES (orig_type); > + vec<tree> oparms = ipa_get_vector_of_formal_parms (fndecl); > + tree orig_type = TREE_TYPE (fndecl); > + tree old_arg_types = TYPE_ARG_TYPES (orig_type); > > /* The following test is an ugly hack, some functions simply don't have any > arguments in their type. This is probably a bug but well... */ > - care_for_types = (old_arg_types != NULL_TREE); > + bool care_for_types = (old_arg_types != NULL_TREE); > + bool last_parm_void; > + vec<tree> otypes; > if (care_for_types) > { > last_parm_void = (TREE_VALUE (tree_last (old_arg_types)) > @@ -3395,13 +3389,20 @@ ipa_modify_formal_parameters (tree fndecl, > ipa_parm_adjustment_vec adjustments, > otypes.create (0); > } > > - for (i = 0; i < len; i++) > + int len = adjustments.length (); > + tree *link = &DECL_ARGUMENTS (fndecl); > + tree new_arg_types = NULL; > + for (int i = 0; i < len; i++) > { > struct ipa_parm_adjustment *adj; > gcc_assert (link); > > adj = &adjustments[i]; > - parm = oparms[adj->base_index]; > + tree parm; > + if (adj->new_param) I don't know what I was thinking when I invented copy_param and remove_param as multiple flags rather than a single enum, I probably wasn't thinking at all. I can change it myself as a followup if you have more pressing tasks now. Meanwhile, can you gcc_checking_assert that at most one flag is set at appropriate places? > + parm = NULL; > + else > + parm = oparms[adj->base_index]; > adj->base = parm; I do not think it makes sense for new parameters to have a base which is basically the old decl. Do you have any reasons for not setting it to NULL? > > if (adj->copy_param) > @@ -3417,8 +3418,18 @@ ipa_modify_formal_parameters (tree fndecl, > ipa_parm_adjustment_vec adjustments, > tree new_parm; > tree ptype; > > - if (adj->by_ref) > - ptype = build_pointer_type (adj->type); Please add gcc_checking_assert (!adj->by_ref || adj->simdlen == 0) here... > + if (adj->simdlen) > + { > + /* If we have a non-null simdlen but by_ref is true, we > + want a vector of pointers. Build the vector of > + pointers here, not a pointer to a vector in the > + adj->by_ref case below. */ > + ptype = build_vector_type (adj->type, adj->simdlen); > + } > + else if (adj->by_ref) ...or remove this else and be able to build a pointer to the vector if by_ref is true. > + { > + ptype = build_pointer_type (adj->type); > + } > else > ptype = adj->type; > > @@ -3427,8 +3438,9 @@ ipa_modify_formal_parameters (tree fndecl, > ipa_parm_adjustment_vec adjustments, > > new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE, > ptype); > - DECL_NAME (new_parm) = create_tmp_var_name (synth_parm_prefix); > - > + const char *prefix > + = adj->new_param ? adj->new_arg_prefix : synth_parm_prefix; Can we perhaps get rid of synth_parm_prefix then and just have adj->new_arg_prefix? It's not particularly important but this is weird. > + DECL_NAME (new_parm) = create_tmp_var_name (prefix); > DECL_ARTIFICIAL (new_parm) = 1; > DECL_ARG_TYPE (new_parm) = ptype; > DECL_CONTEXT (new_parm) = fndecl; > @@ -3436,17 +3448,20 @@ ipa_modify_formal_parameters (tree fndecl, > ipa_parm_adjustment_vec adjustments, > DECL_IGNORED_P (new_parm) = 1; > layout_decl (new_parm, 0); > > - adj->base = parm; > + if (adj->new_param) > + adj->base = new_parm; Again, shouldn't this be NULL? > + else > + adj->base = parm; > adj->reduction = new_parm; > > *link = new_parm; > - > link = &DECL_CHAIN (new_parm); > } > } > > *link = NULL_TREE; > > + tree new_reversed = NULL; > if (care_for_types) > { > new_reversed = nreverse (new_arg_types); > @@ -3464,6 +3479,7 @@ ipa_modify_formal_parameters (tree fndecl, > ipa_parm_adjustment_vec adjustments, > Exception is METHOD_TYPEs must have THIS argument. > When we are asked to remove it, we need to build new FUNCTION_TYPE > instead. */ > + tree new_type = NULL; > if (TREE_CODE (orig_type) != METHOD_TYPE > || (adjustments[0].copy_param > && adjustments[0].base_index == 0)) > @@ -3489,7 +3505,7 @@ ipa_modify_formal_parameters (tree fndecl, > ipa_parm_adjustment_vec adjustments, > > /* This is a new type, not a copy of an old type. Need to reassociate > variants. We can handle everything except the main variant lazily. */ > - t = TYPE_MAIN_VARIANT (orig_type); > + tree t = TYPE_MAIN_VARIANT (orig_type); > if (orig_type != t) > { > TYPE_MAIN_VARIANT (new_type) = t; > diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h > index 48634d2..8d7d9b9 100644 > --- a/gcc/ipa-prop.h > +++ b/gcc/ipa-prop.h > @@ -634,9 +634,10 @@ struct ipa_parm_adjustment > arguments. */ > tree alias_ptr_type; > > - /* The new declaration when creating/replacing a parameter. Created by > - ipa_modify_formal_parameters, useful for functions modifying the body > - accordingly. */ > + /* The new declaration when creating/replacing a parameter. Created > + by ipa_modify_formal_parameters, useful for functions modifying > + the body accordingly. For brand new arguments, this is the newly > + created argument. */ > tree reduction; We should eventually rename this to new_decl or something, given that this is not an SRA thing any more. But that can be done later. > > /* New declaration of a substitute variable that we may use to replace all > @@ -647,15 +648,36 @@ struct ipa_parm_adjustment > is NULL), this is going to be its nonlocalized vars value. */ > tree nonlocal_value; > > + /* If this is a brand new argument, this holds the prefix to be used > + for the DECL_NAME. */ > + const char *new_arg_prefix; > + > /* Offset into the original parameter (for the cases when the new parameter > is a component of an original one). */ > HOST_WIDE_INT offset; > > - /* Zero based index of the original parameter this one is based on. (ATM > - there is no way to insert a new parameter out of the blue because there > is > - no need but if it arises the code can be easily exteded to do so.) */ > + /* Zero based index of the original parameter this one is based on. */ > int base_index; > > + /* If non-null, the parameter is a vector of `type' with this many > + elements. */ > + int simdlen; > + > + /* This is a brand new parameter. > + > + For new parameters, base_index must be >= the number of > + DECL_ARGUMENTS in the function. That is, new arguments will be > + the last arguments in the adjusted function. > + > + ?? Perhaps we could redesign ipa_modify_formal_parameters() to > + reorganize argument position, thus allowing inserting of brand > + new arguments anywhere, but there is no use for this now. Where does this requirement come from? At least at the moment I cannot see why ipa_modify_formal_parameters wouldn't be able to reorder parameters as it is? What breaks if base_index of adjustments for new parameters has zero or a nonsensical value? > + > + Also, `type' should be set to the new type, `new_arg_prefix' > + should be set to the string prefix for the new DECL_NAME, and > + `reduction' will ultimately hold the newly created argument. */ > + unsigned new_param : 1; > + > /* This new parameter is an unmodified parameter at index base_index. */ > unsigned copy_param : 1; > > @@ -697,5 +719,7 @@ void ipa_dump_param (FILE *, struct ipa_node_params > *info, int i); > /* From tree-sra.c: */ > tree build_ref_for_offset (location_t, tree, HOST_WIDE_INT, tree, > gimple_stmt_iterator *, bool); > +bool ipa_sra_modify_function_body (ipa_parm_adjustment_vec); > +bool sra_ipa_modify_expr (tree *, bool, ipa_parm_adjustment_vec); > Hm, if you can directly use these, I really think you should rename them somehow so that their names do not contain SRA and move them to ipa-prop.c. Thanks for reviving this slightly moribund infrastructure and sorry again for the delay, Martin