Hi, this patch implements the logic to drop base alias sets (and thus also access paths) from memory operands when doing so helps merging. This is done by extending func_checker::compare_operand to record mismatched pairs and then processing this in sem_function::equals_private when comparsion suceeds.
This is controlled by -fipa-icf-alias-sets The patch drops too early, so we may end up processing function twice. Also if merging is not performed we lose code quality for no win (this is rare case). My original plan was to remember the mismatched parameter and apply them only after merging decisions are finished, but I was not sure how to do that in ipa-icf. In particular we need to ensure transitivity. In particular if function foo is merged to bar, we also need to be sure that we dropped base alias setsin functions tht are called by bar even if they themselves are not merged. Martin, is there easy way to implement this on top of current ICF? Patch improves ICF code size savings on cc1plus to 1.3% compared to 0.6% before patch and 3% with -fno-strict-aliasing. 6802 functions are merged and 3975 base alias sets are dropped, 3642 originating from hash-table.h. memory stats are: ipa lto gimple in : 1.01 ( 3%) 0.47 ( 12%) 1.56 ( 4%) 228M ( 14%) tree operand scan : 0.40 ( 1%) 0.16 ( 4%) 0.49 ( 1%) 39M ( 2%) ipa icf : 1.96 ( 6%) 0.20 ( 5%) 2.14 ( 6%) 12M ( 1%) TOTAL : 31.99 3.99 36.07 1632M compared to --disable-ipa-icf: TOTAL : 26.52 2.44 29.21 1354M To recover more of -fno-strict-alias-analysis we could have -fipa-icf-alias-sets 4-state -fipa-icf-alias-sets=0 do not drop any TBAA -fipa-icf-alias-sets=1 drop base alias sets -fipa-icf-alias-sets=2 drop pointer ref alias sets -fipa-icf-alias-sets=3 also drop ref alias sets to 0 for completely incompatible types. On GCC most common reason is now diference in pointer types. so =2 should get the 3% of code size. Bootstrapped/regtested x86_64-linux. OK for mainline? I would be also fine with making this wait for next stage1 if that looks too intrusive (with part 3 of series at least icf is not consuming a lot of memory for nothing), but there ought to be again very nice savings on libreoffice which I think had double-digit saving for icf (13% if I recall correctly). I am busy tomorrow but will start looking into firefox, clang and libreoffice builds again on wednesday. Honza * invoke.texi (-fipa-icf-alias-sets): Document. * common.opt (ipa-icf-alias-sets): New flag. * ipa-icf-gimple.c (func_checker::func_checker): Initialize m_compare_base_alias_sets. (func_checker::hash_operand): Pass m_compare_base_alias_sets to hash_ao_ref. (func_checker::compare_operand): If asked to record mismatched base alias sets. * ipa-icf-gimple.h (func_checker::m_compare_base_alias_sets): New variable. (func_checker::func_checker): Update prototype. (func_checker::m_mismatched_base_alias_set): New variable. * ipa-icf.c: Include ipa-modref-tree.h and ipa-modref.h (drop_base_alias_sets): New function. (sem_function::equals_private): Use it. (sem_function::hash_stmt): Only record base alias sets if asked to. * ipa-modref.c (ipa_modref_analyze): New function. * ipa-modref.h (ipa_modref_analyze): Declare. * tree-ssa-alias-compare.h (ao_compare::hash_ao_ref): Update prototype. * tree-ssa-alias.c (ao_compare::compare_ao_refs): Add base_alias_set argument. diff --git a/gcc/common.opt b/gcc/common.opt index 9552cebe0d6..a5178bfa7e5 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1857,6 +1857,10 @@ fipa-icf-functions Common Report Var(flag_ipa_icf_functions) Optimization Perform Identical Code Folding for functions. +fipa-icf-alias-sets +Common Report Var(flag_ipa_icf_alias_sets) Init(1) Optimization +Perform Identical Code Folding even if base alias set info needs to be eliminated + fipa-icf-variables Common Report Var(flag_ipa_icf_variables) Optimization Perform Identical Code Folding for variables. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 322ec357bb6..76bb4862431 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -11043,6 +11043,13 @@ equivalences that are found only by GCC and equivalences found only by Gold. This flag is enabled by default at @option{-O2} and @option{-Os}. +@item -fipa-icf-alias-sets +@opindex fipa-icf-alias-sets +Perform Identical Code Folding of memory accesses that are not complatible +according to type based alias analysis rules. This implies removing the +type information from the accesses possibly degrading optimization. +Enabled by default. + @item -flive-patching=@var{level} @opindex flive-patching Control GCC's optimizations to produce output suitable for live-patching. diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c index ffb1baddbdb..e05d14d9c17 100644 --- a/gcc/ipa-icf-gimple.c +++ b/gcc/ipa-icf-gimple.c @@ -54,12 +54,14 @@ namespace ipa_icf_gimple { func_checker::func_checker (tree source_func_decl, tree target_func_decl, bool ignore_labels, bool tbaa, + bool compare_base_alias_sets, hash_set<symtab_node *> *ignored_source_nodes, hash_set<symtab_node *> *ignored_target_nodes) : m_source_func_decl (source_func_decl), m_target_func_decl (target_func_decl), m_ignored_source_nodes (ignored_source_nodes), m_ignored_target_nodes (ignored_target_nodes), - m_ignore_labels (ignore_labels), m_tbaa (tbaa) + m_ignore_labels (ignore_labels), m_tbaa (tbaa), + m_compare_base_alias_sets (compare_base_alias_sets) { function *source_func = DECL_STRUCT_FUNCTION (source_func_decl); function *target_func = DECL_STRUCT_FUNCTION (target_func_decl); @@ -259,7 +261,8 @@ func_checker::hash_operand (const_tree arg, inchash::hash &hstate, { ao_ref ref; ao_ref_init (&ref, const_cast <tree> (arg)); - return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa, hstate); + return hash_ao_ref (&ref, lto_streaming_expected_p (), m_tbaa, + m_compare_base_alias_sets, hstate); } else return hash_operand (arg, hstate, flags); @@ -335,18 +338,28 @@ func_checker::compare_operand (tree t1, tree t2, operand_access_type access) if (flags & SEMANTICS) return return_false_with_msg ("compare_ao_refs failed (semantic difference)"); - if (flags & BASE_ALIAS_SET) - return return_false_with_msg - ("compare_ao_refs failed (base alias set difference)"); if (flags & REF_ALIAS_SET) return return_false_with_msg ("compare_ao_refs failed (ref alias set difference)"); - if (flags & ACCESS_PATH) - return return_false_with_msg - ("compare_ao_refs failed (access path difference)"); if (flags & DEPENDENCE_CLIQUE) return return_false_with_msg ("compare_ao_refs failed (dependence clique difference)"); + if (!m_compare_base_alias_sets) + { + if (ao_ref_base_alias_ptr_type (&ref1) + != ao_ref_alias_ptr_type (&ref1)) + m_mismatched_base_alias_set.safe_push (t1); + if (ao_ref_base_alias_ptr_type (&ref2) + != ao_ref_alias_ptr_type (&ref2)) + m_mismatched_base_alias_set.safe_push (t2); + return true; + } + if (flags & BASE_ALIAS_SET) + return return_false_with_msg + ("compare_ao_refs failed (base alias set difference)"); + if (flags & ACCESS_PATH) + return return_false_with_msg + ("compare_ao_refs failed (access path difference)"); gcc_unreachable (); } else diff --git a/gcc/ipa-icf-gimple.h b/gcc/ipa-icf-gimple.h index 40f7474738e..9745fdd9278 100644 --- a/gcc/ipa-icf-gimple.h +++ b/gcc/ipa-icf-gimple.h @@ -125,7 +125,7 @@ public: func_checker (): m_source_func_decl (NULL_TREE), m_target_func_decl (NULL_TREE), m_ignored_source_nodes (NULL), m_ignored_target_nodes (NULL), - m_ignore_labels (false), m_tbaa (true) + m_ignore_labels (false), m_tbaa (true), m_compare_base_alias_sets (NULL) { m_source_ssa_names.create (0); m_target_ssa_names.create (0); @@ -140,6 +140,7 @@ public: func_checker (tree source_func_decl, tree target_func_decl, bool ignore_labels = false, bool tbaa = true, + bool compare_base_alias_sets = true, hash_set<symtab_node *> *ignored_source_nodes = NULL, hash_set<symtab_node *> *ignored_target_nodes = NULL); @@ -243,6 +244,9 @@ public: /* Return access type of a given operand. */ static operand_access_type get_operand_access_type (operand_access_type_map *map, tree); + + /* Memory operands that have conficting alias sets. */ + auto_vec<tree> m_mismatched_base_alias_set; private: /* Vector mapping source SSA names to target ones. */ vec <int> m_source_ssa_names; @@ -264,7 +268,6 @@ private: declaration comparison. */ hash_set<symtab_node *> *m_ignored_target_nodes; - /* Source to target edge map. */ hash_map <edge, edge> m_edge_map; /* Source to target declaration map. */ @@ -279,6 +282,8 @@ private: /* Flag if we should compare type based alias analysis info. */ bool m_tbaa; + /* Flag if we should compare base alias sets. */ + bool m_compare_base_alias_sets; public: /* Return true if two operands are equal. The flags fields can be used to specify OEP flags described above. */ diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c index 27eeda3a319..33fbb58ebc6 100644 --- a/gcc/ipa-icf.c +++ b/gcc/ipa-icf.c @@ -88,6 +88,8 @@ along with GCC; see the file COPYING3. If not see #include "tree-vector-builder.h" #include "symtab-thunks.h" #include "alias.h" +#include "ipa-modref-tree.h" +#include "ipa-modref.h" using namespace ipa_icf_gimple; @@ -823,6 +825,72 @@ sem_function::equals (sem_item *item, return eq; } +/* Drop base alias sets in function DECL for all operands in + MISMATCHED_BASE_ALIAS_SET_OPS. */ + +static void +drop_base_alias_sets (tree decl, + hash_set <tree> &mismatched_base_alias_set_ops) +{ + function *func = DECL_STRUCT_FUNCTION (decl); + unsigned int n = 0; + basic_block bb; + + FOR_EACH_BB_FN (bb, func) + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); + gsi_next (&gsi)) + for (unsigned i = 0; i < gimple_num_ops (gsi_stmt (gsi)); ++i) + if (mismatched_base_alias_set_ops.contains + (gimple_op (gsi_stmt (gsi), i))) + { + tree *op = gimple_op_ptr (gsi_stmt (gsi), i); + ao_ref ref; + ao_ref_init (&ref, *op); + n++; + tree ptrtype = ao_ref_alias_ptr_type (&ref); + + if (dump_enabled_p ()) + { + dump_user_location_t loc (gsi_stmt (gsi)); + dump_printf_loc (MSG_NOTE, loc, + "Dropping base alias set to enable" + " identical code folding.\n"); + } + + while (handled_component_p (*op)) + op = &TREE_OPERAND (*op, 0); + + /* Adjust MEM_REF. This will also make it view + converted so we will drop access path. */ + if (TREE_CODE (*op) == MEM_REF + || TREE_CODE (*op) == TARGET_MEM_REF) + { + TREE_OPERAND (*op, 1) + = fold_convert (ptrtype, TREE_OPERAND (*op, 1)); + } + /* If there is no MEM_REF introduce new one. */ + else + { + tree t = *op; + + *op = build2 (MEM_REF, TREE_TYPE (t), + build1 (ADDR_EXPR, ptrtype, t), + build_int_cst (ptrtype, 0)); + TREE_THIS_VOLATILE (*op) = TREE_THIS_VOLATILE (t); + } + if (flag_checking) + { + ao_ref_init (&ref, gimple_op (gsi_stmt (gsi), i)); + gcc_assert (ao_ref_alias_ptr_type (&ref) + == ao_ref_base_alias_ptr_type (&ref)); + gcc_assert (ao_ref_alias_set (&ref) + == ao_ref_base_alias_set (&ref)); + } + } + if (n) + ipa_modref_analyze (cgraph_node::get (decl)); +} + /* Processes function equality comparison. */ bool @@ -850,6 +918,8 @@ sem_function::equals_private (sem_item *item) false, opt_for_fn (m_compared_func->decl, flag_strict_aliasing), + !opt_for_fn (m_compared_func->decl, + flag_ipa_icf_alias_sets), &refs_set, &m_compared_func->refs_set); arg1 = DECL_ARGUMENTS (decl); @@ -920,6 +990,32 @@ sem_function::equals_private (sem_item *item) if (!compare_phi_node (bb_sorted[i]->bb, m_compared_func->bb_sorted[i]->bb)) return return_false_with_msg ("PHI node comparison returns false"); + /* If necessary walk alias body and drop mase alias sets. */ + if (result && m_checker->m_mismatched_base_alias_set.length ()) + { + hash_set <tree> mismatched_base_alias_set_ops; + /* Record mismatched operands. */ + for (unsigned i = 0; i < m_checker->m_mismatched_base_alias_set.length (); i++) + { + tree op = m_checker->m_mismatched_base_alias_set [i]; + mismatched_base_alias_set_ops.add (op); + } + /* Drop in both bodies; it is not clear what body will win in the + merging and thus process both. + + TODO: It is possible that merging will fail. + In that case we lose alias info for no win. This is bit hard to track + because proving two function equal also enables merging of callers + of the functions. + + We can also keep the body with alias set around for inlining, + but again this requires quite careful tracking of the transitive + closure of all merges. */ + drop_base_alias_sets (decl, mismatched_base_alias_set_ops); + drop_base_alias_sets (m_compared_func->decl, + mismatched_base_alias_set_ops); + } + return result; } @@ -1461,9 +1557,12 @@ sem_function::hash_stmt (gimple *stmt, inchash::hash &hstate) tree t = ao_ref_alias_ptr_type (&ref); if (variably_modified_type_p (t, NULL_TREE)) memory_access_types.safe_push (t); - t = ao_ref_base_alias_ptr_type (&ref); - if (variably_modified_type_p (t, NULL_TREE)) - memory_access_types.safe_push (t); + if (!flag_ipa_icf_alias_sets) + { + t = ao_ref_base_alias_ptr_type (&ref); + if (variably_modified_type_p (t, NULL_TREE)) + memory_access_types.safe_push (t); + } } } /* Consider nocf_check attribute in hash as it affects code diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index 155bf59fc53..b09a2fcb7a2 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -2117,6 +2117,30 @@ modref_summaries_lto::insert (struct cgraph_node *node, modref_summary_lto *) pop_cfun (); } +/* Recompute summary after body was updated. */ + +void +ipa_modref_analyze (struct cgraph_node *node) +{ + if (optimization_summaries + && optimization_summaries->get (node)) + { + optimization_summaries->remove (node); + optimization_summaries->insert + (node, optimization_summaries->get_create (node)); + } + if (summaries && summaries->get (node)) + { + summaries->remove (node); + summaries->insert (node, summaries->get_create (node)); + } + if (summaries_lto && summaries_lto->get (node)) + { + summaries_lto->remove (node); + summaries_lto->insert (node, summaries_lto->get_create (node)); + } +} + /* Called when new clone is inserted to callgraph late. */ void diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h index 7decabd5744..982c620468a 100644 --- a/gcc/ipa-modref.h +++ b/gcc/ipa-modref.h @@ -41,5 +41,6 @@ struct GTY(()) modref_summary modref_summary *get_modref_function_summary (cgraph_node *func); void ipa_modref_c_finalize (); void ipa_merge_modref_summary_after_inlining (cgraph_edge *e); +void ipa_modref_analyze (cgraph_node *node); #endif diff --git a/gcc/tree-ssa-alias-compare.h b/gcc/tree-ssa-alias-compare.h index 0e8409a7565..13bfa3381b5 100644 --- a/gcc/tree-ssa-alias-compare.h +++ b/gcc/tree-ssa-alias-compare.h @@ -37,7 +37,7 @@ class ao_compare : public operand_compare int compare_ao_refs (ao_ref *ref1, ao_ref *ref2, bool lto_streaming_safe, bool tbaa); void hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa, - inchash::hash &hstate); + bool base_alias_set, inchash::hash &hstate); }; #endif diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c index 5ebbb087285..52dd0055905 100644 --- a/gcc/tree-ssa-alias.c +++ b/gcc/tree-ssa-alias.c @@ -4169,11 +4169,14 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2, return flags; } -/* Hash REF to HSTATE. If LTO_STREAMING_SAFE do not use alias sets +/* Hash REF to HSTATE. If TBAA is false do not hash info relevant + for type based alias anslysi. If BASE_ALIS_SET is false + do not hash base alias set and the access path. + If LTO_STREAMING_SAFE do not use alias sets and canonical types. */ void ao_compare::hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa, - inchash::hash &hstate) + bool base_alias_set, inchash::hash &hstate) { tree base = ao_ref_base (ref); tree tbase = base; @@ -4209,6 +4212,7 @@ ao_compare::hash_ao_ref (ao_ref *ref, bool lto_streaming_safe, bool tbaa, if (!lto_streaming_safe && tbaa) { hstate.add_int (ao_ref_alias_set (ref)); - hstate.add_int (ao_ref_base_alias_set (ref)); + if (base_alias_set) + hstate.add_int (ao_ref_base_alias_set (ref)); } }