Hi,

we discussed this patch briefly two weeks ago but did not reach a conclusion, and since I wanted to avoid ICF fixes slipping another release, I had a chance to return to it only now.

The main limitation of modref is that it does not track anything in memory. This is intentional: I wanted the initial implementation to be cheap. However, it also makes it very limited when it comes to detecting noescape, especially because it is paranoid about what memory accesses may be used to copy (bits of) pointers.
Consider:

void
test (int *a, int *b)
{
  *a = *b;
}

Here both parameters are noescape. However we get:

Analyzing flags of ssa name: a_4(D)
Analyzing stmt:*a_4(D) = _1;
current flags of a_4(D) direct noescape
flags of ssa name a_4(D) direct noescape
Analyzing flags of ssa name: b_3(D)
Analyzing stmt:_1 = *b_3(D);
Analyzing flags of ssa name: _1
Analyzing stmt:*a_4(D) = _1;
ssa name saved to memory
current flags of _1
flags of ssa name _1
current flags of b_3(D)

So for a we get the flags right, but for b we see a memory write and stop tracking completely, assuming that the memory may cause an indirect escape.

This patch adds EAF_NODIRECTESCAPE, a weaker variant of EAF_NOESCAPE where we only know that the pointer itself does not escape, but the memory pointed to may. This is a lot more reliable to auto-detect than EAF_NOESCAPE and still enables additional optimization. With the patch we get the nodirectescape flag for b, which in practice enables similar optimization as EAF_NOESCAPE for arrays of integers that point nowhere :)

The patch is very effective on cc1plus, changing:

Alias oracle query stats:
  refs_may_alias_p: 65974098 disambiguations, 75491744 queries
  ref_maybe_used_by_call_p: 239316 disambiguations, 66783365 queries
  call_may_clobber_ref_p: 109214 disambiguations, 114381 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 37014 queries
  nonoverlapping_refs_since_match_p: 26917 disambiguations, 56947 must overlaps, 84634 queries
  aliasing_component_refs_p: 63593 disambiguations, 2026642 queries
  TBAA oracle: 25059985 disambiguations 58735771 queries
               12279288 are in alias set 0
               10228328 queries asked about the same object
               124 queries asked about the same alias set
               0 access volatile
               9551512 are dependent in the DAG
               1616534 are aritificially in conflict with void *

Modref stats:
  modref use: 13629 disambiguations, 362550 queries
  modref clobber: 1603074 disambiguations, 12633002 queries
  4128405 tbaa queries (0.326795 per modref query)
  678007 base compares (0.053670 per modref query)

PTA query stats:
  pt_solution_includes: 1447025 disambiguations, 13421154 queries
  pt_solutions_intersect: 1014606 disambiguations, 12743264 queries

to:

Alias oracle query stats:
  refs_may_alias_p: 76994196 disambiguations, 86322026 queries
  ref_maybe_used_by_call_p: 398635 disambiguations, 77664397 queries
  call_may_clobber_ref_p: 248995 disambiguations, 252747 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 36357 queries
  nonoverlapping_refs_since_match_p: 26973 disambiguations, 56944 must overlaps, 84688 queries
  aliasing_component_refs_p: 63472 disambiguations, 2013517 queries
  TBAA oracle: 25278106 disambiguations 59186830 queries
               12480044 are in alias set 0
               10260217 queries asked about the same object
               121 queries asked about the same alias set
               0 access volatile
               9550119 are dependent in the DAG
               1618223 are aritificially in conflict with void *

Modref stats:
  modref use: 13909 disambiguations, 370418 queries
  modref clobber: 1643513 disambiguations, 18036536 queries
  4197648 tbaa queries (0.232730 per modref query)
  727893 base compares (0.040357 per modref query)

PTA query stats:
  pt_solution_includes: 11463123 disambiguations, 22989602 queries
  pt_solutions_intersect: 1238048 disambiguations, 12893812 queries

This is 1447025->11463123 PTA disambiguations, so about 7.9 times as many (there is also an increase in the number of queries).

For tramp3d the change is from:

Alias oracle query stats:
  refs_may_alias_p: 2394105 disambiguations, 2675969 queries
  ref_maybe_used_by_call_p: 11048 disambiguations, 2428198 queries
  call_may_clobber_ref_p: 922 disambiguations, 932 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 4457 queries
  nonoverlapping_refs_since_match_p: 329 disambiguations, 10298 must overlaps, 10714 queries
  aliasing_component_refs_p: 956 disambiguations, 36074 queries
  TBAA oracle: 1046044 disambiguations 1942025 queries
               169583 are in alias set 0
               507146 queries asked about the same object
               0 queries asked about the same alias set
               0 access volatile
               218937 are dependent in the DAG
               315 are aritificially in conflict with void *

Modref stats:
  modref use: 1324 disambiguations, 5833 queries
  modref clobber: 36497 disambiguations, 110898 queries
  144060 tbaa queries (1.299032 per modref query)
  22514 base compares (0.203015 per modref query)

PTA query stats:
  pt_solution_includes: 401457 disambiguations, 609088 queries
  pt_solutions_intersect: 138800 disambiguations, 417703 queries

to:

Alias oracle query stats:
  refs_may_alias_p: 2667557 disambiguations, 2933558 queries
  ref_maybe_used_by_call_p: 15330 disambiguations, 2699662 queries
  call_may_clobber_ref_p: 1707 disambiguations, 1717 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 3592 queries
  nonoverlapping_refs_since_match_p: 303 disambiguations, 9079 must overlaps, 9413 queries
  aliasing_component_refs_p: 825 disambiguations, 31791 queries
  TBAA oracle: 1061409 disambiguations 1975798 queries
               173463 are in alias set 0
               513959 queries asked about the same object
               0 queries asked about the same alias set
               0 access volatile
               226652 are dependent in the DAG
               315 are aritificially in conflict with void *

Modref stats:
  modref use: 1401 disambiguations, 8098 queries
  modref clobber: 38706 disambiguations, 348934 queries
  154297 tbaa queries (0.442195 per modref query)
  27254 base compares (0.078106 per modref query)

PTA query stats:
  pt_solution_includes: 549517 disambiguations, 714911 queries
  pt_solutions_intersect: 146941 disambiguations, 420459 queries

So 37% more PTA disambiguations and 11% more overall.

lto-bootstrapped/regtested x86_64-linux, and I also ran SPEC benchmarks:
https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?younger_in_days=14&older_in_days=0&min_percentage_change=0.02&revisions=e4360e452b4c6cd56d4e21663703e920763413f5%2C94b9afeac33475566c27cf9458e06480ee06b8e5&include_user_branches=on
https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?younger_in_days=14&older_in_days=0&min_percentage_change=0.02&revisions=e4360e452b4c6cd56d4e21663703e920763413f5%2C94b9afeac33475566c27cf9458e06480ee06b8e5&include_user_branches=on

Does it make sense, and would it still be OK for trunk? It can wait for the next stage1, but it seems to achieve quite a nice improvement with a relatively easy patch.

	* gimple.c (gimple_call_arg_flags): Also imply EAF_NODIRECTESCAPE.
	* tree-core.h (EAF_NODIRECTESCAPE): New flag.
	* tree-ssa-structalias.c (make_indirect_escape_constraint): New
	function.
	(handle_rhs_call): Handle EAF_NODIRECTESCAPE.
	* ipa-modref.c (dump_eaf_flags): Print EAF_NODIRECTESCAPE.
	(deref_flags): Dereference is always EAF_NODIRECTESCAPE.
	(modref_lattice::init): Also set EAF_NODIRECTESCAPE.
	(analyze_ssa_name_flags): Pure functions do not affect
	EAF_NODIRECTESCAPE.
	(analyze_parms): Likewise.
	(ipa_merge_modref_summary_after_inlining): Likewise.
	(modref_merge_call_site_flags): Likewise.
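
To make the difference between EAF_NOESCAPE and EAF_NODIRECTESCAPE described above more concrete, here is a small illustrative C sketch. It is not part of the patch, and all function and variable names in it are hypothetical. The first function is the integer copy from the example above; the second copies a pointer through memory, so the argument b itself is only dereferenced while the memory it refers to may become reachable elsewhere; the third stores the argument itself, which is a direct escape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical examples; none of these names come from the patch.  */

/* B is only dereferenced and the loaded value is a plain int, so
   semantically nothing reachable from B can escape here.  */
void
copy_ints (int *a, int *b)
{
  *a = *b;
}

/* B itself is still only dereferenced (no direct escape), but the
   pointer loaded from *B is stored to *A, so memory reachable from B
   may escape through A.  */
void
copy_ptrs (int **a, int **b)
{
  *a = *b;
}

/* Here the argument itself is stored to a global, so B escapes
   directly.  */
int **leaked;

void
leak_ptr (int **b)
{
  leaked = b;
}

/* Small self-check exercising the three cases.  */
void
escape_demo (void)
{
  int x = 1, y = 2;
  int *p = NULL, *q = &y;

  copy_ints (&x, &y);
  assert (x == 2);

  copy_ptrs (&p, &q);
  assert (p == &y);	/* the pointer stored in Q is now also in P */

  leak_ptr (&q);
  assert (leaked == &q);	/* the address of Q itself was published */
}
```

As far as I understand the analysis, copy_ints and copy_ptrs look the same to modref, since it does not track what is stored in memory; that is why b gets only nodirectescape in both cases, while leak_ptr would get neither flag.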

diff --git a/gcc/gimple.c b/gcc/gimple.c
index e3e508daf2f..e8246b72cc9 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1543,7 +1543,7 @@ gimple_call_arg_flags (const gcall *stmt, unsigned arg)
       if (fnspec.arg_direct_p (arg))
         flags |= EAF_DIRECT;
       if (fnspec.arg_noescape_p (arg))
-        flags |= EAF_NOESCAPE;
+        flags |= EAF_NOESCAPE | EAF_NODIRECTESCAPE;
       if (fnspec.arg_readonly_p (arg))
         flags |= EAF_NOCLOBBER;
     }
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 313a6af2253..e457b917b98 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -110,6 +110,10 @@ struct die_struct;
 /* Nonzero if the argument is not used by the function.  */
 #define EAF_UNUSED		(1 << 3)
 
+/* Nonzero if the argument itself does not escape but memory
+   referenced by it can escape.  */
+#define EAF_NODIRECTESCAPE	(1 << 4)
+
 /* Call return flags.  */
 /* Mask for the argument number that is returned.  Lower two bits of
    the return flags, encodes argument slots zero to three.  */
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index a4832b75436..9f4de96d544 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -3851,6 +3851,23 @@ make_escape_constraint (tree op)
   make_constraint_to (escaped_id, op);
 }
 
+/* Make constraint necessary to make all indirect references
+   from VI escape.  */
+
+static void
+make_indirect_escape_constraint (varinfo_t vi)
+{
+  struct constraint_expr lhs, rhs;
+  /* escaped = *(VAR + UNKNOWN);  */
+  lhs.type = SCALAR;
+  lhs.var = escaped_id;
+  lhs.offset = 0;
+  rhs.type = DEREF;
+  rhs.var = vi->id;
+  rhs.offset = UNKNOWN_OFFSET;
+  process_constraint (new_constraint (lhs, rhs));
+}
+
 /* Add constraints to that the solution of VI is transitively closed.  */
 
 static void
@@ -4026,7 +4043,7 @@ handle_rhs_call (gcall *stmt, vec<ce_s> *results)
          set.  The argument would still get clobbered through the escape
          solution.  */
       if ((flags & EAF_NOCLOBBER)
-           && (flags & EAF_NOESCAPE))
+           && (flags & (EAF_NOESCAPE | EAF_NODIRECTESCAPE)))
         {
           varinfo_t uses = get_call_use_vi (stmt);
           varinfo_t tem = new_var_info (NULL_TREE, "callarg", true);
@@ -4036,9 +4053,11 @@ handle_rhs_call (gcall *stmt, vec<ce_s> *results)
           if (!(flags & EAF_DIRECT))
             make_transitive_closure_constraints (tem);
           make_copy_constraint (uses, tem->id);
+          if (!(flags & (EAF_NOESCAPE | EAF_DIRECT)))
+            make_indirect_escape_constraint (tem);
           returns_uses = true;
         }
-      else if (flags & EAF_NOESCAPE)
+      else if (flags & (EAF_NOESCAPE | EAF_NODIRECTESCAPE))
         {
           struct constraint_expr lhs, rhs;
           varinfo_t uses = get_call_use_vi (stmt);
@@ -4061,6 +4080,8 @@ handle_rhs_call (gcall *stmt, vec<ce_s> *results)
           rhs.var = nonlocal_id;
           rhs.offset = 0;
           process_constraint (new_constraint (lhs, rhs));
+          if (!(flags & (EAF_NOESCAPE | EAF_DIRECT)))
+            make_indirect_escape_constraint (tem);
           returns_uses = true;
         }
       else
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index e6cb4a87b69..8305393e3ca 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -151,6 +151,8 @@ dump_eaf_flags (FILE *out, int flags, bool newline = true)
     fprintf (out, " noclobber");
   if (flags & EAF_NOESCAPE)
     fprintf (out, " noescape");
+  if (flags & EAF_NODIRECTESCAPE)
+    fprintf (out, " nodirectescape");
   if (flags & EAF_UNUSED)
     fprintf (out, " unused");
   if (newline)
@@ -1303,7 +1305,7 @@ memory_access_to (tree op, tree ssa_name)
 static int
 deref_flags (int flags, bool ignore_stores)
 {
-  int ret = 0;
+  int ret = EAF_NODIRECTESCAPE;
   if (flags & EAF_UNUSED)
     ret |= EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE;
   else
@@ -1361,7 +1363,8 @@ public:
 void
 modref_lattice::init ()
 {
-  flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED;
+  flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED
+          | EAF_NODIRECTESCAPE;
   open = true;
   known = false;
 }
@@ -1653,7 +1656,8 @@ analyze_ssa_name_flags (tree name, vec<modref_lattice> &lattice, int depth,
                   {
                     int call_flags = gimple_call_arg_flags (call, i);
                     if (ignore_stores)
-                      call_flags |= EAF_NOCLOBBER | EAF_NOESCAPE;
+                      call_flags |= EAF_NOCLOBBER | EAF_NOESCAPE
+                                    | EAF_NODIRECTESCAPE;
 
                     if (!record_ipa)
                       lattice[index].merge (call_flags);
@@ -1829,7 +1833,7 @@ analyze_parms (modref_summary *summary, modref_summary_lto *summary_lto,
       /* For pure functions we have implicit NOCLOBBER
          and NOESCAPE.  */
       if (ecf_flags & ECF_PURE)
-        flags &= ~(EAF_NOCLOBBER | EAF_NOESCAPE);
+        flags &= ~(EAF_NOCLOBBER | EAF_NOESCAPE | EAF_NODIRECTESCAPE);
 
       if (flags)
         {
@@ -3098,7 +3102,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge *edge)
               if (!ee->direct)
                 flags = deref_flags (flags, ignore_stores);
               else if (ignore_stores)
-                flags |= EAF_NOCLOBBER | EAF_NOESCAPE;
+                flags |= EAF_NOCLOBBER | EAF_NOESCAPE | EAF_NODIRECTESCAPE;
               flags |= ee->min_flags;
               to_info->arg_flags[ee->parm_index] &= flags;
               if (to_info->arg_flags[ee->parm_index])
@@ -3112,7 +3116,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge *edge)
               if (!ee->direct)
                 flags = deref_flags (flags, ignore_stores);
               else if (ignore_stores)
-                flags |= EAF_NOCLOBBER | EAF_NOESCAPE;
+                flags |= EAF_NOCLOBBER | EAF_NOESCAPE | EAF_NODIRECTESCAPE;
               flags |= ee->min_flags;
               to_info_lto->arg_flags[ee->parm_index] &= flags;
               if (to_info_lto->arg_flags[ee->parm_index])
@@ -3623,8 +3627,8 @@ modref_merge_call_site_flags (escape_summary *sum,
         }
       else if (ignore_stores)
         {
-          flags |= EAF_NOESCAPE | EAF_NOCLOBBER;
-          flags_lto |= EAF_NOESCAPE | EAF_NOCLOBBER;
+          flags |= EAF_NOESCAPE | EAF_NOCLOBBER | EAF_NODIRECTESCAPE;
+          flags_lto |= EAF_NOESCAPE | EAF_NOCLOBBER | EAF_NODIRECTESCAPE;
         }
       flags |= ee->min_flags;
       flags_lto |= ee->min_flags;
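
For readers not familiar with how the flags combine, the following standalone C sketch models the idea in a simplified way. It is only an illustration under assumptions and not the code in ipa-modref.c: the sk_ names are hypothetical, the bit values for the first three flags are assumed (only EAF_UNUSED and EAF_NODIRECTESCAPE appear in the diff), and merging is modelled as a plain bitwise AND, following the "&= flags" updates visible in the diff. It shows the optimistic start with every flag set, the rule that noescape also implies nodirectescape, and how a use that stores the loaded value to memory leaves only nodirectescape.

```c
#include <assert.h>

/* Simplified stand-ins for the EAF_* bits.  The values for the first
   three flags are an assumption made for this sketch.  */
enum
{
  SK_DIRECT = 1 << 0,
  SK_NOCLOBBER = 1 << 1,
  SK_NOESCAPE = 1 << 2,
  SK_UNUSED = 1 << 3,
  SK_NODIRECTESCAPE = 1 << 4
};

/* Optimistic start: all flags set, including the new one.  */
int
sk_init (void)
{
  return (SK_DIRECT | SK_NOCLOBBER | SK_NOESCAPE | SK_UNUSED
	  | SK_NODIRECTESCAPE);
}

/* A pointer that does not escape at all does not escape directly
   either, so NOESCAPE implies NODIRECTESCAPE.  */
int
sk_imply (int flags)
{
  if (flags & SK_NOESCAPE)
    flags |= SK_NODIRECTESCAPE;
  return flags;
}

/* Every use can only remove flags: keep what both sides agree on.  */
int
sk_merge (int lattice, int use_flags)
{
  int result = lattice & use_flags;
  assert ((result & ~lattice) == 0);	/* merging never adds a flag */
  return result;
}

/* Two uses of one pointer: first a call argument known to be
   noescape and noclobber, then a dereference whose loaded value is
   stored to memory, for which only NODIRECTESCAPE is known.  */
int
sk_example (void)
{
  int flags = sk_init ();
  flags = sk_merge (flags, sk_imply (SK_NOESCAPE | SK_NOCLOBBER));
  flags = sk_merge (flags, SK_NODIRECTESCAPE);
  return flags;
}
```

In this model sk_example () ends with only SK_NODIRECTESCAPE set, which corresponds to what the patch derives for b in the example at the top; without the new flag the same sequence would end with no flags at all.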