On Wed, 25 Nov 2020, Jan Hubicka wrote: > Hi, > we discussed this patch briefly two weeks ago, but did not reach conclusion > and > since I wanted to avoid ICF fixes slipping another release I had chance to > return to it only now. Main limitation of modref is the fact that it does not > track anything in memory. This is intentional - I wanted the initial > implementation to be cheap. However it also makes it very limited when it > comes > to detecting noescape especially because it is paranoid about what memory > accesses may be used to copy (bits of) pointers. > > Consider: > > void > test (int *a, int *b) > { > *a=*b; > } > > Here both parameters are noescape. However we get: > > Analyzing flags of ssa name: a_4(D) > Analyzing stmt:*a_4(D) = _1; > current flags of a_4(D) direct noescape > flags of ssa name a_4(D) direct noescape > Analyzing flags of ssa name: b_3(D) > Analyzing stmt:_1 = *b_3(D); > Analyzing flags of ssa name: _1 > Analyzing stmt:*a_4(D) = _1; > ssa name saved to memory > current flags of _1 > flags of ssa name _1 > current flags of b_3(D) > > So for a we get flags right, but for b we see memory write and stop trakcing > completely assuming that the memory may cause indirect scape. > > This patch adds EAF_NODIRECTSCAPE that is weaker vairant of EAF_NOESCAPE where > we only know that the pointer itself does not escape, but memory pointed to > may. This is a lot more reliable to auto-detect that EAF_NOESCAPE and still > enables additional optimization. With patch we get nodirectscape flag for b > that enables in practice similar optimization as EAF_NOESCAPE for arrays of > integers that points nowhere :) > > Path is very effective on cc1plus changing: > > Alias oracle query stats: > refs_may_alias_p: 65974098 disambiguations, 75491744 queries > ref_maybe_used_by_call_p: 239316 disambiguations, 66783365 queries > call_may_clobber_ref_p: 109214 disambiguations, 114381 queries > nonoverlapping_component_refs_p: 0 disambiguations, 37014 queries > nonoverlapping_refs_since_match_p: 26917 disambiguations, 56947 must > overlaps, 84634 queries > aliasing_component_refs_p: 63593 disambiguations, 2026642 queries > TBAA oracle: 25059985 disambiguations 58735771 queries > 12279288 are in alias set 0 > 10228328 queries asked about the same object > 124 queries asked about the same alias set > 0 access volatile > 9551512 are dependent in the DAG > 1616534 are aritificially in conflict with void * > > Modref stats: > modref use: 13629 disambiguations, 362550 queries > modref clobber: 1603074 disambiguations, 12633002 queries > 4128405 tbaa queries (0.326795 per modref query) > 678007 base compares (0.053670 per modref query) > > PTA query stats: > pt_solution_includes: 1447025 disambiguations, 13421154 queries > pt_solutions_intersect: 1014606 disambiguations, 12743264 queries > > to: > > Alias oracle query stats: > refs_may_alias_p: 76994196 disambiguations, 86322026 queries > ref_maybe_used_by_call_p: 398635 disambiguations, 77664397 queries > call_may_clobber_ref_p: 248995 disambiguations, 252747 queries > nonoverlapping_component_refs_p: 0 disambiguations, 36357 queries > nonoverlapping_refs_since_match_p: 26973 disambiguations, 56944 must > overlaps, 84688 queries > aliasing_component_refs_p: 63472 disambiguations, 2013517 queries > TBAA oracle: 25278106 disambiguations 59186830 queries > 12480044 are in alias set 0 > 10260217 queries asked about the same object > 121 queries asked about the same alias set > 0 access volatile > 9550119 are dependent in the DAG > 1618223 are aritificially in conflict with void * > > Modref stats: > modref use: 13909 disambiguations, 370418 queries > modref clobber: 1643513 disambiguations, 18036536 queries > 4197648 tbaa queries (0.232730 per modref query) > 727893 base compares (0.040357 per modref query) > > PTA query stats: > pt_solution_includes: 11463123 disambiguations, 22989602 queries > pt_solutions_intersect: 1238048 disambiguations, 12893812 queries > > This is 1447025->11463123 PTA disambiguations, so 7.6 times more. > (there is also incrase in number of querries) > > For tramp3d I get: > Alias oracle query stats: > refs_may_alias_p: 2394105 disambiguations, 2675969 queries > ref_maybe_used_by_call_p: 11048 disambiguations, 2428198 queries > call_may_clobber_ref_p: 922 disambiguations, 932 queries > nonoverlapping_component_refs_p: 0 disambiguations, 4457 queries > nonoverlapping_refs_since_match_p: 329 disambiguations, 10298 must > overlaps, 10714 queries > aliasing_component_refs_p: 956 disambiguations, 36074 queries > TBAA oracle: 1046044 disambiguations 1942025 queries > 169583 are in alias set 0 > 507146 queries asked about the same object > 0 queries asked about the same alias set > 0 access volatile > 218937 are dependent in the DAG > 315 are aritificially in conflict with void * > > Modref stats: > modref use: 1324 disambiguations, 5833 queries > modref clobber: 36497 disambiguations, 110898 queries > 144060 tbaa queries (1.299032 per modref query) > 22514 base compares (0.203015 per modref query) > > PTA query stats: > pt_solution_includes: 401457 disambiguations, 609088 queries > pt_solutions_intersect: 138800 disambiguations, 417703 queries > > > Alias oracle query stats: > refs_may_alias_p: 2667557 disambiguations, 2933558 queries > ref_maybe_used_by_call_p: 15330 disambiguations, 2699662 queries > call_may_clobber_ref_p: 1707 disambiguations, 1717 queries > nonoverlapping_component_refs_p: 0 disambiguations, 3592 queries > nonoverlapping_refs_since_match_p: 303 disambiguations, 9079 must overlaps, > 9413 queries > aliasing_component_refs_p: 825 disambiguations, 31791 queries > TBAA oracle: 1061409 disambiguations 1975798 queries > 173463 are in alias set 0 > 513959 queries asked about the same object > 0 queries asked about the same alias set > 0 access volatile > 226652 are dependent in the DAG > 315 are aritificially in conflict with void * > > Modref stats: > modref use: 1401 disambiguations, 8098 queries > modref clobber: 38706 disambiguations, 348934 queries > 154297 tbaa queries (0.442195 per modref query) > 27254 base compares (0.078106 per modref query) > > PTA query stats: > pt_solution_includes: 549517 disambiguations, 714911 queries > pt_solutions_intersect: 146941 disambiguations, 420459 queries > > So 37% PTA disambiguations and 11% overall > > lto-bootstrapped/regtested x86_64-linux and I also run SPEC benchmarks > https://lnt.opensuse.org/db_default/v4/SPEC/latest_runs_report?younger_in_days=14&older_in_days=0&min_percentage_change=0.02&revisions=e4360e452b4c6cd56d4e21663703e920763413f5%2C94b9afeac33475566c27cf9458e06480ee06b8e5&include_user_branches=on > https://lnt.opensuse.org/db_default/v4/CPP/latest_runs_report?younger_in_days=14&older_in_days=0&min_percentage_change=0.02&revisions=e4360e452b4c6cd56d4e21663703e920763413f5%2C94b9afeac33475566c27cf9458e06480ee06b8e5&include_user_branches=on > > Does it make sense and would it be still OK for trunk? > It can wait for next stage1 but it seems to achieve quite nice improvemnt > with relatively easy patch.
Looks OK to me. Can you see to add a testcase that shows what we now can optimize/disambiguate? Thanks, Richard. > * gimple.c (gimple_call_arg_flags): Also imply EAF_NODIRECTESCAPE. > * tree-core.h (EAF_NODRECTESCAPE): New flag. > * tree-ssa-structalias.c (make_indirect_escape_constraint): New > function. > (handle_rhs_call): Hanlde EAF_NODIRECTESCAPE. > * ipa-modref.c (dump_eaf_flags): Print EAF_NODIRECTESCAPE. > (deref_flags): Dereference is always EAF_NODIRECTESCAPE. > (modref_lattice::init): Also set EAF_NODIRECTESCAPE. > (analyze_ssa_name_flags): Pure functions do not affect > EAF_NODIRECTESCAPE. > (analyze_params): Likewise. > (ipa_merge_modref_summary_after_inlining): Likewise. > (modref_merge_call_site_flags): Likewise. > diff --git a/gcc/gimple.c b/gcc/gimple.c > index e3e508daf2f..e8246b72cc9 100644 > --- a/gcc/gimple.c > +++ b/gcc/gimple.c > @@ -1543,7 +1543,7 @@ gimple_call_arg_flags (const gcall *stmt, unsigned arg) > if (fnspec.arg_direct_p (arg)) > flags |= EAF_DIRECT; > if (fnspec.arg_noescape_p (arg)) > - flags |= EAF_NOESCAPE; > + flags |= EAF_NOESCAPE | EAF_NODIRECTESCAPE; > if (fnspec.arg_readonly_p (arg)) > flags |= EAF_NOCLOBBER; > } > diff --git a/gcc/tree-core.h b/gcc/tree-core.h > index 313a6af2253..e457b917b98 100644 > --- a/gcc/tree-core.h > +++ b/gcc/tree-core.h > @@ -110,6 +110,10 @@ struct die_struct; > /* Nonzero if the argument is not used by the function. */ > #define EAF_UNUSED (1 << 3) > > +/* Nonzero if the argument itself does not escape but memory > + referenced by it can escape. */ > +#define EAF_NODIRECTESCAPE (1 << 4) > + > /* Call return flags. */ > /* Mask for the argument number that is returned. Lower two bits of > the return flags, encodes argument slots zero to three. */ > diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c > index a4832b75436..9f4de96d544 100644 > --- a/gcc/tree-ssa-structalias.c > +++ b/gcc/tree-ssa-structalias.c > @@ -3851,6 +3851,23 @@ make_escape_constraint (tree op) > make_constraint_to (escaped_id, op); > } > > +/* Make constraint necessary to make all indirect references > + from VI escape. */ > + > +static void > +make_indirect_escape_constraint (varinfo_t vi) > +{ > + struct constraint_expr lhs, rhs; > + /* escaped = *(VAR + UNKNOWN); */ > + lhs.type = SCALAR; > + lhs.var = escaped_id; > + lhs.offset = 0; > + rhs.type = DEREF; > + rhs.var = vi->id; > + rhs.offset = UNKNOWN_OFFSET; > + process_constraint (new_constraint (lhs, rhs)); > +} > + > /* Add constraints to that the solution of VI is transitively closed. */ > > static void > @@ -4026,7 +4043,7 @@ handle_rhs_call (gcall *stmt, vec<ce_s> *results) > set. The argument would still get clobbered through the > escape solution. */ > if ((flags & EAF_NOCLOBBER) > - && (flags & EAF_NOESCAPE)) > + && (flags & (EAF_NOESCAPE | EAF_NODIRECTESCAPE))) > { > varinfo_t uses = get_call_use_vi (stmt); > varinfo_t tem = new_var_info (NULL_TREE, "callarg", true); > @@ -4036,9 +4053,11 @@ handle_rhs_call (gcall *stmt, vec<ce_s> *results) > if (!(flags & EAF_DIRECT)) > make_transitive_closure_constraints (tem); > make_copy_constraint (uses, tem->id); > + if (!(flags & (EAF_NOESCAPE | EAF_DIRECT))) > + make_indirect_escape_constraint (tem); > returns_uses = true; > } > - else if (flags & EAF_NOESCAPE) > + else if (flags & (EAF_NOESCAPE | EAF_NODIRECTESCAPE)) > { > struct constraint_expr lhs, rhs; > varinfo_t uses = get_call_use_vi (stmt); > @@ -4061,6 +4080,8 @@ handle_rhs_call (gcall *stmt, vec<ce_s> *results) > rhs.var = nonlocal_id; > rhs.offset = 0; > process_constraint (new_constraint (lhs, rhs)); > + if (!(flags & (EAF_NOESCAPE | EAF_DIRECT))) > + make_indirect_escape_constraint (tem); > returns_uses = true; > } > else > diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > index e6cb4a87b69..8305393e3ca 100644 > --- a/gcc/ipa-modref.c > +++ b/gcc/ipa-modref.c > @@ -151,6 +151,8 @@ dump_eaf_flags (FILE *out, int flags, bool newline = true) > fprintf (out, " noclobber"); > if (flags & EAF_NOESCAPE) > fprintf (out, " noescape"); > + if (flags & EAF_NODIRECTESCAPE) > + fprintf (out, " nodirectescape"); > if (flags & EAF_UNUSED) > fprintf (out, " unused"); > if (newline) > @@ -1303,7 +1305,7 @@ memory_access_to (tree op, tree ssa_name) > static int > deref_flags (int flags, bool ignore_stores) > { > - int ret = 0; > + int ret = EAF_NODIRECTESCAPE; > if (flags & EAF_UNUSED) > ret |= EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE; > else > @@ -1361,7 +1363,8 @@ public: > void > modref_lattice::init () > { > - flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED; > + flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED > + | EAF_NODIRECTESCAPE; > open = true; > known = false; > } > @@ -1653,7 +1656,8 @@ analyze_ssa_name_flags (tree name, vec<modref_lattice> > &lattice, int depth, > { > int call_flags = gimple_call_arg_flags (call, i); > if (ignore_stores) > - call_flags |= EAF_NOCLOBBER | EAF_NOESCAPE; > + call_flags |= EAF_NOCLOBBER | EAF_NOESCAPE > + | EAF_NODIRECTESCAPE; > > if (!record_ipa) > lattice[index].merge (call_flags); > @@ -1829,7 +1833,7 @@ analyze_parms (modref_summary *summary, > modref_summary_lto *summary_lto, > /* For pure functions we have implicit NOCLOBBER > and NOESCAPE. */ > if (ecf_flags & ECF_PURE) > - flags &= ~(EAF_NOCLOBBER | EAF_NOESCAPE); > + flags &= ~(EAF_NOCLOBBER | EAF_NOESCAPE | EAF_NODIRECTESCAPE); > > if (flags) > { > @@ -3098,7 +3102,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge > *edge) > if (!ee->direct) > flags = deref_flags (flags, ignore_stores); > else if (ignore_stores) > - flags |= EAF_NOCLOBBER | EAF_NOESCAPE; > + flags |= EAF_NOCLOBBER | EAF_NOESCAPE | EAF_NODIRECTESCAPE; > flags |= ee->min_flags; > to_info->arg_flags[ee->parm_index] &= flags; > if (to_info->arg_flags[ee->parm_index]) > @@ -3112,7 +3116,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge > *edge) > if (!ee->direct) > flags = deref_flags (flags, ignore_stores); > else if (ignore_stores) > - flags |= EAF_NOCLOBBER | EAF_NOESCAPE; > + flags |= EAF_NOCLOBBER | EAF_NOESCAPE | EAF_NODIRECTESCAPE; > flags |= ee->min_flags; > to_info_lto->arg_flags[ee->parm_index] &= flags; > if (to_info_lto->arg_flags[ee->parm_index]) > @@ -3623,8 +3627,8 @@ modref_merge_call_site_flags (escape_summary *sum, > } > else if (ignore_stores) > { > - flags |= EAF_NOESCAPE | EAF_NOCLOBBER; > - flags_lto |= EAF_NOESCAPE | EAF_NOCLOBBER; > + flags |= EAF_NOESCAPE | EAF_NOCLOBBER | EAF_NODIRECTESCAPE; > + flags_lto |= EAF_NOESCAPE | EAF_NOCLOBBER | EAF_NODIRECTESCAPE; > } > flags |= ee->min_flags; > flags_lto |= ee->min_flags; > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imend