On Tue, Oct 22, 2024 at 4:39 AM Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Mon, Oct 21, 2024 at 6:04 PM Andrew Pinski <pins...@gmail.com> wrote:
> >
> > On Mon, Oct 21, 2024 at 3:41 AM Richard Biener
> > <richard.guent...@gmail.com> wrote:
> > >
> > > On Thu, Oct 17, 2024 at 4:43 AM Andrew Pinski <quic_apin...@quicinc.com>
> > > wrote:
> > > >
> > > > After fixing loop-im to do the correct overflow rewriting
> > > > for pointer types too. We end up with code like:
> > > > ```
> > > > _9 = (unsigned long) &g;
> > > > _84 = _9 + 18446744073709551615;
> > > > _11 = _42 + _84;
> > > > _44 = (signed char *) _11;
> > > > ...
> > > > *_44 = 10;
> > > > g ={v} {CLOBBER(eos)};
> > > > ...
> > > > n[0] = &f;
> > > > *_44 = 8;
> > > > g ={v} {CLOBBER(eos)};
> > > > ```
> > > > Which was not being recongized by the scope conflicts code.
> > > > This was because it only handled one level walk backs rather than
> > > > multiple ones.
> > > > This fixes it by using a work_list to avoid huge recursion and a
> > > > visited bitmap to avoid
> > > > going into an infinite loops when dealing with loops.
> > > > Adds a cache for the addr_expr's that are associated with each ssa
> > > > name. I found 2 element
> > > > cache was the decent trade off for size and speed. Most ssa names will
> > > > have only
> > > > one address associated with it but there are times (phis) where 2 or
> > > > more will
> > > > be there. But 2 is the common case for most if statements.
> > > Hmm. I was hoping that, since we walk BBs in RPO order in
> > > add_scope_conflicts,
> > > we can simply populate the cache in the first forward walk of a BBs
> > > stmts without
> > > needing to care for cycles or using of a worklist. We even get the
> > > set of ADDR_EXPRs
> > > in the stmt we visit for free, we just have to associate that set with
> > > the SSA def(s).
> > > I'd have used a SSA name -> bitmap mapping for simplicity. For
> > > backedges the iteration
> > > should get things corrected, so we should be able to use the "cache"
> > > as conservative
> > > answer even for those.
> > The problem is if we had a phi of say t_3 = <&a, &b, &c> .
> > At this point the cache is a fixed size cache, and only holds 2
> > elements (maybe changing it to be variable would be better) so
> > sometimes the cache is invalidated due to needing to hold more than 2
> > elements.
>
> As said I'd use a bitmap of the ADDR_EXPR uids in the cache. The
> bitmaps can be shared (just wipe their obstack at the end).

So looking into this further, I found there is already a condensed id
local to the scope-conflict code, stored via a mapping from the decl to
it (decl_to_stack_part). I had originally been worried that DECL UIDs
would be too sparse, but since this condensed id is already used for
the bitmaps here, that is much less of a problem. There is also already
an obstack for bitmaps allocated, stack_var_bitmap_obstack, which I
will use as well; a rough, untested sketch of what I have in mind is at
the end of this mail.
I hope to have a new patch by the end of the week, since I have seen at
least 3 more bug reports about this code, and Tamar is running into a
failure after his IV-OPTs cleanup patch that would be solved by this.

Thanks,
Andrew

>
> > And you would need the worklist and the bitmap visited for those
> > cases. This should not happen that often but could with switches.
> > If we change to use a variable sized cache, then it shouldn't happen
> > due to the walk as you described. Let me change it to a variable sized
> > cache.
> >
> > Thanks,
> > Andrew
> >
> > >
> > > Richard.
> > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR middle-end/111422
> > > > * cfgexpand.cc: Define INCLUDE_STRING if ADDR_WALKER_STATS
> > > > is defined.
> > > > (class addr_ssa_walker): New class.
> > > > (add_scope_conflicts_2): Rename to ...
> > > > (addr_ssa_walker::operator()): This and rewrite to be a full
> > > > walk
> > > > of all operands and their uses and use a cache.
> > > > (add_scope_conflicts_1): Add walker new argument for the addr
> > > > cache.
> > > > Just walk the phi result since that will include all addr_exprs.
> > > > Change call to add_scope_conflicts_2 to walker.
> > > > (add_scope_conflicts): Add walker variable and update call to
> > > > add_scope_conflicts_1.
> > > >
> > > > Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com>
> > > > ---
> > > > gcc/cfgexpand.cc | 207 ++++++++++++++++++++++++++++++++++++++++-------
> > > > 1 file changed, 176 insertions(+), 31 deletions(-)
> > > >
> > > > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> > > > index 6c1096363af..74f4cfc0f22 100644
> > > > --- a/gcc/cfgexpand.cc
> > > > +++ b/gcc/cfgexpand.cc
> > > > @@ -17,6 +17,9 @@ You should have received a copy of the GNU General
> > > > Public License
> > > > along with GCC; see the file COPYING3. If not see
> > > > <http://www.gnu.org/licenses/>. */
> > > >
> > > > +#ifdef ADDR_WALKER_STATS
> > > > +#define INCLUDE_STRING
> > > > +#endif
> > > > #include "config.h"
> > > > #include "system.h"
> > > > #include "coretypes.h"
> > > > @@ -571,35 +574,175 @@ visit_conflict (gimple *, tree op, tree, void
> > > > *data)
> > > > return false;
> > > > }
> > > >
> > > > -/* Helper function for add_scope_conflicts_1. For USE on
> > > > - a stmt, if it is a SSA_NAME and in its SSA_NAME_DEF_STMT is known
> > > > to be
> > > > - based on some ADDR_EXPR, invoke VISIT on that ADDR_EXPR. */
> > > > +namespace {
> > > >
> > > > -static inline void
> > > > -add_scope_conflicts_2 (tree use, bitmap work,
> > > > - walk_stmt_load_store_addr_fn visit)
> > > > +class addr_ssa_walker
> > > > +{
> > > > +private:
> > > > + struct addr_cache
> > > > + {
> > > > + private:
> > > > + unsigned elems = 0;
> > > > + static constexpr unsigned maxelements = 2;
> > > > + bool visited = false;
> > > > + tree cached[maxelements] = {};
> > > > + public:
> > > > + /* Returns true if the cache is valid. */
> > > > + operator bool ()
> > > > + {
> > > > + return visited && elems <= maxelements;
> > > > + }
> > > > + /* Mark as visited. The cache might be invalidated
> > > > + by adding too many elements though. */
> > > > + void visit () { visited = true; }
> > > > + /* Iterator over the cached values. */
> > > > + tree *begin () { return &cached[0]; }
> > > > + tree *end ()
> > > > + {
> > > > + /* If there was too many elements, then there are
> > > > + nothing to vist in the cache. */
> > > > + if (elems > maxelements)
> > > > + return &cached[0];
> > > > + return &cached[elems];
> > > > + }
> > > > + /* Add ADDR_EXPR to the cache if it is not there already. */
> > > > + void add (tree addr)
> > > > + {
> > > > + if (elems > maxelements)
> > > > + {
> > > > + statistics_counter_event (cfun, "addr_walker already
> > > > overflow", 1);
> > > > + return;
> > > > + }
> > > > + /* Skip if the cache already contains the addr_expr. */
> > > > + for(tree other : *this)
> > > > + if (operand_equal_p (other, addr))
> > > > + return;
> > > > + elems++;
> > > > + /* Record that the cache overflowed. */
> > > > + if (elems > maxelements)
> > > > + {
> > > > + statistics_counter_event (cfun, "addr_walker overflow", 1);
> > > > + return;
> > > > + }
> > > > + cached[elems - 1] = addr;
> > > > + }
> > > > + };
> > > > +public:
> > > > + addr_ssa_walker () : cache (new addr_cache[num_ssa_names]{}) { }
> > > > + ~addr_ssa_walker (){ delete[] cache; }
> > > > +
> > > > + /* Walk the name and its defining statement,
> > > > + call func with for addr_expr's. */
> > > > + template<typename T>
> > > > + void operator ()(tree name, T func);
> > > > +
> > > > +private:
> > > > +
> > > > + /* Cannot create a copy. */
> > > > + addr_ssa_walker (const addr_ssa_walker &) = delete;
> > > > + addr_ssa_walker (addr_ssa_walker &&) = delete;
> > > > + /* Return the cache entries for a SSA NAME. */
> > > > + addr_cache &operator[] (tree name)
> > > > + {
> > > > + return cache[SSA_NAME_VERSION (name)];
> > > > + }
> > > > +
> > > > + addr_cache *cache;
> > > > +};
> > > > +
> > > > +/* Walk backwards on the defining statements of NAME
> > > > + and call FUNC on the addr_expr. Use the cache for
> > > > + the SSA name if possible. */
> > > > +
> > > > +template<typename T>
> > > > +void
> > > > +addr_ssa_walker::operator() (tree name, T func)
> > > > {
> > > > - if (TREE_CODE (use) == SSA_NAME
> > > > - && (POINTER_TYPE_P (TREE_TYPE (use))
> > > > - || INTEGRAL_TYPE_P (TREE_TYPE (use))))
> > > > + gcc_assert (TREE_CODE (name) == SSA_NAME);
> > > > + auto_vec<std::pair<tree,tree>, 4> work_list;
> > > > + auto_bitmap visited_ssa_names;
> > > > + work_list.safe_push (std::make_pair (name, name));
> > > > +
> > > > +#ifdef ADDR_WALKER_STATS
> > > > + unsigned process_list = 0;
> > > > +#endif
> > > > +
> > > > + while (!work_list.is_empty())
> > > > {
> > > > + auto work = work_list.pop();
> > > > + tree use = work.first;
> > > > + tree old_name = work.second;
> > > > +#ifdef ADDR_WALKER_STATS
> > > > + process_list++;
> > > > +#endif
> > > > +
> > > > + if (!use)
> > > > + continue;
> > > > + /* For addr_expr's call the function and update the cache. */
> > > > + if (TREE_CODE (use) == ADDR_EXPR)
> > > > + {
> > > > + func (use);
> > > > + (*this)[old_name].add (use);
> > > > + continue;
> > > > + }
> > > > + /* Ignore all non SSA names. */
> > > > + if (TREE_CODE (use) != SSA_NAME)
> > > > + continue;
> > > > +
> > > > + /* Only Pointers and integral types are used to track addresses.
> > > > */
> > > > + if (!POINTER_TYPE_P (TREE_TYPE (use))
> > > > + && !INTEGRAL_TYPE_P (TREE_TYPE (use)))
> > > > + continue;
> > > > +
> > > > + /* Check the cache, if there is a hit use it. */
> > > > + if ((*this)[use])
> > > > + {
> > > > + statistics_counter_event (cfun, "addr_walker cache hit", 1);
> > > > + /* Call the function and update the cache. */
> > > > + for (tree naddr : (*this)[use])
> > > > + {
> > > > + (*this)[old_name].add (naddr);
> > > > + func (naddr);
> > > > + }
> > > > + continue;
> > > > + }
> > > > gimple *g = SSA_NAME_DEF_STMT (use);
> > > > + /* Mark the use as being visited so even if the cache is empty,
> > > > + there is no reason to walk the names backwards. */
> > > > + (*this)[use].visit();
> > > > + /* Skip the name if already visted this time. */
> > > > + if (!bitmap_set_bit (visited_ssa_names, SSA_NAME_VERSION (use)))
> > > > + continue;
> > > > + /* Need to update the old_name afterwards add it to the work
> > > > list,
> > > > + this will either hit the cache or the check bitmap will skip it
> > > > + if there was too many names associated with the cache. */
> > > > + work_list.safe_push (work);
> > > > +
> > > > + /* For assign statements, add each operand to the work list.
> > > > + Note operand 0 is the same as the use here so there is nothing
> > > > + to be done. */
> > > > if (gassign *a = dyn_cast <gassign *> (g))
> > > > {
> > > > - if (tree op = gimple_assign_rhs1 (a))
> > > > - if (TREE_CODE (op) == ADDR_EXPR)
> > > > - visit (a, TREE_OPERAND (op, 0), op, work);
> > > > + for (unsigned i = 1; i < gimple_num_ops (g); i++)
> > > > + work_list.safe_push (std::make_pair (gimple_op (a, i),
> > > > use));
> > > > }
> > > > + /* PHI nodes, add each argument to the work list. */
> > > > else if (gphi *p = dyn_cast <gphi *> (g))
> > > > for (unsigned i = 0; i < gimple_phi_num_args (p); ++i)
> > > > - if (TREE_CODE (use = gimple_phi_arg_def (p, i)) == SSA_NAME)
> > > > - if (gassign *a = dyn_cast <gassign *> (SSA_NAME_DEF_STMT
> > > > (use)))
> > > > - {
> > > > - if (tree op = gimple_assign_rhs1 (a))
> > > > - if (TREE_CODE (op) == ADDR_EXPR)
> > > > - visit (a, TREE_OPERAND (op, 0), op, work);
> > > > - }
> > > > - }
> > > > + work_list.safe_push (std::make_pair (gimple_phi_arg_def (p,
> > > > i), use));
> > > > + /* All other kind of statements assume they can't contribute to
> > > > an address
> > > > + being alive. */
> > > > + }
> > > > + /* This stat is here to see how long is the longest walk,
> > > > + it is not useful stat except when tuning
> > > > + addr_ssa_walker::addr_cache::maxelements. */
> > > > +#ifdef ADDR_WALKER_STATS
> > > > + statistics_counter_event (cfun,
> > > > + ("addr_walker process " + std::to_string
> > > > (process_list)).c_str (),
> > > > + 1);
> > > > +#endif
> > > > +}
> > > > +
> > > > }
> > > >
> > > > /* Helper routine for add_scope_conflicts, calculating the active
> > > > partitions
> > > > @@ -608,7 +751,8 @@ add_scope_conflicts_2 (tree use, bitmap work,
> > > > liveness. */
> > > >
> > > > static void
> > > > -add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
> > > > +add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict,
> > > > + addr_ssa_walker &walker)
> > > > {
> > > > edge e;
> > > > edge_iterator ei;
> > > > @@ -623,14 +767,14 @@ add_scope_conflicts_1 (basic_block bb, bitmap
> > > > work, bool for_conflict)
> > > >
> > > > visit = visit_op;
> > > >
> > > > + auto addrvisitor = [&visit,&work](tree addr) {
> > > > + gcc_assert (TREE_CODE (addr) == ADDR_EXPR);
> > > > + visit (nullptr, TREE_OPERAND (addr, 0), addr, work);
> > > > + };
> > > > +
> > > > + /* Walk over the phi for the incoming addresses to be alive. */
> > > > for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > > > - {
> > > > - gimple *stmt = gsi_stmt (gsi);
> > > > - gphi *phi = as_a <gphi *> (stmt);
> > > > - walk_stmt_load_store_addr_ops (stmt, work, NULL, NULL, visit);
> > > > - FOR_EACH_PHI_ARG (use_p, phi, iter, SSA_OP_USE)
> > > > - add_scope_conflicts_2 (USE_FROM_PTR (use_p), work, visit);
> > > > - }
> > > > + walker (gimple_phi_result (gsi_stmt (gsi)), addrvisitor);
> > > > for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > > > {
> > > > gimple *stmt = gsi_stmt (gsi);
> > > > @@ -676,7 +820,7 @@ add_scope_conflicts_1 (basic_block bb, bitmap work,
> > > > bool for_conflict)
> > > > }
> > > > walk_stmt_load_store_addr_ops (stmt, work, visit, visit,
> > > > visit);
> > > > FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
> > > > - add_scope_conflicts_2 (USE_FROM_PTR (use_p), work, visit);
> > > > + walker (USE_FROM_PTR (use_p), addrvisitor);
> > > > }
> > > > }
> > > >
> > > > @@ -707,6 +851,7 @@ add_scope_conflicts (void)
> > > > bitmap work = BITMAP_ALLOC (NULL);
> > > > int *rpo;
> > > > int n_bbs;
> > > > + addr_ssa_walker walker;
> > > >
> > > > /* We approximate the live range of a stack variable by taking the
> > > > first
> > > > mention of its name as starting point(s), and by the end-of-scope
> > > > @@ -734,14 +879,14 @@ add_scope_conflicts (void)
> > > > bitmap active;
> > > > bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]);
> > > > active = (bitmap)bb->aux;
> > > > - add_scope_conflicts_1 (bb, work, false);
> > > > + add_scope_conflicts_1 (bb, work, false, walker);
> > > > if (bitmap_ior_into (active, work))
> > > > changed = true;
> > > > }
> > > > }
> > > >
> > > > FOR_EACH_BB_FN (bb, cfun)
> > > > - add_scope_conflicts_1 (bb, work, true);
> > > > + add_scope_conflicts_1 (bb, work, true, walker);
> > > >
> > > > free (rpo);
> > > > BITMAP_FREE (work);
> > > > --
> > > > 2.34.1
> > > >
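
P.S. Here is the rough, untested sketch mentioned above of how I intend
to key the per-SSA-name cache off the condensed stack-part ids and put
its bitmaps on stack_var_bitmap_obstack. decl_to_stack_part and
stack_var_bitmap_obstack already exist in cfgexpand.cc; the names
ssa_name_addr_cache, record_addr_base_for_ssa_name and
merge_addr_cache_for_ssa_name are made up for this mail only, and the
real patch may end up structured differently:

```
/* Untested sketch only.  decl_to_stack_part and stack_var_bitmap_obstack
   are the existing cfgexpand.cc statics; ssa_name_addr_cache and the two
   helpers below are invented names for illustration.  */

/* One bitmap per SSA name version; bit N set means the name may carry
   the address of stack partition N (the condensed id recorded in
   decl_to_stack_part).  NULL means nothing recorded yet.  Allocated
   with XCNEWVEC (bitmap, num_ssa_names) next to the rest of the
   add_scope_conflicts setup.  */
static bitmap *ssa_name_addr_cache;

/* Record that NAME may hold the address of the stack variable BASE,
   where BASE is the base decl of an ADDR_EXPR operand seen while
   walking the defining statements of NAME.  */

static void
record_addr_base_for_ssa_name (tree name, tree base)
{
  if (!DECL_P (base))
    return;
  /* Decls not tracked by the scope conflict code are ignored.  */
  size_t *part = decl_to_stack_part->get (base);
  if (!part)
    return;
  unsigned ver = SSA_NAME_VERSION (name);
  if (!ssa_name_addr_cache[ver])
    ssa_name_addr_cache[ver] = BITMAP_ALLOC (&stack_var_bitmap_obstack);
  bitmap_set_bit (ssa_name_addr_cache[ver], *part);
}

/* NAME is defined (partly) in terms of USE, so everything cached for
   USE is also an address NAME may carry.  Returns true if the cache
   for NAME changed.  */

static bool
merge_addr_cache_for_ssa_name (tree name, tree use)
{
  unsigned uver = SSA_NAME_VERSION (use);
  if (!ssa_name_addr_cache[uver])
    return false;
  unsigned ver = SSA_NAME_VERSION (name);
  if (!ssa_name_addr_cache[ver])
    ssa_name_addr_cache[ver] = BITMAP_ALLOC (&stack_var_bitmap_obstack);
  return bitmap_ior_into (ssa_name_addr_cache[ver],
                          ssa_name_addr_cache[uver]);
}
```

Since the work bitmap in add_scope_conflicts_1 is keyed by the same
part ids, consuming a cache hit should then mostly be a bitmap_ior_into
of the cached bitmap into work (plus the usual conflict recording in
the for_conflict case), and all of the cache bitmaps go away together
when the obstack is released at the end, as you suggested.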