On Tue, 6 Aug 2024, Andrew MacLeod wrote: > > On 8/6/24 09:12, Richard Biener wrote: > > Currently the forward threader isn't limited as to the search space > > it explores and with it now using path-ranger for simplifying > > conditions it runs into it became pretty slow for degenerate cases > > like compiling insn-emit.cc for RISC-V esp. when compiling for > > a host with LOGICAL_OP_NON_SHORT_CIRCUIT disabled. > > > > The following makes the forward threader honor the search space > > limit I introduced for the backward threader. This reduces > > compile-time from minutes to seconds for the testcase in PR116166. > > > > Note this wasn't necessary before we had ranger but with ranger > > the work we do is quadatic in the length of the threading path > > we build up (the same is true for the backwards threader). > > Theres probably some work that can be done in the path processing space using > the new gori_on_edge interface I introduced for the fast VRP pass. > > // These APIs are used to query GORI if there are ranges generated on an edge. > // GORI_ON_EDGE is used to get all the ranges at once (returned in an > // ssa_cache structure). > // Fill ssa-cache R with any outgoing ranges on edge E, using QUERY. > bool gori_on_edge (class ssa_cache &r, edge e, range_query *query = NULL); > > With this, the threader and path calculator can get and collect all the > outgoing ranges for a block in linear time and just keep them.. and decide > what it wants to use. I suspect for really large CFGs, we'd want to > substitute and alternative ssa_cache implementation to something like the > sbr_sparse_bitmap class ranger's cache uses which compresses the size of the > vector so it isn't a vector over all the ssa-names, and at the same time limit > it to a max of 14 outgoing ranges. > > no one has had any time to investigate that yet.
The thing that path-ranger handles in addition to what range_on_edge gives you is CFG merges where knowing the path into it improves ranges downstream. Basically path-ranger improves the range of PHIs based on known path taken. Whenever the decision which edge(s) of a PHI are taken changes we'd have to wipe cached information. In PR114855 the gori compute itself shows scalability issues (I've identified the dominator walk sofar but it might not be the full story). > > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > > > OK if that succeeds? > OK with me. Thanks, pushed and queued for backport. Richard.