On Mon, Jun 24, 2024 at 7:20 PM Andrew MacLeod <amacl...@redhat.com> wrote: > > > On 6/22/24 09:15, Richard Biener wrote: > > On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod <amacl...@redhat.com> wrote: > >> This patch adds > >> > >> --param=vrp-block-limit=N > >> > >> When the basic block counter for a function exceeded 'N' , VRP is > >> invoked with the new fast_vrp algorithm instead. This algorithm uses a > >> lot less memory and processing power, although it does get a few less > >> things. > >> > >> Primary motivation is cases like > >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855 in which the 3 VRP > >> passes consume about 600 seconds of the compile time, and a lot of > >> memory. With fast_vrp, it spends less than 10 seconds total in the > >> 3 passes of VRP. This test case has about 400,000 basic blocks. > >> > >> The default for N in this patch is 150,000, arbitrarily chosen. > >> > >> This bootstraps, (and I bootstrapped it with --param=vrp-block-limit=0 > >> as well) on x86_64-pc-linux-gnu, with no regressions. > >> > >> What do you think, OK for trunk? > > + if (last_basic_block_for_fn (fun) > param_vrp_block_limit || > > + &data == &pass_data_fast_vrp) > > > > || goes to the next line. > > > > Btw, we have -Wdisabled-optimization for these cases which should > > say sth like "function has excess of %d number of basic blocks > > (--param vrp-block-limit=%d), using fast VRP algorithm" > > or so in this case. > > > > As I wrote in the PR the priority should be -O1 compile-time > > performance and memory use. > > > Yeah, I just wanted to use it as a model for "bad" cases for ranger. > Adjusted patch attached which now issues the warning. > > I also found that the transitive relations were causing a small blowup > in time for relation processing now that I turned relations on for fast > VRP. I commited a patch and fast_vrp no longer does transitives. > > If you want to experiment with enabling fast VRP at -O1, it should be > fast all the time now. I think :-) This testcase runs in about 95 > seconds on my test machine. if I turn on VRP, a single VRP pass takes > about 2.5 seconds. Its all set up, you can just add: > > NEXT_PASS (pass_fast_vrp); > > at an appropriate spot. > > > Richard. > > > >> Andrew > >> > >> PS sorry,. it doesn't help the threader in that PR :-( > > It's probably one of the known bottlenecks there - IIRC the path range > > queries make O(n^2) work. I can look at this at some point as I've > > dealt with the slowness of this pass in the past. > > > > There is param_max_jump_thread_paths that should limit searching > > but there is IIRC no way to limit the work path ranger actually does > > when resolving the query. > > > Yeah, Id like to talk to Aldy about revamping the threader now that some > of the newer facilities are available that fast_vrp uses. > > We can calculate all the outgoing ranges for a block at once with : > > // Fill ssa-cache R with any outgoing ranges on edge E, using QUERY. > bool gori_on_edge (class ssa_cache &r, edge e, range_query *query = > NULL); > > This is what the fast_vrp routines uses. We can gather all range > restrictions generated from an edge efficiently just once and then > intersect them with a known range as we walk the different paths. We > don't need the gori exports , nor any of the other on-demand bits where > we calculate each export range dynamically.. I suspect it would reduce > the workload and memory impact quite a bit, but I'm not really familiar > with exactly how the threader uses those things. > > It'd require some minor tweaking to the lazy_ssa_cache to make the > bitmap of names set accessible. This would provide similar > functionality to what the gori export () routine provides. Both > relations and inferred ranges should only need to be calculated once per > block as well and could/should/would be applied the same way if they are > present. I don't *think* the threader uses any of the def chains, but > Aldy can chip in.
+ warning (OPT_Wdisabled_optimization, + "Using fast VRP algorithm. %d basic blocks" + " exceeds %s%d limit", + n_basic_blocks_for_fn (fun), + "--param=vrp-block-limit=", + param_vrp_block_limit); This should be: warning (OPT_Wdisabled_optimization, "Using fast VRP algorithm. %d basic blocks" " exceeds %<%--param=vrp-block-limit=d%> limit", n_basic_blocks_for_fn (fun), param_vrp_block_limit); I had thought it was mentioned that options should be quoted but it is not mentioned in the coding conventions: https://gcc.gnu.org/codingconventions.html#Diagnostics But it is mentioned in https://inbox.sourceware.org/gcc/2d2bd844-2de4-ecff-7a07-b22350750...@gmail.com/ ; This is why you were getting an error as you mentioned on IRC. > > Andrew