This patch adds

    --param=vrp-block-limit=N

When the basic block count for a function exceeds N, VRP is invoked with the new fast_vrp algorithm instead. This algorithm uses a lot less memory and processing power, although it does miss a few things that the full algorithm would get.
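
For illustration only, the selection check can be modelled as a standalone C++ sketch. This is not GCC code: choose_vrp and VrpAlgorithm are hypothetical names, block_count stands in for last_basic_block_for_fn (fun), block_limit for param_vrp_block_limit, and is_fast_vrp_pass for the existing "&data == &pass_data_fast_vrp" test.

```cpp
// Standalone sketch, not GCC code: models the pass-selection check
// that the patch makes in tree-vrp.cc.  All names are hypothetical.
enum class VrpAlgorithm { Ranger, Fast };

// block_count      ~ last_basic_block_for_fn (fun)
// block_limit      ~ param_vrp_block_limit (N)
// is_fast_vrp_pass ~ &data == &pass_data_fast_vrp
constexpr VrpAlgorithm
choose_vrp (int block_count, int block_limit, bool is_fast_vrp_pass)
{
  // The comparison is strict: a function with exactly N blocks
  // still gets the full ranger-based VRP.
  if (block_count > block_limit || is_fast_vrp_pass)
    return VrpAlgorithm::Fast;
  return VrpAlgorithm::Ranger;
}
```

Under this model, a limit of 0 sends every non-empty function to fast_vrp, and a function of about 400,000 blocks exceeds the default of 150,000.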

The primary motivation is cases like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855, in which the 3 VRP passes consume about 600 seconds of the compile time and a lot of memory. With fast_vrp, the compiler spends less than 10 seconds total in the 3 VRP passes. This test case has about 400,000 basic blocks.

The default for N in this patch is 150,000, arbitrarily chosen.

This bootstraps on x86_64-pc-linux-gnu with no regressions (I also bootstrapped it with --param=vrp-block-limit=0).

What do you think, OK for trunk?

Andrew

PS: sorry, it doesn't help the threader in that PR :-(

From 3bb9bd3ca8038676e45b0bddcda91cbed7e51662 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacl...@redhat.com>
Date: Mon, 17 Jun 2024 11:38:46 -0400
Subject: [PATCH 4/5] Add param for bb limit to invoke fast_vrp.

If the basic block count is too high, simply use fast_vrp for all
VRP passes.

	gcc/doc/
	* invoke.texi (vrp-block-limit): Document.

	gcc/
	* params.opt (-param=vrp-block-limit): New.
	* tree-vrp.cc (fvrp_folder::execute): Invoke fast_vrp if block
	count exceeds limit.
---
 gcc/doc/invoke.texi | 3 +++
 gcc/params.opt      | 4 ++++
 gcc/tree-vrp.cc     | 4 ++--
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5d7a87fde86..f2f8f6334dc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16840,6 +16840,9 @@ this parameter.  The default value of this parameter is 50.
 @item vect-induction-float
 Enable loop vectorization of floating point inductions.
 
+@item vrp-block-limit
+Maximum number of basic blocks before VRP switches to a lower memory algorithm.
+
 @item vrp-sparse-threshold
 Maximum number of basic blocks before VRP uses a sparse bitmap cache.
 
diff --git a/gcc/params.opt b/gcc/params.opt
index d34ef545bf0..c17ba17b91b 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1198,6 +1198,10 @@ The maximum factor which the loop vectorizer applies to the cost of statements i
 Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization
 Enable loop vectorization of floating point inductions.
 
+-param=vrp-block-limit=
+Common Joined UInteger Var(param_vrp_block_limit) Init(150000) Optimization Param
+Maximum number of basic blocks before VRP switches to a fast model with less memory requirements.
+
 -param=vrp-sparse-threshold=
 Common Joined UInteger Var(param_vrp_sparse_threshold) Init(3000) Optimization Param
 Maximum number of basic blocks before VRP uses a sparse bitmap cache.
diff --git a/gcc/tree-vrp.cc b/gcc/tree-vrp.cc
index 4fc33e63e7d..eef02146ec6 100644
--- a/gcc/tree-vrp.cc
+++ b/gcc/tree-vrp.cc
@@ -1330,9 +1330,9 @@ public:
   unsigned int execute (function *fun) final override
     {
       // Check for fast vrp.
-      if (&data == &pass_data_fast_vrp)
+      if (last_basic_block_for_fn (fun) > param_vrp_block_limit ||
+	  &data == &pass_data_fast_vrp)
 	return execute_fast_vrp (fun, final_p);
-
       return execute_ranger_vrp (fun, final_p);
     }
 
-- 
2.45.0
