The problem with trying to solve this on a per-pass basis rather than
coming up with an integrated solution is that we are completely
leaving the user out of the thought process.
There are some users who have big machines or a lot of time on their
hands and want the damn-the-torpedoes, full-speed-ahead approach, and
there are some users who want reasonable decisions made even at high
optimization levels. We need to give them an easy-to-turn knob.
This is why we have -O2 vs. -O3, and why we have -fexpensive-optimizations.
I am not saying that my original proposal was the best of all possible
worlds, but hacking things up on a pass-by-pass or PR-by-PR basis is
not really solving the problem.
Sure it is.
The problem with your approach is that most of the algorithms in GCC
that sometimes take a very long time on large functions do not have
properties as simple as "they take a long time when n_basic_blocks is
large".
First, almost none of the algorithms are super-linear in any
easy-to-calculate-on-a-global-basis way. The only easy ones are
regalloc and non-modulo scheduling. Everything else is just not
stupid enough to be N^2 in the number of instructions or basic blocks.
Take points-to analysis, one of our N^3 worst-case algorithms.
You can throttle PTA by losing precision for speed very easily.
However, the time bound is very complex.
It takes C iterations, where C is the length of the largest
uncollapseable cycle in the constraint graph. Each iteration takes
V*S time, where V is the number of non-unifiable variables and S is
the size of the points-to sets.
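To put some completely made-up numbers on that: a cycle of length 10,
a thousand non-unifiable variables, and points-to sets of around a
hundred entries is roughly 10 * 1000 * 100 = 1,000,000 units of work,
which is nothing. The N^3 worst case corresponds to all three of
those growing with the size of the function at once.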
Uncollapseable cycles only occur when you have address-taking of
pointed-to fields, or pointer arithmetic on pointers to structures
(i.e., a = a->next).
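To make those two cases concrete, here is a made-up C sketch (not
from any real testcase) of the constructs I mean:

  struct node
  {
    struct node *next;
    int val;
  };

  int
  sum_list (struct node *a)
  {
    int s = 0;
    while (a)
      {
        s += a->val;
        a = a->next;      /* walking a structure pointer through its
                             own field: the a = a->next case above */
      }
    return s;
  }

  void
  grab_field (struct node *a, int **out)
  {
    *out = &a->val;       /* taking the address of a pointed-to field */
  }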
Variables turn out to be non-unifiable or non-collapsible only in
some odd cases. In almost *all* cases, the analysis behaves linearly.
I can tell you whether points-to is going to take an hour or two
*after* we've constructed and simplified the constraint graph (i.e.,
before we spend time solving it). I certainly can't tell you based on
the number of basic blocks or instructions.
Want another example?
Take GVN-PRE, which is potentially N^2.
The time bound is related to the largest number of distinct values
that occur in a function. Even on very large functions, this may not
be a lot.
We have plenty of incredibly large functions that just don't take a
lot of time in PRE.
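As a made-up illustration (not a real benchmark) of why sheer size
doesn't tell you much here: a function can be textually enormous while
value numbering sees only a handful of distinct values.

  int
  f (int a, int b)
  {
    int x0 = a + b;  /* one distinct value: a + b */
    int x1 = a + b;  /* the same value, in a different statement */
    int x2 = a + b;
    /* ... imagine thousands more copies of the same expression.
       GVN maps them all to the single value "a + b", so the number
       of distinct values stays tiny even though the function itself
       is huge.  */
    return x0 + x1 + x2;
  }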
The only way to make reasonable decisions about time is on a per-pass
basis, because our passes just don't have bad cases that correspond to
global heuristics.