Now that stage 1 has reopened, I’d like to revisit the discussion of the
technology and the experience we have gained from the Ranger project I
brought up last year: https://gcc.gnu.org/ml/gcc/2018-05/msg00288.html .
(The original wiki pages are now out of date; I will work on updating
them soon.)
The Ranger is designed to evaluate ranges on demand rather than through
a top-down approach. This means you can ask for a range from anywhere,
and it walks back through the IL, satisfying any preconditions and doing
the required calculations. It uses a cache to avoid redoing work. If
ranges are processed in forward dominator order, it’s not much different
from what we do today. Due to its on-demand nature, the order in which
you process things has minimal impact on the overall time: you can do it
in reverse dominator order and get similar times.
It requires no outside preconditions (such as dominators) to work, and
has a very simple API: simply query the range of an ssa_name at any
point in the IL, and all the details are taken care of.
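For example, a query from a client pass might look something like this
(a sketch only; the header, class, and method names here are
illustrative and may not match the branch exactly):

  // Sketch only: "gimple-range.h", gimple_ranger and range_of_expr are
  // illustrative names, not necessarily the exact branch API.
  #include "gimple-range.h"

  static void
  query_example (gimple *stmt, tree name)
  {
    gimple_ranger ranger;   // no dominators or other setup required
    value_range r;

    // Ask for the range of 'name' as it reaches 'stmt'.  The ranger
    // walks back through the IL on demand and caches what it computes.
    if (ranger.range_of_expr (r, name, stmt) && !r.varying_p ())
      {
        // 'r' now holds the range and can be used to simplify 'stmt',
        // issue a warning, fold a condition, and so on.
      }
  }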
We have spent much of the past 6 months refining the prototype (branch
“ssa-range”) and adjusting it to share as much code with VRP as
possible. The two now use a common code base for extracting ranges from
statements, as well as for simplifying statements.
The Ranger deals with just ranges. The other aspects of VRP are
intended to be follow-on work that integrates tightly with it, but they
are also independent and would be available for other passes to use.
These include:
- Equivalency tracking
- Relational processing
- Bitmask tracking
We have implemented a VRP pass that duplicates the functionality of EVRP
(other than the bits mentioned above), and have converted a few other
passes to use the interface. I do not anticipate those missing bits
having a significant impact on the results.
The prototype branch is quite stable and can successfully build and test
an entire Fedora distribution (9174 packages). There is an issue with
switches, which I will discuss later, whereby the constant range of a
switch edge is not readily available and is exponentially expensive to
calculate. We have a design to address that problem, and in the common
case we are about 20% faster than EVRP.
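To illustrate what “the constant range of a switch edge” refers to, here
is a small source-level example (the ranges in the comments are what a
query on each outgoing edge has to produce):

  // Illustration only: the range of 'code' implied by each outgoing
  // edge of the switch.
  int
  classify (int code)
  {
    switch (code)
      {
      case 3:
      case 4:
      case 5:
        return 1;   // on this edge 'code' is known to be [3, 5]
      case 10:
        return 2;   // on this edge 'code' is known to be [10, 10]
      default:
        return 0;   // on this edge 'code' is anything but those values
      }
  }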
When used in passes that only require ranges for a small number of
ssa-names, we see significant improvements. The sprintf warning pass,
for instance, no longer needs to compute dominators or follow the walk
order they force on it. We see a 95% speedup (yes, 1/20th of the
overall time!). This is primarily because there is no additional
overhead and only the few things that are actually needed get
calculated; see the sketch below. The walloca and wrestrict passes
follow a similar model, but as they have not been converted to use EVRP
ranges yet, we don’t see similar speedups there.
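As a made-up illustration of that usage pattern: the sprintf warning
pass only cares about the range of one argument at one particular call,
so an on-demand query can answer exactly that question without
processing the rest of the function:

  #include <stdio.h>

  // Hypothetical input: the warning pass only needs the range of 'n'
  // at this single call site; with an on-demand query there is no need
  // to compute dominators or ranges for anything else in the function.
  void
  fill (char buf[4], int n)
  {
    if (n >= 0 && n < 100)
      sprintf (buf, "%d", n);   // 'n' is in [0, 99], so "%d" fits in buf
  }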
That is the executive summary. I will go into more detail on each of the
major items mentioned above in follow-on notes, so that comments and
discussion can focus on one thing at a time.
We think this approach is very solid and has many significant benefits
for GCC. We’d like to address any concerns you may have, and work toward
integrating this model into the code base during this stage 1.
Comments and feedback always welcome!
Thanks
Andrew