With the opening of stage 1, it seems like an appropriate time to talk
about our vision for Ranger in GCC 12.
First, we have queued up a list of
improvements/fixes/adjustments/performance tweaks to the existing code which
we'll check in first. We'll cover each of those in the patches as we
submit them.
Second: We've been busy since last November working on the missing
pieces. A high level overview of the changes:
1. There is now a Relation Oracle which can indicate whether there is a
relationship between 2 ssa-names. It is contextual and dominance
based, and a totally separate component. It also contains an
equivalency tracker which can indicate what ssa-names are equivalent
to other names.
2. Equivalencies/relations are integrated into range-query, range-ops
and gori, and can be used to assist with improving ranges.
3. A new common API interface accessed via cfun has been created. This
will make it transparent whether the source of range information is
global ranges, or a ranger invoked at the request of the pass.
Clients will no longer be required to manage an instance on their own.
4. A new threader has been developed which also utilizes some of the
dependency properties of GORI and achieves 45% more thread-able paths.
5. With all these additions, ranger/gori has also been reorganized a
bit to:
* Better integrate the new components,
* Expose non-range-ops stmts to the same generalized interface
for folding and outgoing edge calculations,
* Expose relations to all stmt kind evaluations (including builtin
functions).
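To give a feel for how items 1 and 2 fit together, here is a toy sketch of an equivalence-aware relation lookup being used to fold a comparison. It is an illustration only, not the oracle itself: every name in it (ToyOracle, fold_compare, the kLt-style constants) is made up for the example, and it uses one flat scope, whereas the real component is contextual and dominance based.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy model only; every name here is hypothetical, not GCC's.
// A relation is the set of orderings still possible between two names.
using Relation = unsigned;
constexpr Relation kLt = 1, kEq = 2, kGt = 4;
constexpr Relation kLe = kLt | kEq, kGe = kGt | kEq;
constexpr Relation kNe = kLt | kGt, kAny = kLt | kEq | kGt;

// Rewrite "a REL b" as the relation that holds for "b REL' a".
constexpr Relation swap_operands(Relation r) {
  return (r & kEq) | ((r & kLt) ? kGt : 0u) | ((r & kGt) ? kLt : 0u);
}

class ToyOracle {
public:
  // Record that "a REL b" holds. Equalities merge equivalence sets.
  void record(const std::string &a, Relation r, const std::string &b) {
    if (r == kEq) {
      std::string ra = find(a), rb = find(b);
      if (ra != rb) parent_[ra] = rb;
    } else {
      facts_.push_back({a, r, b});
    }
  }

  // Everything known about "a ? b", looking through equivalences.
  Relation query(const std::string &a, const std::string &b) const {
    std::string ra = find(a), rb = find(b);
    if (ra == rb) return kEq;
    Relation known = kAny;
    for (const Fact &f : facts_) {
      std::string rx = find(f.lhs), ry = find(f.rhs);
      if (rx == ra && ry == rb) known &= f.rel;
      else if (rx == rb && ry == ra) known &= swap_operands(f.rel);
    }
    return known;
  }

private:
  struct Fact { std::string lhs; Relation rel; std::string rhs; };

  // Representative of the equivalence set containing n.
  std::string find(std::string n) const {
    for (auto it = parent_.find(n); it != parent_.end(); it = parent_.find(n))
      n = it->second;
    return n;
  }

  std::map<std::string, std::string> parent_;
  std::vector<Fact> facts_;
};

// Fold "a TESTED b" given what is known: true, false, or unknown.
std::optional<bool> fold_compare(Relation known, Relation tested) {
  if ((known & ~tested) == 0) return true;   // no possible ordering violates it
  if ((known & tested) == 0) return false;   // no possible ordering satisfies it
  return std::nullopt;
}

// Usage: a_1 < b_2 and c_3 == a_1, so "c_3 >= b_2" folds to false.
inline void relation_example() {
  ToyOracle o;
  o.record("a_1", kLt, "b_2");
  o.record("c_3", kEq, "a_1");
  assert(fold_compare(o.query("c_3", "b_2"), kGe).value() == false);
}
```

Encoding a relation as a bitmask of the possible orderings keeps the fold to two mask tests; that is a choice made for this sketch and says nothing about how the oracle stores its data.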
Over the next few weeks we'll start introducing patch sets to bring all
these changes to trunk. There will be more details in each of those
patch sets, and we'll space them out a bit to help identify any
immediate issues. I'll also address any performance impact with each
patch set as it is introduced.
Third: Current status vs EVRP/VRP.
1. We constantly run builds against a hybrid benchmark of "new gets" to
ensure we aren't losing opportunities. When either pass in hybrid
mode produces a constant value the other doesn't, it's tagged in the
listing. Running at -O2 over 396 pre-processed gcc source files,
Ranger picks up 4800+ more constants/stmt folds than EVRP does, and
EVRP currently finds 6 constants that Ranger does not get. We are
still working on those :-)
2. With all the additional work, Ranger is approaching a place where I
would consider it a viable replacement for EVRP. Once all the
outstanding patch sets have been applied and allowed to settle for a
bit, I would propose we change the default evrp-mode from the
current hybrid mode to "ranger-only". I would also suggest leaving
the hybrid ability in place for a while to allow back checking and
as a fail-safe. Then we can look at trimming out code we don't need
any more later in stage 1.
3. Aldy has been doing some comparisons against original VRP recently
and we think ranger is becoming a pretty solid replacement. We'll
bring this up again later in the summer when all the ranger changes
are stable and re-evaluate. VRP is trickier to compare against
thanks to the ASSERT_EXPRS, but running another ranger pass
immediately before VRP1 & VRP2 and comparing conditions that are
folded away we see VRP1: 5482 folds vs Ranger: 6338 folds, and
VRP2: 376 folds vs Ranger: 467 folds. VRP1 currently folds 77
branches Ranger misses, and likewise VRP2 folds 199 that Ranger
misses. We'll eventually get to investigating what those are and
resolving them. In summary, Ranger currently folds about 15% (vs VRP1)
and 24% (vs VRP2) more branches, but also misses some.
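For anyone who wants to check the percentages against the fold counts quoted above, the arithmetic is simply (Ranger - VRP) * 100 / VRP, truncated to a whole number. The helper name percent_more is made up for this snippet.

```cpp
// Integer percentage by which `ours` exceeds `base`, truncated toward zero.
constexpr int percent_more(int ours, int base) {
  return (ours - base) * 100 / base;
}

// Fold counts quoted above: VRP1 5482 vs Ranger 6338, VRP2 376 vs Ranger 467.
static_assert(percent_more(6338, 5482) == 15, "Ranger vs VRP1");
static_assert(percent_more(467, 376) == 24, "Ranger vs VRP2");
```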
Fourth: Plans for GCC 12/stage 1
I have a very long laundry list of things; the challenge is determining
what we can get into this release. :-)
We have a few high level goals for this release. Ideally, we'd like to
* Run EVRP in ranger only mode.
* Remove VRP and replace it with an identical EVRP ranger-only pass.
* Remove any other compiler references to symbolics in value_range.
* Move irange back to a wide-int only implementation once we no longer
need to support legacy compatibility. This should also result in a
noticeable speed increase.
Additional focus during this release will be for
* ease of use: ensuring the interface for all passes is easy to use
and performs well.
* performance improvements.
* resolve as many of the 29 or so outstanding PRs as possible.
The longer laundry list of outstanding projects consists of:
* Expand the cfun interface such that there is a persistent global
cache when a ranger isn't active and all passes use the new
range_query interface rather than the current SSA_NAME_RANGE_INFO()
or get_range_info() mechanism.
* Add stmt side effects to register ranges that are produced against
operands that are not a result. This will also replace the current
immediate-use tracking non-null processing code, as well as non-zero
after divides, etc.
* Allow injection of new ranges when a pass can determine something
outside ranger's abilities.
* Additional enhancement to the basic relation processing. We are
currently missing various transitive relationships that would be
handy, among others.
* The basic ability to use relations on fold/op1_range/op2_range
calculations is there, but not fully realized. There are numerous
instances to be fleshed out.
* Flesh out the ability to use relations with all
the other kinds of stmts, not just range-ops, but also builtins etc.
* Revamp the on-entry range cache to be more dominance based - purely
performance.
* Bit tracking support.
* vrange support: a new general base class allowing us to implement
ranges for floating point, native pointers, complex int, etc using
the same general framework.
* Remove legacy code once it is no longer needed.
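As a rough picture of the stmt side effects item in the list above, the idea is that executing a statement can tell us something about an operand rather than about the result. The sketch below is a toy under stated assumptions: ToyStmt, StmtKind and ToySideEffects are hypothetical names, the model is a single straight-line block, and treating a dereferenced pointer as non-null is assumed here rather than taken from any existing code.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical statement kinds for this sketch; not GCC's gimple codes.
enum class StmtKind { Divide, Deref, Other };

struct ToyStmt {
  StmtKind kind;
  std::string result;   // name defined by the statement
  std::string operand;  // divisor for Divide, pointer for Deref
};

// Flat, single-block model: a fact holds for everything after the statement.
class ToySideEffects {
public:
  void process(const ToyStmt &s) {
    // Getting past a divide implies the divisor was non-zero; getting
    // past a dereference is assumed to imply the pointer was non-null.
    if (s.kind == StmtKind::Divide || s.kind == StmtKind::Deref)
      nonzero_.insert(s.operand);
  }

  bool known_nonzero(const std::string &name) const {
    return nonzero_.count(name) != 0;
  }

private:
  std::set<std::string> nonzero_;
};

// Usage: after q_3 = x_1 / y_2, a later test of y_2 against zero is decidable.
inline void side_effect_example() {
  ToySideEffects fx;
  fx.process({StmtKind::Divide, "q_3", "y_2"});
  assert(fx.known_nonzero("y_2"));
  assert(!fx.known_nonzero("q_3"));  // nothing is learned about the result
}
```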
I further anticipate additional uses of ranger in other passes, like the
new threading pass Aldy plans to introduce.
The final thing I plan to incorporate is some documentation of ranger
technology. There are a lot of features and data available in the
Ranger ecosystem that could be useful elsewhere. No one really knows
about them, and rather than trying to write everything up at once, I
propose to write a *[ranger tech]* post once a week. I'd send it to this
gcc list introducing some Ranger component and how it works. Small
pieces like this make it more likely someone will read and identify
something useful to them. And perhaps we can find ways to make them more
generally applicable to other uses. These will then be incorporated into
a wiki-page for documentation purposes, and over time should form a good
basis of documentation without overwhelming anyone (including myself :-)
with a huge doc drop all at once.
It's been a busy 6 months, and it looks like it will be a very busy stage 1 as well.
Please, feel free to offer comments on this plan or on individual
components as we introduce them.
Andrew & Aldy