On-Demand range technology [2/5] - Major Components : How it works

Andrew MacLeod Wed, 22 May 2019 18:29:01 -0700

This note will talk about the 4 major components of the prototype and explain how they work together. I will be fairly light on detail, just to give an overview; we can delve into whatever details are needed.

- Range-ops : Range operations at the statement level
- GORI - Generates Outgoing Range Info : Ranges generated by basic blocks
- GORI Cache - combining and caching basic-block range info
- Ranger - API and query control


1 * Range-ops
--------------------

The first major component is the centralized range operation database. This is where range operations for tree-codes are implemented. The compiler currently has code which can fold ranges, but the new mechanism is a general purpose solver which can solve for other operands. If there are N input/output operands, and we have ranges for N-1, it is often possible to derive the missing range, i.e.

    lhs = op1 + op2

The existing code in the compiler can solve for LHS given ranges for OP1 and OP2. This has been extended so that we can also sometimes solve for either op1 or op2, i.e.

    [20,40] = op1 + [5, 10]
...can calculate that op1 has the range [10, 35], since op1 = lhs - op2 and interval subtraction gives [20 - 10, 40 - 5].

This ability to solve for the other operands provides the ability to calculate ranges in the reverse of the order we are accustomed to, and is key to enabling the on-demand range approach.

    a_2 = b_1 - 20
    if (a_2 < 40)

A conditional jump has an implicit boolean LHS depending on which edge is taken. To evaluate ranges on the TRUE edge of the branch, the LHS is [1,1]. To calculate the range for a_2 we simply solve the equation:

    [1,1] = a_2 < 40

which provides the answer as [0,39] (the lower bound of 0 implies a_2 is unsigned). Furthermore, substituting this range for a_2 as the LHS of its definition statement:

    [0,39] = b_1 - 20

The same evaluation mechanism can calculate a result for b_1 on the true edge as [20,59]. This is the key feature which allows the rest of the algorithm to work in a general way.
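
The two solving steps above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `Range`, `fold_minus_const`, `solve_op1_minus_const` and `solve_op1_less_const` are hypothetical names, not the prototype's API. The sketch handles only a constant second operand and assumes an unsigned type with no wraparound.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [lo, hi] over unsigned values.
// This is an illustration only, not the range class used by the prototype.
struct Range {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Forward direction: fold lhs = op1 - c for a constant c.
// Assumes no wraparound, i.e. every value of op1 is >= c.
Range fold_minus_const (Range op1, std::uint64_t c)
{
  assert (op1.lo <= op1.hi && op1.lo >= c);
  return Range{op1.lo - c, op1.hi - c};
}

// Reverse direction: given lhs = op1 - c, solve for op1 as lhs + c.
// Assumes lhs.hi + c does not overflow.
Range solve_op1_minus_const (Range lhs, std::uint64_t c)
{
  assert (lhs.lo <= lhs.hi && lhs.hi <= UINT64_MAX - c);
  return Range{lhs.lo + c, lhs.hi + c};
}

// Reverse direction for the comparison "op1 < c".  EDGE_VALUE is the
// implicit boolean LHS: true on the TRUE edge, false on the FALSE edge.
// TYPE_MAX is the largest value of op1's (unsigned) type.
Range solve_op1_less_const (bool edge_value, std::uint64_t c,
                            std::uint64_t type_max)
{
  assert (c > 0 && c <= type_max);   // keep both edges non-empty
  return edge_value ? Range{0, c - 1} : Range{c, type_max};
}
```

With a 32-bit unsigned a_2, `solve_op1_less_const (true, 40, UINT32_MAX)` yields [0,39], and feeding that result to `solve_op1_minus_const (..., 20)` yields [20,59] for b_1, matching the example.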

All operations are driven from this component, and the only thing required to add support for another tree code is to implement one or more methods for it here, and the rest of the range mechanism will simply work with it.
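
One way to picture such a database is a table of operator objects indexed by code, each supplying a forward fold and a reverse solve. The sketch below is a guess at the general shape, not the prototype's classes: `Code`, `RangeOperator`, `PlusOperator`, `MinusOperator` and `range_operator_for` are all hypothetical, and overflow is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signed interval [lo, hi]; overflow is ignored in this sketch.
struct Range {
  std::int64_t lo;
  std::int64_t hi;
};

// Stand-in for the tree codes the real table is indexed by.
enum class Code { Plus, Minus };

// One entry per operation: fold computes the LHS, solve_op1 runs the
// statement "backwards" to recover op1 from the LHS and op2.
class RangeOperator {
public:
  virtual ~RangeOperator () = default;
  virtual Range fold (Range op1, Range op2) const = 0;
  virtual Range solve_op1 (Range lhs, Range op2) const = 0;
};

class PlusOperator : public RangeOperator {
public:
  Range fold (Range op1, Range op2) const override
  { return Range{op1.lo + op2.lo, op1.hi + op2.hi}; }
  // lhs = op1 + op2  =>  op1 = lhs - op2
  Range solve_op1 (Range lhs, Range op2) const override
  { return Range{lhs.lo - op2.hi, lhs.hi - op2.lo}; }
};

class MinusOperator : public RangeOperator {
public:
  Range fold (Range op1, Range op2) const override
  { return Range{op1.lo - op2.hi, op1.hi - op2.lo}; }
  // lhs = op1 - op2  =>  op1 = lhs + op2
  Range solve_op1 (Range lhs, Range op2) const override
  { return Range{lhs.lo + op2.lo, lhs.hi + op2.hi}; }
};

// The "database": supporting a new code means adding one class and one case.
const RangeOperator &range_operator_for (Code code)
{
  static const PlusOperator plus{};
  static const MinusOperator minus{};
  switch (code)
    {
    case Code::Plus: return plus;
    case Code::Minus: return minus;
    }
  assert (false && "unhandled code");
  return plus;
}
```

Callers never name a concrete operator; they look one up by code and call `fold` or `solve_op1`, which is why the rest of the mechanism would not need to change when a code is added.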



2 * GORI
------------

The second component is the “Generates Outgoing Range Info” engine. This is a basic-block oriented component which determines what ssa-names have ranges created on outgoing edges, and how to calculate those ranges.

The last statement in the block is examined to see if it is a branch which generates range info, and GORI then determines which ssa-names in the block can have ranges calculated. It quickly walks the use-def chain from the branch to find other definitions within the block that can impact the branch and could also have their ranges calculated. This summary is then cached for future use.

The GORI component also uses this summary info to perform the calculations necessary to determine the outgoing range for any ssa-name which can be determined. For example:

    c_3 = foo ()
    a_2 = b_1 - 20
    if (a_2 < 40)

The summary information would indicate that b_1 and a_2 can have their outgoing ranges calculated for this block, and GORI uses the cached information to quickly calculate them when required.

The API consists of 2 basic methods, query and calculate:
 - bool has_edge_range_p (edge, name);
 - range outgoing_edge_range (edge, name);

If a query is made for any other ssa-name, it simply returns false and says this block does not generate a range for that name. This is a key rationale for the summary, so we only spend time processing names for blocks that actually have ranges generated.
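
The summary and the query half of the API might look roughly like the sketch below. It is a toy model under several assumptions: `Stmt`, `Block`, `compute_exports` and `GoriSummary` are hypothetical, statements carry a `solvable` flag instead of a real tree code, and `has_edge_range_p` here takes a block index rather than an edge.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified statement "lhs = f (uses...)".
struct Stmt {
  std::string lhs;                 // ssa-name defined by the statement
  std::vector<std::string> uses;   // ssa-names appearing on the RHS
  bool solvable;                   // can range-ops solve for the operands?
};

struct Block {
  std::vector<Stmt> stmts;                // statements before the branch
  std::vector<std::string> branch_uses;   // names used by the final branch
};

// Walk the use-def chain backwards from the branch, collecting every name
// that feeds the condition through solvable statements in this block.
std::set<std::string> compute_exports (const Block &bb)
{
  std::set<std::string> exports (bb.branch_uses.begin (),
                                 bb.branch_uses.end ());
  for (auto it = bb.stmts.rbegin (); it != bb.stmts.rend (); ++it)
    if (it->solvable && exports.count (it->lhs) != 0)
      exports.insert (it->uses.begin (), it->uses.end ());
  return exports;
}

class GoriSummary {
public:
  explicit GoriSummary (std::vector<Block> blocks)
    : m_blocks (std::move (blocks)) {}

  // Query half of the API, keyed by block index rather than by edge.
  bool has_edge_range_p (std::size_t bb, const std::string &name)
  {
    assert (bb < m_blocks.size ());
    auto it = m_cache.find (bb);
    if (it == m_cache.end ())   // build the summary once, then reuse it
      it = m_cache.emplace (bb, compute_exports (m_blocks[bb])).first;
    return it->second.count (name) != 0;
  }

private:
  std::vector<Block> m_blocks;
  std::map<std::size_t, std::set<std::string>> m_cache;
};
```

For the three-statement block above, the export set is {a_2, b_1}: a_2 is used by the branch, b_1 feeds a_2 through a solvable subtraction, and c_3 never reaches the condition, so a query for c_3 returns false without any range calculation.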



3 * GORI cache
---------------------

The third component builds on the basic block GORI component, and adds the ability to traverse the CFG and combine the various ranges provided by each basic block into a cache.

It provides both a global-range cache and a range-on-entry cache summarizing the range of an ssa-name on entry to the block.

The range-on-entry cache is self filling, and iteratively evaluates back edges. There is a single API entry point which asks for the range on entry, and if the data has not been calculated/cached already, it spawns the following process to calculate the data:

    * Walk the CFG from the current block to the definition block, and/or any block which has range-on-entry information already calculated. These end points are the source blocks.

    * Any non-source blocks visited are added to a worklist to be updated.

    * Starting with the source blocks, push the known outgoing range from each block to each successor and update their live-on-entry values when they are visited. If the incoming range changes, mark this block’s successors as requiring updating.

    * Continue iterating until no more updates are required.

The ordering is planned by the ranger such that if there are no back edges, it is typically a straightforward single pass. When back edges are involved, the amount of iterating is typically very small. The updating is restricted to a single ssa-name, meaning it doesn’t get into feedback loops with other names or PHIs, so it usually converges very quickly.
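
The fill process can be sketched for a single ssa-name as follows. This is a minimal model, not the prototype's cache: `Range`, `EntryCache` and all member names are hypothetical, the CFG is given as predecessor lists, and the static per-edge ranges that GORI would supply are stored in a plain map. For brevity the sketch recomputes each block's incoming range from its predecessors (a pull formulation) rather than pushing to successors; blocks are still re-queued only when a predecessor's value changes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <utility>
#include <vector>

// Hypothetical interval; "undefined" means nothing has reached the block yet.
struct Range {
  bool undefined = true;
  long lo = 0, hi = 0;
  Range () = default;
  Range (long l, long h) : undefined (false), lo (l), hi (h) {}
};

bool same (const Range &a, const Range &b) {
  if (a.undefined || b.undefined)
    return a.undefined == b.undefined;
  return a.lo == b.lo && a.hi == b.hi;
}

// Smallest single interval covering both inputs.
Range hull (const Range &a, const Range &b) {
  if (a.undefined) return b;
  if (b.undefined) return a;
  return Range (std::min (a.lo, b.lo), std::max (a.hi, b.hi));
}

Range intersect (const Range &a, const Range &b) {
  if (a.undefined || b.undefined) return Range ();
  long lo = std::max (a.lo, b.lo), hi = std::min (a.hi, b.hi);
  return lo <= hi ? Range (lo, hi) : Range ();
}

// Range-on-entry cache for a single ssa-name.
struct EntryCache {
  std::vector<std::vector<int>> preds;  // preds[b]: predecessors of block b
  int def_bb = 0;                       // block containing the definition
  Range global;                         // range of the name at its definition
  std::map<std::pair<int, int>, Range> edge_range;  // static (pred, succ) ranges
  std::map<int, Range> on_entry;        // the cache itself

  // Range leaving PRED towards SUCC, refined by any static edge range.
  Range outgoing (int pred, int succ) const {
    Range r = pred == def_bb ? global : on_entry.at (pred);
    auto it = edge_range.find (std::make_pair (pred, succ));
    return it == edge_range.end () ? r : intersect (r, it->second);
  }

  Range range_on_entry (int bb) {
    assert (bb != def_bb && bb >= 0
            && static_cast<std::size_t> (bb) < preds.size ());
    auto hit = on_entry.find (bb);
    if (hit != on_entry.end ())
      return hit->second;

    // Walk back to the source blocks; every other block visited is queued.
    std::deque<int> worklist;
    std::vector<int> updating, stack (1, bb);
    on_entry[bb] = Range ();
    while (!stack.empty ()) {
      int b = stack.back ();
      stack.pop_back ();
      worklist.push_front (b);
      updating.push_back (b);
      for (int p : preds[b])
        if (p != def_bb && on_entry.count (p) == 0) {
          on_entry[p] = Range ();
          stack.push_back (p);
        }
    }

    // Iterate until no incoming range changes.
    while (!worklist.empty ()) {
      int b = worklist.front ();
      worklist.pop_front ();
      Range incoming;
      for (int p : preds[b])
        incoming = hull (incoming, outgoing (p, b));
      if (same (incoming, on_entry[b]))
        continue;
      on_entry[b] = incoming;
      for (int s : updating)  // re-queue the successors of b being updated
        if (std::find (preds[s].begin (), preds[s].end (), b) != preds[s].end ())
          worklist.push_back (s);
    }
    return on_entry[bb];
  }
};
```

As a usage example, take a name defined in block 0 with range [0,100], a loop header block 1 that branches on `name < 40` to block 2 (true) and block 3 (false), and a back edge from block 2 to block 1. With edge ranges [0,39] on 1->2 and [40,100] on 1->3, a query for block 2 settles on [0,39] after revisiting the header once, and a later query for block 3 reuses the cached header value and returns [40,100].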

It is important to note that this works exclusively with static ranges of only a single ssa-name at a time, i.e. ranges which are implicitly exposed in the IL, and only for the name being examined. The values returned by these queries are not dependent on changes in other ssa-names, which is how the iteration process never gets out of control.

The ranger making the calls to fill this cache has a higher level overview, and requests these ranges in definition order such that any ssa-names feeding the definition of a name having its cache filled are resolved first, providing the best possible results the first time.

    a_2 = b_1 - 20
    if (a_2 < 40)

If the range for a_2 is requested on the true side, the ranger will first calculate the range of b_1 on entry to the block, then use this to calculate the global range of a_2, and finally calculate the outgoing range on the desired edge.

If at some later point it is discovered that the incoming range of b_1 has changed in such a way that it has an impact on the outgoing range of a_2, the iterative update process can be reapplied by the ranger to update the relevant cache entries. This is usually only required in cases where multiple ssa-names are affected by back edges and feed each other.


4 * Ranger
----------------

The final component is the Ranger, which provides a simple API to clients where they can make various simple requests:

    - Range of an ssa-name at a statement location
    - Range of the result of a stmt
    - Range of an ssa-name on an edge
    - Range of an ssa-name after the last statement in a block
    - Range of an ssa-name on entry to a block

The ranger is responsible for processing the IL, walking use/def chains and coordinating the various calls into the GORI components to pre-satisfy as many conditions as possible before any cache filling is performed. It is also responsible for triggering any additional updates which may be required due to newly discovered ranges. We don’t in fact do this yet, but it turns out to be rarely required.


Summary
---------------

All evaluations are done on-demand. If no queries are made, there is no cost. The Ranger has no prerequisites other than a valid IL and CFG. It does not need dominators, nor does it require any other changes within an optimization pass in order to be used.

When queries are made, only the minimum effort required is made to satisfy the request. This is a big win when it comes to passes which only need occasional range information, such as warning passes. Some of these passes actually instantiate a new ranger each time they make a request, as a demonstration of the low overhead.

As for passes such as VRP which require complete range information, the caching process prevents the same information from being calculated more than once. This means a similar amount of work is done with this approach as with the traditional top-down approach currently being used, except we also process the back edges and iteratively solve for them.


Comments and feedback always welcome!

Thanks
Andrew
