Re: On-Demand range technology [3/5] - The Prototype

Jeff Law Tue, 28 May 2019 07:42:09 -0700

On 5/23/19 7:10 AM, Richard Biener wrote:
> On Thu, May 23, 2019 at 3:29 AM Andrew MacLeod <amacl...@redhat.com> wrote:
>>
>> There is a functioning prototype in branch “ssa-range” which is a proof
>> of concept that the approach is functional as well as quick, and can be
>> used to answer questions which come up regarding what it can and can’t
>> do.  Our last merge was on April 13th, so it's fairly up to date.
>>
>> We have implemented a flexible range class (irange) which allows for
>> multiple subranges to represent a range, and which can be extended in
>> the future to types other than integral. We use this throughout, but it
>> could be replaced in the ranger with any similar API. Conversion
>> routines are also provided to convert from irange to value_range and
>> value_range to irange.
>>
>> A full set of tree_code range-op routines are implemented.  We have
>> commoned as much code as possible with the existing VRP range extraction
>> code.  Also, we have added additional code for calculating the other
>> operands from a known result in numerous cases.
>>
>> The code base in VRP has been modified (via a flag) to
>>      - Work completely with the native value_range like it does today.
>>      - Use irange and the range-ops component under the covers to
>> extract ranges. Requests in VRP are then converted from value_ranges to
>> irange, called into the range-op routines, and then converted back to
>> value_range for VRP/EVRP’s use.
>>      - Do operations both ways and compare the results to make sure both
>> agree on the range, and trap if they do not.
>>
>> The branch defaults to the compare and trap mode to ensure everything is
>> working correctly. This has been our correctness model for statement
>> range extraction and was active during the Fedora package builds. The
>> only time we disabled it was to do performance runs vs RVRP, and were
>> looking at both branch and trunk times for EVRP and VRP.
>>
>> Of note, we drop all symbolics in ranges to VARYING on everything except
>> PLUS and MINUS, which we leave as native calculations if there are
>> symbolics present.  More on symbolics later.
>>
>> A VRP like pass called RVRP has been implemented.
>>      - The vr-values statement simplification code has been factored out
>> to be range agnostic, meaning that these routines can operate on either
>> value_range or irange. Thus, we are using a common code base to perform
>> statement simplification as well.
>>      - For complete compatibility with EVRP, the RVRP pass builds
>> dominators and instantiates the SCEV loop information so we have loop
>> range info available. RVRP does not need this info to run, but would
>> miss some of the test cases which depend on loop ranges.
>>      - RVRP is set up to demonstrate it can process the IL in multiple
>> directions and bootstraps/passes all tests in all directions.
>>          * Dominator order
>>          * Post-dominator order
>>          * BB1 thru BBn
>>          * BBn thru BB1
>>          * branch-only mode where only branches at the end of each BB
>> are examined for folding opportunities
>>
>> 4 additional passes have been converted to use the ranger model:
>>      - sprintf - removed the dominator building/walking
>>      - warn alloca - replaced calls to get global ranges with calls that
>> now return context sensitive ranges.
>>      - warn restrict - just replaced EVRP range calls with ranger calls.
>>      - backwards threader - enhanced to use contextual range information
>> to make additional threading decisions.
>>
>>
>> Symbolic Ranges
>>
>> One big concern last year expressed was my decision to abolish symbolic
>> ranges.
>>
>> I continue to maintain that there is no need to track the range of x_2
>> as [y_3 + 5, MAX] for x_2 = y_3 + 5.   All you need to do is look at the
>> definition of x_2, and the same information is exposed right there in
>> the IL.  If one requires the symbolic information, the same on-demand
>> process could lookup that information as well. This in turn, makes the
>> code for ranges much simpler, easier to maintain, and less likely to
>> introduce bugs.
>>
>> We have found through our prototype that symbolics in ranges are not
>> nearly as prevalent as one would think.   During the work to share a
>> common code base with VRP, we found that symbolic ranges are irrelevant
>> for everything except PLUS_EXPR and MINUS_EXPR. The shared code in our
>> branch drops symbolics to varying immediately for everything else, and
>> it has no impact on EVRP, or VRP or any tests we can find anywhere.
>> Furthermore, we never trapped while comparing ranges generated by VRP
>> versus generating them with range-ops which drops symbolics to varying.
>>
>> We tried modifying VRP such that we don’t even create symbolic
>> endpoints, but rather revert to VARYING always.  We can find no test
>> case that fails because a range is not calculated properly due to
>> resolving these endpoints.
>>
>> There are a few that fail due to the symbolic being used to help track
>> relationals.. Ie
>>
>>       x_2 = y_3 + 5
>>      If (x_2 > y_3)     // can be folded since we know x_2 must be < y_3
>>
>> VRP generates a range for x of  [ y_3+5, MAX ] and at various points
>> uses that to infer a relational or equivalence.   Ie, it becomes easy to
>> tell that the condition must always be true since the lower bound of the
>> range is y_3 + 5.
>>
>> I argue this is not a range question, but rather a different problem
>> which VRP has chosen to solve by piggybacking on the range
>> representation.  This leads to complications/complexity when trying to
>> evaluate ranges because they must constantly be on the lookout for
>> symbolics.  This information is then carried around for the life of the
>> pass, even if it is never used.  It also forces any
>> relational/equivalency queries to be handled within the context of the
>> VRP pass.
>>
>> This aspect of symbolics would be handled by a relational/equivalence
>> processing engine that would be follow on work.  Using the same basic
>> model as ranges, each tree code is taught to understand the relation
>> between its operands, and then we can answer equivalency and relational
>> accurately as well.  It would be available for any pass to use
>> independent of ranges. I will expound upon that a bit in the future
>> directions section.
> 
> While I agree that symbolic ranges are a complication and that most
> cases it currently handles are not "value-range" things I do not agree
> with the idea that we can simply remove handling them and deal
> with the fallout in some later point in the future.  Eric may also be
> able to show you "real" symbolic value-range magic.
> 
> Note that symbolic ranges are already restricted to PLUS_EXPR
> and MINUS_EXPR (and NEGATE_EXPR I think).  There are
> also "symbolic" (non-integer constant) ranges like [&a, &a].
> I've never seen [&a, &MEM[&a + 4]] but we wouldn't reject those
> I think.
> 
> You may have noticed that EVRP does not use symbolic ranges.
> 
> As already said I'd like to see VRP go but obstackles are
> symbolic ranges and jump-threading (with Jeff promising
> to handle the jump-threading part in the past).
Right.  FWIW, one of the follow-on items to this work is Aldy's
improvements to backwards jump threading which utilizes the ranger
framework -- the primary purpose of that work is to eliminate the need
for jump threading in VRP.


jeff

Re: On-Demand range technology [3/5] - The Prototype

Reply via email to