On 5/23/19 7:10 AM, Richard Biener wrote: > On Thu, May 23, 2019 at 3:29 AM Andrew MacLeod <amacl...@redhat.com> wrote: >> >> There is a functioning prototype in branch “ssa-range” which is a proof >> of concept that the approach is functional as well as quick, and can be >> used to answer questions which come up regarding what it can and can’t >> do. Our last merge was on April 13th, so it's fairly up to date. >> >> We have implemented a flexible range class (irange) which allows for >> multiple subranges to represent a range, and which can be extended in >> the future to types other than integral. We use this throughout, but it >> could be replaced in the ranger with any similar API. Conversion >> routines are also provided to convert from irange to value_range and >> value_range to irange. >> >> A full set of tree_code range-op routines are implemented. We have >> commoned as much code as possible with the existing VRP range extraction >> code. Also, we have added additional code for calculating the other >> operands from a known result in numerous cases. >> >> The code base in VRP has been modified (via a flag) to >> - Work completely with the native value_range like it does today. >> - Use irange and the range-ops component under the covers to >> extract ranges. Requests in VRP are then converted from value_ranges to >> irange, called into the range-op routines, and then converted back to >> value_range for VRP/EVRP’s use. >> - Do operations both ways and compare the results to make sure both >> agree on the range, and trap if they do not. >> >> The branch defaults to the compare and trap mode to ensure everything is >> working correctly. This has been our correctness model for statement >> range extraction and was active during the Fedora package builds. The >> only time we disabled it was to do performance runs vs RVRP, and were >> looking at both branch and trunk times for EVRP and VRP. >> >> Of note, we drop all symbolics in ranges to VARYING on everything except >> PLUS and MINUS, which we leave as native calculations if there are >> symbolics present. More on symbolics later. >> >> A VRP like pass called RVRP has been implemented. >> - The vr-values statement simplification code has been factored out >> to be range agnostic, meaning that these routines can operate on either >> value_range or irange. Thus, we are using a common code base to perform >> statement simplification as well. >> - For complete compatibility with EVRP, the RVRP pass builds >> dominators and instantiates the SCEV loop information so we have loop >> range info available. RVRP does not need this info to run, but would >> miss some of the test cases which depend on loop ranges. >> - RVRP is set up to demonstrate it can process the IL in multiple >> directions and bootstraps/passes all tests in all directions. >> * Dominator order >> * Post-dominator order >> * BB1 thru BBn >> * BBn thru BB1 >> * branch-only mode where only branches at the end of each BB >> are examined for folding opportunities >> >> 4 additional passes have been converted to use the ranger model: >> - sprintf - removed the dominator building/walking >> - warn alloca - replaced calls to get global ranges with calls that >> now return context sensitive ranges. >> - warn restrict - just replaced EVRP range calls with ranger calls. >> - backwards threader - enhanced to use contextual range information >> to make additional threading decisions. >> >> >> Symbolic Ranges >> >> One big concern last year expressed was my decision to abolish symbolic >> ranges. >> >> I continue to maintain that there is no need to track the range of x_2 >> as [y_3 + 5, MAX] for x_2 = y_3 + 5. All you need to do is look at the >> definition of x_2, and the same information is exposed right there in >> the IL. If one requires the symbolic information, the same on-demand >> process could lookup that information as well. This in turn, makes the >> code for ranges much simpler, easier to maintain, and less likely to >> introduce bugs. >> >> We have found through our prototype that symbolics in ranges are not >> nearly as prevalent as one would think. During the work to share a >> common code base with VRP, we found that symbolic ranges are irrelevant >> for everything except PLUS_EXPR and MINUS_EXPR. The shared code in our >> branch drops symbolics to varying immediately for everything else, and >> it has no impact on EVRP, or VRP or any tests we can find anywhere. >> Furthermore, we never trapped while comparing ranges generated by VRP >> versus generating them with range-ops which drops symbolics to varying. >> >> We tried modifying VRP such that we don’t even create symbolic >> endpoints, but rather revert to VARYING always. We can find no test >> case that fails because a range is not calculated properly due to >> resolving these endpoints. >> >> There are a few that fail due to the symbolic being used to help track >> relationals.. Ie >> >> x_2 = y_3 + 5 >> If (x_2 > y_3) // can be folded since we know x_2 must be < y_3 >> >> VRP generates a range for x of [ y_3+5, MAX ] and at various points >> uses that to infer a relational or equivalence. Ie, it becomes easy to >> tell that the condition must always be true since the lower bound of the >> range is y_3 + 5. >> >> I argue this is not a range question, but rather a different problem >> which VRP has chosen to solve by piggybacking on the range >> representation. This leads to complications/complexity when trying to >> evaluate ranges because they must constantly be on the lookout for >> symbolics. This information is then carried around for the life of the >> pass, even if it is never used. It also forces any >> relational/equivalency queries to be handled within the context of the >> VRP pass. >> >> This aspect of symbolics would be handled by a relational/equivalence >> processing engine that would be follow on work. Using the same basic >> model as ranges, each tree code is taught to understand the relation >> between its operands, and then we can answer equivalency and relational >> accurately as well. It would be available for any pass to use >> independent of ranges. I will expound upon that a bit in the future >> directions section. > > While I agree that symbolic ranges are a complication and that most > cases it currently handles are not "value-range" things I do not agree > with the idea that we can simply remove handling them and deal > with the fallout in some later point in the future. Eric may also be > able to show you "real" symbolic value-range magic. > > Note that symbolic ranges are already restricted to PLUS_EXPR > and MINUS_EXPR (and NEGATE_EXPR I think). There are > also "symbolic" (non-integer constant) ranges like [&a, &a]. > I've never seen [&a, &MEM[&a + 4]] but we wouldn't reject those > I think. > > You may have noticed that EVRP does not use symbolic ranges. > > As already said I'd like to see VRP go but obstackles are > symbolic ranges and jump-threading (with Jeff promising > to handle the jump-threading part in the past). Right. FWIW, one of the follow-on items to this work is Aldy's improvements to backwards jump threading which utilizes the ranger framework -- the primary purpose of that work is to eliminate the need for jump threading in VRP.
jeff