On 7/16/19 12:37 PM, Andrew MacLeod wrote: > On 7/9/19 5:56 AM, Richard Biener wrote: >> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <al...@redhat.com> wrote: >>> >>> >>> On 7/4/19 6:33 AM, Richard Biener wrote: >>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <al...@redhat.com> wrote: >>>>> On 7/3/19 7:08 AM, Richard Biener wrote: >>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <al...@redhat.com> >>>>>> wrote: >>>>> How about we keep VARYING and UNDEFINED typeless until right before we >>>>> call into the ranger. At which point, we have can populate min/max >>>>> because we have the tree_code and the type handy. So right before we >>>>> call into the ranger do: >>>>> >>>>> if (varying_p ()) >>>>> foo->set_varying(TYPE); >>>>> >>>>> This would avoid the type cache, and keep the ranger happy. >>>> you cannot do set_varying on the static const range but instead >>>> you'd do >>>> >>>> value_range tem (*foo); >>>> if (varying_p ()) >>>> tem->set_full_range (TYPE); >>>> >>>> which I think we already do in some places. Thus my question _where_ >>>> you actually need this. >>> Basically, everywhere. By having a type for varying/undefined, we don't >>> have to special case anything. Sure, we could for example, special case >>> the invert operation for undefined / varying. And we could special case >>> everything dealing with ranges to handle varying and undefined, but why? >>> We could also pass a type argument everywhere, but that's just ugly. >>> However, I do understand your objection to the type cache. >>> >>> How about the attached approach? Set the type for varying/undefined >>> when we know it, while avoiding touching the CONST varying. Then right >>> before calling the ranger, pass down a new varying node with min/max for >>> any varyings that were still typeless until that point. >>> >>> I have taken care of never adding a set_varying() that was not already >>> there. Would this keep the const happy? >>> >>> Technically we don't need to set varying/undef types for every instance >>> in VRP, but we need it at least for the code that will be shared with >>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc). >>> I just figured if we have the information, might as well set it for >>> consistency. >>> >>> If you like this approach, I can rebase the other patches that depend on >>> this one. >> OK, so I went ant checked what you do for class irange which has >> a type but no kind member (but constructors with a kind). It also >> uses wide_int members for storage. For a pure integer constant >> range representation this represents somewhat odd choices; I'd >> have elided the m_type member completely here, it seems fully >> redundant. Only range operations need to be carried out in a >> specific type (what I was suggesting above). Even the precision >> encoded in the wide_int members is redundant then (I'd have >> expected widest_int here and trailing-wide-ints for optimizing >> storage). > > What irange contains internally is a bit arbitrary. It's really an API > for managing a set of subranges. We could have used trees internally > just as easily, then we wouldnt need a type field. Since we were doing > lots of operations, rather than going back and forth from trees all the > time, we just used the underlying wide_int directly. we could have > fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is > there, has all the operations, and it works fine. so thats what it > currently is on the branch. > > We are treating a range object as a unique self contained object. > Therefore, the range has a type so we know how to print it, and can > confirm before any operation that the ranges being operated on are > appropriately matched. We found and opened bugzillas over the past > couple years for places where our code caught bugs because a range was > created and then operated on in a way that was not compatible with > another range. I think there is a still an open one against ada(?) > where the switch and case are different precision. > > From my point of view, a range object is similar to a tree node. A tree > node has the bits to indicate what the value is, but also associates a > type with those bits within the same object. This is less error prone > than passing around the bits and the type separately. As ranges are > starting to be used in many places outside of VRP, we should do the same > thing with ranges. WIth value_range it would actually be free since > there is already a tree for the bounds already which contains the type. > > > > > >> to fold_range/op_range? The API also seems to be oddly >> constrained to binary ops. Anyway, the way you build >> the operator table requires an awful lot of global C++ ctor >> invocations, sth we generally try to avoid. But I'm getting >> into too many details here. > > Its "oddly constrained" because what you are looking at is just the > standardized unary/binary ops code.. ie the equivalent of > extract_range_from_binary_expr() and extract_range_from_unary_expr(). > The other ops we care about have specialized requirements, like PHIs > and the arbitrary numbers of parameters in a call, or anything less > common than one or two operands. You are just not seeing those parts. > >> >> So - to answer your question above, I'd like you to pass down >> a type to operations. Because that's what is fundamentally >> required - a range doesn't have a "type" and the current >> value_range_base doesn't fall into the trap of needing one. >> >> Richard. >> > > Why is having a type associated with the data a "trap"? Perhaps the > old VRP lattice idea didn't need a type with the UNDEFINED and VARYING > lattice values, but we're moving past lattice values and into a realm > where we have ranges as useful things outside of VRP, and trying to > shoehorn lattice values does not seem appropriate anymore. > > I looked at implementing range-ops without a type in the range, and we > end up passing 2 parameters everywhere each time we do anything with a > range. This doubled the number of parameters in most routines, and when > we had chains of calls, we were simply passing the type along with the > range. It seems archaic to be constantly passing information around > instead of associating it with the range itself. Its just another place > for an error to creep in.. Aldy found a place where we were creating > varying nodes for floats IIRC.. the type checking in the range > operations caught it precisely because the range returned wasn't the > type it was assumed to be. > > That said. I believe we can do away with the need for a type with an > 'UNDEFINED' range. That is not too onerous, and there doesn't really > seem to be too many ripple effect from have a typeless undefined range. > I think we can contain that, and prevents us from having to add a hack > to value_range for that situation. > > VARYING is another situation completely. We adopted the term 'varying' > for ease of compatibility with VRP, but varying is simply a range going > from MIN to MAX for a type. Removing the type from that would then > require we always pass a type in with every range which gets back to > doubling the number of of parameters everywhere for no good reason. > > If we standardize value_range so that MIN and MAX are set for varying, > then everything works very smoothly, we can make value_range and irange > interchangeable and facilitate getting the range ops code into trunk. > > It seems like the only reason we cant do this right now is the CONST > varying nodes that are returned from get_value_range(). > Looking at that routine, it seems there are only 2 cases where that can > be returned > 1) we ask for an ssa-name beyond the end of the local ssa-name vector > 2) once values_propagated is set to true and an ssa-name has no entry. > > Both seem pretty straightforward to fix... > 1) if we ask for an ssa-Name beyond the end, simply reallocate the > vector to be big enough. I doubt this will trigger a lot, and if we > initially allocate it with room for an extra 10% names it'll probably > never trigger. Or pick whatever number seems appropriate. > 2) if values_propagated is true, simply allocate a node for the > ssa-name, set it to varying and return it. THIs accomplishes the same > thing. > > the number of additional nodes we allocate will be pretty minimal, and > it will never exceed the number of ssa-names. Then we never have to > worry about having CONST set for a varying node either. I see places > where there is special processing to avoid calling set_varying because > we dont know if the vbalue_range in hand is in constant memory and would > cause the compiler to trap. This seems like a really bad situation, and > it would eliminate that issue. We can also then eliminate the places > where VARYING is expanded to min/max for a given type.. instead we can > just pick up min/max directly. It seems much cleaner overall. > > Something like the simple attached patch would resolve that issue, and > remove any lurking concerns/bugs with the CONST code. > > Then we can associate a type with varying, canonicalize it consistently, > and set MIN/MAX appropriately. This will give us full > interchangeability between irange and value_range, establish a > solid/consistent API, and we can move on to the next thing :-) > > Does this not seem reasonable? One might even claim that this patch in and of itself is a step forward independent of all the other work going on.
THe first time I saw that code when I copied it into vr-values I thought it was ripe for fixing, but never got to it. As it stands, it's a hack, no more, no less and it's a hack that we can easily remove and do something better. Jeff