Re: [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING

Jeff Law Mon, 22 Jul 2019 17:00:43 -0700

On 7/16/19 12:37 PM, Andrew MacLeod wrote:
> On 7/9/19 5:56 AM, Richard Biener wrote:
>> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez <al...@redhat.com> wrote:
>>>
>>>
>>> On 7/4/19 6:33 AM, Richard Biener wrote:
>>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez <al...@redhat.com> wrote:
>>>>> On 7/3/19 7:08 AM, Richard Biener wrote:
>>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez <al...@redhat.com>
>>>>>> wrote:
>>>>> How about we keep VARYING and UNDEFINED typeless until right before we
>>>>> call into the ranger.  At which point, we have can populate min/max
>>>>> because we have the tree_code and the type handy.  So right before we
>>>>> call into the ranger do:
>>>>>
>>>>>           if (varying_p ())
>>>>>             foo->set_varying(TYPE);
>>>>>
>>>>> This would avoid the type cache, and keep the ranger happy.
>>>> you cannot do set_varying on the static const range but instead
>>>> you'd do
>>>>
>>>>     value_range tem (*foo);
>>>>     if (varying_p ())
>>>>      tem->set_full_range (TYPE);
>>>>
>>>> which I think we already do in some places.  Thus my question _where_
>>>> you actually need this.
>>> Basically, everywhere.  By having a type for varying/undefined, we don't
>>> have to special case anything.  Sure, we could for example, special case
>>> the invert operation for undefined / varying.  And we could special case
>>> everything dealing with ranges to handle varying and undefined, but why?
>>>    We could also pass a type argument everywhere, but that's just ugly.
>>> However, I do understand your objection to the type cache.
>>>
>>> How about the attached approach?  Set the type for varying/undefined
>>> when we know it, while avoiding touching the CONST varying.  Then right
>>> before calling the ranger, pass down a new varying node with min/max for
>>> any varyings that were still typeless until that point.
>>>
>>> I have taken care of never adding a set_varying() that was not already
>>> there.  Would this keep the const happy?
>>>
>>> Technically we don't need to set varying/undef types for every instance
>>> in VRP, but we need it at least for the code that will be shared with
>>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc).
>>>    I just figured if we have the information, might as well set it for
>>> consistency.
>>>
>>> If you like this approach, I can rebase the other patches that depend on
>>> this one.
>> OK, so I went ant checked what you do for class irange which has
>> a type but no kind member (but constructors with a kind).  It also
>> uses wide_int members for storage.  For a pure integer constant
>> range representation this represents somewhat odd choices;  I'd
>> have elided the m_type member completely here, it seems fully
>> redundant.  Only range operations need to be carried out in a
>> specific type (what I was suggesting above).  Even the precision
>> encoded in the wide_int members is redundant then (I'd have
>> expected widest_int here and trailing-wide-ints for optimizing
>> storage).
> 
> What irange contains internally is a bit arbitrary.  It's really an API
> for managing a set of subranges.  We could have used trees internally
> just as easily, then we wouldnt need a type field. Since we were doing
> lots of operations, rather than going back and forth from trees all the
> time, we just used the underlying wide_int directly.   we could have
> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is
> there, has all the operations, and it works fine. so thats what it
> currently is on the branch.
> 
> We are treating a range object as a unique self contained object.
> Therefore, the range has a type so we know how to print it, and can
> confirm before any operation that the ranges being operated on are
> appropriately matched.  We found and opened bugzillas over the past
> couple years for places where our code caught bugs because a range was
> created and then operated on in a way that was not compatible with
> another range.  I think there is a still an open one against ada(?)
> where the switch and case  are different precision.
> 
> From my point of view, a range object is similar to a tree node. A tree
> node has the bits to indicate what the value is, but also associates a
> type with those bits within the same object.  This is less error prone
> than passing around the bits and the type separately.  As ranges are
> starting to be used in many places outside of VRP, we should do the same
> thing with ranges.  WIth value_range it would actually be free since
> there is already a tree for the bounds already which contains the type.
> 
> 
> 
> 
> 
>> to fold_range/op_range?  The API also seems to be oddly
>> constrained to binary ops.  Anyway, the way you build
>> the operator table requires an awful lot of global C++ ctor
>> invocations, sth we generally try to avoid.  But I'm getting
>> into too many details here.
> 
> Its "oddly constrained" because what you are looking at  is just the
> standardized unary/binary ops code.. ie the equivalent of
> extract_range_from_binary_expr() and extract_range_from_unary_expr().  
> The other ops we care about have specialized requirements, like PHIs 
> and the arbitrary numbers of parameters in a call, or anything less
> common than one or two operands.    You are just not seeing those parts.
> 
>>
>> So - to answer your question above, I'd like you to pass down
>> a type to operations.  Because that's what is fundamentally
>> required - a range doesn't have a "type" and the current
>> value_range_base doesn't fall into the trap of needing one.
>>
>> Richard.
>>
> 
> Why is having a type associated with the data a "trap"?   Perhaps the
> old VRP lattice idea didn't need a type with the UNDEFINED and VARYING
> lattice values, but we're moving past lattice values and into a realm
> where we have ranges as useful things outside of VRP, and trying to
> shoehorn lattice values does not seem appropriate anymore.
> 
> I looked at implementing range-ops without a type in the range, and we
> end up passing 2 parameters everywhere each time we do anything with a
> range.  This doubled the number of parameters in most routines, and when
> we had chains of calls, we were simply passing the type along with the
> range.  It seems archaic to be constantly passing information around
> instead of associating it with the range itself.  Its just another place
> for an error to creep in.. Aldy found a place where we were creating
> varying nodes for floats IIRC.. the type checking in the range
> operations caught it precisely because the range returned wasn't the
> type it was assumed to be.
> 
> That said. I believe we can do away with the need for a type with an
> 'UNDEFINED' range.  That is not too onerous, and there doesn't really
> seem to be too many ripple effect from have a typeless undefined range.
> I think we can contain that, and prevents us from having to add a hack
> to value_range for that situation.
> 
> VARYING  is another situation completely.  We adopted the term 'varying'
> for ease of compatibility with VRP, but varying is simply a range going
> from MIN to MAX for a type.  Removing the type from that would then
> require we always pass a type in with every range which gets back to 
> doubling the number of of parameters everywhere for no good reason.
> 
> If we standardize value_range so that MIN and MAX are set for varying,
> then everything works very smoothly, we can make value_range and irange
> interchangeable and facilitate getting the range ops code into trunk.
> 
> It seems like the only reason we cant do this right now is the CONST
> varying nodes that are returned from get_value_range().
> Looking at that routine, it seems there are only 2 cases where that can
> be returned
>  1) we ask for an ssa-name beyond the end of the local ssa-name vector
>  2) once values_propagated is set  to true and an ssa-name has no entry.
> 
> Both seem pretty straightforward to fix...
> 1)  if we ask for an ssa-Name beyond the end, simply reallocate the
> vector to be big enough.   I doubt this will trigger a lot, and if we
> initially allocate it with room for an extra 10% names it'll probably
> never trigger.  Or pick whatever number seems appropriate.
> 2) if values_propagated is true, simply allocate a node for the
> ssa-name,  set it to varying and return it. THIs accomplishes the same
> thing.
> 
> the number of additional nodes we allocate will be pretty minimal, and
> it will never exceed the number of ssa-names. Then we never have to
> worry about having CONST set for a varying node either. I see places
> where there is special processing to avoid calling set_varying  because
> we dont know if the vbalue_range in hand is in constant memory and would
> cause the compiler to trap. This seems like a really bad situation, and
> it would eliminate that issue.  We can also then eliminate the places
> where VARYING is expanded to min/max for a given type.. instead we can
> just pick up min/max directly.    It seems much cleaner overall.
> 
> Something like the simple attached patch would resolve that issue, and
> remove any lurking concerns/bugs with the CONST code.
> 
> Then we can associate a type with varying, canonicalize it consistently,
> and set MIN/MAX appropriately.   This will give us full
> interchangeability between irange and value_range, establish a
> solid/consistent API,  and we can move on to the next thing :-)
> 
> Does this not seem reasonable?
One might even claim that this patch in and of itself is a step forward
independent of all the other work going on.


THe first time I saw that code when I copied it into vr-values I thought
it was ripe for fixing, but never got to it.  As it stands, it's a hack,
no more, no less and it's a hack that we can easily remove and do
something better.


Jeff

Re: [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING

Reply via email to