Re: [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)

Richard Biener Fri, 02 Dec 2016 01:29:34 -0800

On Thu, Dec 1, 2016 at 6:58 PM, Martin Sebor <mse...@gmail.com> wrote:
>> Sure - but then you maybe instead want to check for op being in
>> range [0, max-of-signed-type-of-op] instead?  So similar to
>> expr_not_equal_to add a expr_in_range helper?
>>
>> Your function returns true for sizetype vars even if it might be
>> effectively signed, like for
>>
>>  sizetype i_1 = -4;
>>  i_2 = i_1 + 1;
>>
>> operand_unsigned_p (i) returns true.  I suppose you may have


(*)

>> meant
>>
>> +static bool
>> +operand_unsigned_p (tree op)
>> +{
>> +  if (TREE_CODE (op) == SSA_NAME)
>> +    {
>> +      gimple *def = SSA_NAME_DEF_STMT (op);
>> +      if (is_gimple_assign (def))
>> +       {
>> +         tree_code code = gimple_assign_rhs_code (def);
>> +         if (code == NOP_EXPR
>>                && TYPE_PRECISION (TREE_TYPE (op)) > TYPE_PRECISION
>> (TREE_TYPE (gimple_assign_rhs1 (def))))
>>               return tree_expr_nonnegative_p (gimple_assign_rhs1 (def)));
>> +       }
>> +    }
>> +
>> +  return false;
>> +}
>>
>> ?  because only if you do see a cast and that cast is widening from an
>> nonnegative number
>> the final one will be unsigned (if interpreted as signed number).
>
>
> I don't think this is what I want.  Here's a test case that works
> with my function but not with the suggested modification:
>
>    char d[4];
>    long f (unsigned long i)
>    {
>      return __builtin_object_size (d + i + 1, 0);
>    }
>
> Here, the size I'm looking for is (at most) 3 no matter what value
> i has.  Am I missing a case where my function will do the wrong
> thing?

See above (*)

I know what you are trying to do (retro-actively make value-ranges have
a split variable / constant part).  But I don't think that doing it the way you
do it is a reasonable maintainable way.  Others may disagree.

>> You might want to use offset_ints here (see mem_ref_offset for example)
>
>
> Okay, I'll see if I can switch to offset_int.
>
>>
>>>>
>>>> +         gimple *def = SSA_NAME_DEF_STMT (off);
>>>> +         if (is_gimple_assign (def))
>>>> +           {
>>>> +             tree_code code = gimple_assign_rhs_code (def);
>>>> +             if (code == PLUS_EXPR)
>>>> +               {
>>>> +                 /* Handle offset in the form VAR + CST where VAR's
>>>> type
>>>> +                    is unsigned so the offset must be the greater of
>>>> +                    OFFRANGE[0] and CST.  This assumes the PLUS_EXPR
>>>> +                    is in a canonical form with CST second.  */
>>>> +                 tree rhs2 = gimple_assign_rhs2 (def);
>>>>
>>>> err, what?  What about overflow?  Aren't you just trying to decompose
>>>> 'off' into a variable and a constant part here and somehow extracting a
>>>> range for the variable part?  So why not just do that?
>>>
>>>
>>>
>>> Sorry, what about overflow?
>>>
>>> The purpose of this code is to handle cases of the form
>>>
>>>    & PTR [range (MIN, MAX)] + CST
>>
>>
>> what if MAX + CST overflows?
>
>
> The code doesn't look at MAX, only MIN is considered.  It extracts
> both but only actually uses MAX to see if it's dealing with a range
> or a constant.  Does that resolve your concern?

But if MAX overflows then it might be smaller than MIN and thus you
cannot conclude that [min, max] + CST is [min+CST, UNKNWON]
but rather it's [UNKNOWN, UNKNOWN] (if you disregard the actual
value of MAX).

>>>    char d[7];
>>>
>>>    #define bos(p, t) __builtin_object_size (p, t)
>>>
>>>    long f (unsigned i)
>>>    {
>>>      if (2 < i) i = 2;
>>>
>>>      char *p = &d[i] + 3;
>>>
>>>      return bos (p, 0);
>>>    }
>>
>>
>> I'm sure that doesn't work as you match for PLUS_EXPR.
>
>
> Sorry, I'm not sure what you mean.

I mean that p = &d[i] + 3; is not represented as a PLUS_EXPR
but as a POINTER_PLUS_EXPR.  All pointer arithmetic is using
POINTER_PLUS_EXPR.

>  The above evaluates to 4 with
> the patch because i cannot be less than zero (otherwise &d[i] would
> be invalid/undefined) so the type-0 size we want (the maximum) is
> &d[7] - (&d[0] + 3) or 4.
>
>> Maybe simply ignore VR_ANTI_RANGEs for now then?
>
>
> Yes, that makes sense.
>
>>> The code above is based on the observation that an anti-range is
>>> often used to represent the full subrange of a narrower signed type
>>> like signed char (as ~[128, -129]).  I haven't been able to create
>>> an anti-range like ~[5, 9]. When/how would a range like that come
>>> about (so I can test it and implement the above correctly)?
>>
>>
>> if (a < 4)
>>   if (a > 8)
>>     b = a;
>>
>> then b should have ~[5, 9]
>
>
> Right :)  I have figured out by know by know how to create an anti
> range in general.  What I meant is that I haven't had luck creating
> them in a way that the tree-object-size pass could see (I'm guessing
> because EVRP doesn't understand relational expressions).  So given
> this modified example from above:
>
> char d[9];
>
> #define bos(p, t) __builtin_object_size (p, t)
>
> long f (unsigned a)
> {
>    unsigned b = 0;
>
>    if (a < 4)
>      if (a > 8)
>        b = a;
>
>    char *p = &d[b];
>    return bos (p, 0);
> }
>
> The value ranges after Early VRP are:
>
> _1: VARYING
> b_2: VARYING
> a_4(D): VARYING
> p_6: ~[0B, 0B]
> _8: VARYING

Of course - that's because a) I did a mistake, it shoud be if (a < 4 || a> 8),
b) EVRP does not look at DEFs (all the magic in register_edge_assert_for
is not factored out to be usable by EVRP), c) there's no SSA name for
b having the anti-range but we only have the PHI merging that with 0.

A better testcase would be

long f (unsigned a)
{
  unsigned b = 0;

  if (a < 4)
    goto x;
  if (a > 8)
    goto x;
  goto y;

x:
  b = a;

y:;

  char *p = &d[b];
  return bos (p, 0);
}

but even that fails to create the anti-range.  I suppose we have work
to do for EVRP ;)

> But with the removal of the anti-range code this will be a moot
> point.
>
>>> Maybe the poor range info i a consequence of the pass only benefiting
>>> from EVRP and not VRP?
>>
>>
>> The range of 'p' is indeed not known (we only represent integer bound
>> ranges).
>> You seem to want the range of p - &d[0] here, something that is not
>> present
>> in the IL.
>
>
> Yes, that's effectively what this patch does.  Approximate pointer
> ranges.
>
>>> It's just something I haven't had time to work on yet and with the
>>> close of stage 1 approaching I wanted to put out this version for
>>> review.  Do you view this enhancement as prerequisite for approving
>>> the patch or is it something that you'd be fine with adding later?
>>
>>
>> I find the patch adds quite some ad-hoc ugliness to a pass that is
>> already complex and nearly impossible to understand.
>
>
> I'm sorry it looks ugly to you.  I'm afraid I'm not yet familiar
> enough with the code to see the distinction.  I'm just happy when
> I get things to work the way I want them to :)  On this project
> I feel like I've been working around limitations both in the pass
> itself and those caused by when it runs.  I tried to make it run
> later, after VRP, but run into a failure in builtin-object-size-14.c.
> From the history of the test (mostly your comments on bug 59125)
> it seems that there's some delicate interplay between the pass and
> some others that might make changing when it runs tricky business.
> I haven't tried to figure out how to deal with the problems.

Well, me neither -- I still think b_o_s is fundamentally flawed when
facing an optimizing compiler.  Esp. in its modes where it tries
to handle sub-objects.

>> I'm leaving it to others if we have to have this improvement for GCC 7
>> (bos has its own share of know issues besides missing features).
>
>
> Okay, thanks for your feedback.  I agree that the pass is hard to
> read (and harder still to modify).  I also agree that it could be
> improved even beyond this modest enhancement and made more useful.
> Given its complexity it could also do with more documentation.
>
> Unfortunately I'm out of time for GCC 7 to make substantial
> changes to it.  But it seems to me that while this patch may not
> be elegant it does work and makes _FORTIFY_SOURCE a little more
> effective.
>
> Martin

Re: [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)

Reply via email to