On Fri, Nov 11, 2016 at 4:56 PM, Martin Sebor <mse...@gmail.com> wrote:
> Thanks for the review and comments!

First of all sorry for the late response.

>>
>> @@ -158,14 +170,149 @@ compute_object_offset (const_tree expr, const_tree
>> var)
>>    return size_binop (code, base, off);
>>  }
>>
>> +static bool
>> +operand_unsigned_p (tree op)
>> +{
>> +  if (TREE_CODE (op) == SSA_NAME)
>>
>> new functions need a comment.  But maybe you want to use
>> tree_expr_nonnegative_p
>> to also allow signed but known positive ones?
>
>
> Let me add a comment.
>
> operand_unsigned_p returns true if the type of the original offset
> before conversion to sizetype is unsigned.  tree_expr_nonnegative_p
> returns true if the argument's type is unsigned, which is always
> the case here.

Sure - but then you maybe instead want to check for op being in
range [0, max-of-signed-type-of-op] instead?  So similar to
expr_not_equal_to add a expr_in_range helper?

Your function returns true for sizetype vars even if it might be
effectively signed, like for

 sizetype i_1 = -4;
 i_2 = i_1 + 1;

operand_unsigned_p (i) returns true.  I suppose you may have
meant

+static bool
+operand_unsigned_p (tree op)
+{
+  if (TREE_CODE (op) == SSA_NAME)
+    {
+      gimple *def = SSA_NAME_DEF_STMT (op);
+      if (is_gimple_assign (def))
+       {
+         tree_code code = gimple_assign_rhs_code (def);
+         if (code == NOP_EXPR
               && TYPE_PRECISION (TREE_TYPE (op)) > TYPE_PRECISION
(TREE_TYPE (gimple_assign_rhs1 (def))))
              return tree_expr_nonnegative_p (gimple_assign_rhs1 (def)));
+       }
+    }
+
+  return false;
+}

?  because only if you do see a cast and that cast is widening from an
nonnegative number
the final one will be unsigned (if interpreted as signed number).

>>
>> +/* Fill the 2-element OFFRANGE array with the range of values OFF
>> +   is known to be in.  Postcodition: OFFRANGE[0] <= OFFRANGE[1].  */
>> +
>> +static bool
>> +get_offset_range (tree off, HOST_WIDE_INT offrange[2])
>> +{
>> +  STRIP_NOPS (off);
>>
>> why strip nops (even sign changes!) here?
>
>
> That might be a leftover from an earlier/failed attempt to simplify
> things that I forgot to remove.  Let me do that in a followup patch.
> Unless I misunderstand your comment there are no sign changes (AFAIK)
> because the offset is always represented as sizetype.
>
>> Why below convert things
>> via to_uhwi when offrange is of type signed HOST_WIDE_INT[2]?
>
>
> The offset is always represented as sizetype but the code treats
> it as signed because in reality it can be negative.  That said,
> I don't find dealing with ranges very intuitive so I could very
> well be missing something and there may be a better way to code
> this.  I welcome suggestions.

You might want to use offset_ints here (see mem_ref_offset for example)

>>
>> +         gimple *def = SSA_NAME_DEF_STMT (off);
>> +         if (is_gimple_assign (def))
>> +           {
>> +             tree_code code = gimple_assign_rhs_code (def);
>> +             if (code == PLUS_EXPR)
>> +               {
>> +                 /* Handle offset in the form VAR + CST where VAR's type
>> +                    is unsigned so the offset must be the greater of
>> +                    OFFRANGE[0] and CST.  This assumes the PLUS_EXPR
>> +                    is in a canonical form with CST second.  */
>> +                 tree rhs2 = gimple_assign_rhs2 (def);
>>
>> err, what?  What about overflow?  Aren't you just trying to decompose
>> 'off' into a variable and a constant part here and somehow extracting a
>> range for the variable part?  So why not just do that?
>
>
> Sorry, what about overflow?
>
> The purpose of this code is to handle cases of the form
>
>    & PTR [range (MIN, MAX)] + CST

what if MAX + CST overflows?

Note pointer plus int is a POINTER_PLUS_EXPR, not a PLUS_EXPR.
Only for a POINTER_PLUS_EXPR you might argue that overflow is
undefined.

> where CST is unsigned implying that the lower bound of the offset
> is the greater of CST and MIN.  For instance, in the following it
> determines that bos(p, 0) is 4 (and if the 3 were greater than 7
> and overflowed the addition the result would be zero).  I'm not
> sure I understand what you suggest I do differently to make this
> work.
>
>    char d[7];
>
>    #define bos(p, t) __builtin_object_size (p, t)
>
>    long f (unsigned i)
>    {
>      if (2 < i) i = 2;
>
>      char *p = &d[i] + 3;
>
>      return bos (p, 0);
>    }

I'm sure that doesn't work as you match for PLUS_EXPR.

>>
>>
>> +      else if (range_type == VR_ANTI_RANGE)
>> +       {
>> +         offrange[0] = max.to_uhwi () + 1;
>> +         offrange[1] = min.to_uhwi () - 1;
>> +         return true;
>> +       }
>>
>> first of all, how do you know it fits uhwi?  Second, from ~[5, 9] you get
>> [10, 4] !?  That looks bogus (and contrary to the function comment
>> postcondition)
>
>
> I admit I have some trouble working with anti-ranges.  It's also
> difficult to fully exercise them in this pass because it only runs
> after EVRP but not after VRP1 (except with -g), so only limited
> range information is available. (I'm hoping to eventually change
> it but moving the passes broke a test in a way that seemed too
> complex to fix for this project).

Maybe simply ignore VR_ANTI_RANGEs for now then?

> The code above is based on the observation that an anti-range is
> often used to represent the full subrange of a narrower signed type
> like signed char (as ~[128, -129]).  I haven't been able to create
> an anti-range like ~[5, 9]. When/how would a range like that come
> about (so I can test it and implement the above correctly)?

if (a < 4)
  if (a > 8)
    b = a;

then b should have ~[5, 9]

Note a trick VRP uses internally is to treat an anti-range as
the union of two ranges.  ~[X, Y] == [MIN, X-1] u [Y+1, MAX].
That essentially removes anti-range handling from VRPs
propagation - not sure if it would help your case.

>>
>> +      else if (range_type == VR_VARYING)
>> +       {
>> +         gimple *def = SSA_NAME_DEF_STMT (off);
>> +         if (is_gimple_assign (def))
>> +           {
>> +             tree_code code = gimple_assign_rhs_code (def);
>> +             if (code == NOP_EXPR)
>> +               {
>>
>> please trust range-info instead of doing your own little VRP here.
>> VR_VARYING -> return false.
>
>
> I would prefer to rely on the range information and not have to work
> around it like I do here but, unfortunately, it doesn't always appear
> to be available.  For example, in the following test case:
>
>    char d[130];
>
>    #define bos(p, t) __builtin_object_size (p, t)
>
>    void f (void*);
>
>    void g (signed char i)
>    {
>       char *p = &d [i] + 128;
>      f(__builtin___memset_chk (p, '*', 2, bos (p, 0)));   // okay
>
>      f(__builtin___memset_chk (p, '*', 3, bos (p, 0)));   // overflow
>    }
>
> the range type is VR_VARYING but the code makes it possible to diagnose
> the overflow in the second call to memset.
>
> Maybe the poor range info i a consequence of the pass only benefiting
> from EVRP and not VRP?

The range of 'p' is indeed not known (we only represent integer bound ranges).
You seem to want the range of p - &d[0] here, something that is not present
in the IL.

>>
>> stopping review here noting that other parts of the compiler
>> split constant parts from variable parts and it looks like this is
>> what you want to do here?  That is, enhance
>>
>> static vec<unsigned HOST_WIDE_INT> object_sizes[4];
>>
>> and cache a SSA-NAME, constant offset pair in addition?  Or just a range
>> (range of that SSA name + offset), if that's good enough -- wait, that's
>> what you get from get_range_info!
>
>
> I agree that storing the offsets could be an enhancement to consider.
>  The patch mentions it in the FIXME part of the following comment:
>
> /* Bitmaps of variables whose object sizes have been computed without
>    regard to the (non-constant) offset into them.  A bit in each bitmap
>    is valid only if the corresponding bit in the COMPUTED bitmap is
>    non-zero (otherwise it's zero).
>    FIXME: Change this to a vector of offset ranges to make it possible
>    to handle cases like &A[I] + J where both I and J are non-const and
>    bounded by some range.  */
> static bitmap nonconst_offsets[4];
>
> It's just something I haven't had time to work on yet and with the
> close of stage 1 approaching I wanted to put out this version for
> review.  Do you view this enhancement as prerequisite for approving
> the patch or is it something that you'd be fine with adding later?

I find the patch adds quite some ad-hoc ugliness to a pass that is
already complex and nearly impossible to understand.

>>
>> So I'm not sure where the whole complication in the patch comes from...
>>
>> Possibly from the fact that VRP on pointers is limited and thus &a[i] + 5
>> doesn't get you a range for i + 5?
>
>
> get_range_info doesn't accept pointers at all (perhaps that's what
> you meant by "VRP on pointers is limited").  In Gimple, &a[i] + 5
> is represented as just that (i.e., _1 = &a[i_3]; p_6 = _1 + 5;) and
> so to get the right answer for bos(&a[i] + 5) the patch jumps though
> some hoops.  But again, I could very well be missing something that's
> obvious to you so if you think that's the case I'd be happy to simplify
> the code if you point me in the right direction.

Yes, get_range_info is limited.  IMHO of you want to enhance object-size
to cover ranges by implicitely working on a different IL then it probably
should be re-written with that in mind.  The EVRP pass provides a
good template for how to re-use the VRP machinery and decomposing
the lattice a bit further into BASE + range shouldn't be difficult.

I'm leaving it to others if we have to have this improvement for GCC 7
(bos has its own share of know issues besides missing features).

Richard.

>
> Martin

Reply via email to