Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

Richard Biener Fri, 20 Oct 2017 08:57:46 -0700

On October 20, 2017 5:43:40 PM GMT+02:00, Martin Sebor <mse...@gmail.com> wrote:
>On 10/20/2017 02:08 AM, Richard Biener wrote:
>> On Fri, Oct 20, 2017 at 1:00 AM, Martin Sebor <mse...@gmail.com>
>wrote:
>>> On 10/19/2017 02:34 AM, Richard Biener wrote:
>>>>
>>>> On Thu, Oct 19, 2017 at 1:19 AM, Martin Sebor <mse...@gmail.com>
>wrote:
>>>>>
>>>>> On 10/18/2017 04:48 AM, Richard Biener wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 18, 2017 at 5:34 AM, Martin Sebor <mse...@gmail.com>
>wrote:
>>>>>>>
>>>>>>>
>>>>>>> While testing my latest -Wrestrict changes I noticed a number of
>>>>>>> opportunities to improve the -Warray-bounds warning.  Attached
>>>>>>> is a patch that implements a solution for the following subset
>>>>>>> of these:
>>>>>>>
>>>>>>> PR tree-optimization/82596 - missing -Warray-bounds on an out-of
>>>>>>>   bounds index into string literal
>>>>>>> PR tree-optimization/82588 - missing -Warray-bounds on an
>excessively
>>>>>>>   large index
>>>>>>> PR tree-optimization/82583 - missing -Warray-bounds on
>out-of-bounds
>>>>>>>   inner indices
>>>
>>>
>>>> I meant to use size_type_node (size_t), not sizetype.  But
>>>> I just checked that ptrdiff_type_node is initialized in
>>>> build_common_tree_nodes and thus always available.
>>>
>>>
>>> I see.  Using ptrdiff_type_node is preferable for the targets
>>> where ptrdiff_t has a greater precision than size_t (e.g., VMS).
>>> It makes sense now.  I should remember to change all the other
>>> places where I introduced ssizetype to use ptrdiff_type_node.
>>>
>>>>
>>>>> As an aside, at some point I would like to get away from a type
>>>>> based limit in all these warnings and instead use one that can
>>>>> be controlled by an option so that a user can impose a lower limit
>>>>> on the maximum size of an object and have all size-related
>warnings
>>>>> (and perhaps even optimizations) enforce it and benefit from it.
>>>>
>>>>
>>>> You could add a --param that is initialized from ptrdiff_type_node.
>>>
>>>
>>> Yes, that's an option to consider.  Thanks.
>>>
>>>
>>>>
>>>>>> +      tree arg = TREE_OPERAND (ref, 0);
>>>>>> +      tree_code code = TREE_CODE (arg);
>>>>>> +      if (code == COMPONENT_REF)
>>>>>> +       {
>>>>>> +         HOST_WIDE_INT off;
>>>>>> +         if (tree base = get_addr_base_and_unit_offset (ref,
>&off))
>>>>>> +           up_bound_p1 = fold_build2 (MINUS_EXPR, ssizetype,
>>>>>> up_bound_p1,
>>>>>> +                                      TYPE_SIZE_UNIT (TREE_TYPE
>>>>>> (base)));
>>>>>> +         else
>>>>>> +           return;
>>>>>>
>>>>>> so this gives up on a.b[i].c.d[k] (ok, array_at_struct_end_p will
>be
>>>>>> false).
>>>>>> simply not subtracting anyhing instead of returning would be
>>>>>> conservatively
>>>>>> correct, no?  Likewise subtracting the offset of the array for
>all
>>>>>> "previous"
>>>>>> variably indexed components with assuming the lowest value for
>the
>>>>>> index.
>>>>>> But as above I think compensating for the offset of the array
>within the
>>>>>> object
>>>>>> is academic ... ;)
>>>>>
>>>>>
>>>>>
>>>>> I was going to say yes (it gives up) but on second thought I don't
>>>>> think it does.  Only the major index can be unbounded and the code
>>>>> does consider the size of the sub-array when checking the major
>>>>> index.  So, IIUC, I think this works correctly as is (*).  What
>>>>> doesn't work is VLAs but those are a separate problem.  Let me
>>>>> know if I misunderstood your question.
>>>>
>>>>
>>>> get_addr_base_and_unit_offset will return NULL if there's any
>variable
>>>> component in 'ref'.  So as written it seems to be dead code (you
>>>> want to pass 'arg'?)
>>>
>>>
>>> Sorry, I'm not sure I understand what you mean.  What do you think
>>> is dead code?  The call to get_addr_base_and_unit_offset() is also
>>> made for an array of unspecified bound (up_bound is null) and for
>>> an array at the end of a struct.  For those the function returns
>>> non-null, and for the others (arrays of runtime bound) it returns
>>> null.  (I passed arg instead of ref but I see no difference in
>>> my tests.)
>>
>> If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
>it will
>> return the offset of 'c'.  If you pass a.b[j].c it will still return
>NULL.
>> You could use get_ref_base_and_extent which will return the offset
>> of a.b[0].c in this case and sets max_size != size - but you are only
>> interested in offset.  The disadvantage of get_ref_base_and_extent
>> is it returns offset in bits thus if the offset is too large for a
>HWI
>> you'll instead get offset == 0 and max_size == -1.
>>
>> Thus I'm saying this is dead code for variable array accesses
>> (even for the array you are warning about).  Yes, for constant index
>> and at-struct-end you'll get sth, but the warning is in VRP because
>> of variable indexes.
>>
>> So I suggest to pass 'arg' and use get_ref_base_and_extent
>> for some extra precision (and possible lossage for very very large
>> structures).
>
>Computing bit offsets defeats the out-of-bounds flexible array
>index detection because the computation overflows (the function
>sets the offset to zero).


It shouldn't if you pass arg rather than ref. 

  I'll go ahead with the first version
>unless you have a different suggestion.
>
>Thanks
>Martin
>
>>
>> Thus instead of
>>
>> +      tree maxbound = TYPE_MAX_VALUE (ptrdiff_type_node);
>> +
>> +      up_bound_p1 = int_const_binop (TRUNC_DIV_EXPR, maxbound,
>eltsize);
>> +
>> +      tree arg = TREE_OPERAND (ref, 0);
>> +      tree_code code = TREE_CODE (arg);
>> +      if (code == COMPONENT_REF)
>> +       {
>> +         HOST_WIDE_INT off;
>> +         if (tree base = get_addr_base_and_unit_offset (arg, &off))
>> +           {
>> +             tree size = TYPE_SIZE_UNIT (TREE_TYPE (base));
>>
>> (not sure why you're subtracting the size of the base here?!  I
>> thought you subtract off!)
>>
>> +             if (TREE_CODE (size) == INTEGER_CST)
>> +               up_bound_p1 = int_const_binop (MINUS_EXPR,
>up_bound_p1, size);
>> +           }
>>
>> do
>>
>>     if (handled_component_p (arg))
>>       {
>>          HOST_WIDE_INT bitoff, bitsize, bitmax_size;
>>          bool reverse;
>>          get_ref_base_and_extent (arg, &bitoff, &bitsize,
>> &bitmax_size, &reverse);
>>          up_bound_p1 = int_const_binop (MINUS_EXPR, up_bound_p1,
>>                               build_int_cst (TREE_TYPE (up_bound_p1),
>> bitoff / BITS_PER_UNIT));
>>       }
>>
>> ok with that change.
>>
>> Richard.
>>
>>>
>>>>
>>>> I was asking you to remove the 'else return' because w/o
>subtracting
>>>> the upper bound is just more conservative.
>>>
>>>
>>> Sure, that sounds good.
>>>
>>> Attached is an updated patch.
>>>
>>> Thanks
>>> Martin

Re: [PATCH] enhance -Warray-bounds to handle strings and excessive indices

Reply via email to