On 10/30/2017 06:02 AM, Richard Biener wrote:
On Sun, Oct 29, 2017 at 5:15 PM, Martin Sebor <mse...@gmail.com> wrote:
Ping -- please see my reply below.


On 10/20/2017 09:57 AM, Richard Biener wrote:

get_addr_base_and_unit_offset will return NULL if there's any
variable component in 'ref'.  So as written it seems to be dead
code (you want to pass 'arg'?)



Sorry, I'm not sure I understand what you mean.  What do you think
is dead code?  The call to get_addr_base_and_unit_offset() is also
made for an array of unspecified bound (up_bound is null) and for
an array at the end of a struct.  For those the function returns
non-null, and for the others (arrays of runtime bound) it returns
null.  (I passed arg instead of ref but I see no difference in
my tests.)


If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg')
it will return the offset of 'c'.  If you pass a.b[j].c it will
still return NULL.

You could use get_ref_base_and_extent which will return the offset
of a.b[0].c in this case and set max_size != size, but you are only
interested in the offset.  The disadvantage of get_ref_base_and_extent
is that it returns the offset in bits, thus if the offset is too
large for a HWI you'll instead get offset == 0 and max_size == -1.

Thus I'm saying this is dead code for variable array accesses
(even for the array you are warning about).  Yes, for a constant
index and at-struct-end you'll get something, but the warning is
in VRP because of variable indices.

So I suggest to pass 'arg' and use get_ref_base_and_extent
for some extra precision (and possible lossage for very very large
structures).
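
For reference, a minimal sketch of the two queries under discussion,
assuming the tree-dfa.h signatures in use at the time of this thread
(HOST_WIDE_INT offsets; the later poly_int64 variants and the usual
GCC prerequisite includes are omitted):

  #include "tree-dfa.h"

  static void
  compare_queries (tree arg)
  {
    /* Byte-offset query: returns NULL_TREE if any component of ARG
       sits at a variable offset, e.g. a.b[j].c.  */
    HOST_WIDE_INT byte_off;
    tree base1 = get_addr_base_and_unit_offset (arg, &byte_off);

    /* Bit-offset query: for a.b[j].c it returns the offset of
       a.b[0].c and signals the variable part via max_size != size.
       Because the offset is in bits, a byte offset too large for
       a HWI comes back as offset == 0, max_size == -1.  */
    HOST_WIDE_INT bit_off, size, max_size;
    bool reverse;
    tree base2 = get_ref_base_and_extent (arg, &bit_off, &size,
                                          &max_size, &reverse);
    (void) base1;
    (void) base2;
  }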


Computing bit offsets defeats the out-of-bounds flexible array
index detection because the computation overflows (the function
sets the offset to zero).


It shouldn't if you pass arg rather than ref.


I suspect you missed my reply in IRC where I confirmed that this
approach doesn't work for the reason you yourself mentioned above:

The disadvantage of get_ref_base_and_extent is that it returns
the offset in bits, thus if the offset is too large for a HWI
you'll instead get offset == 0 and max_size == -1.

This means that using the function you suggest would either prevent
detecting the out-of-bounds indices that overflow due to the bit
offset, thus largely defeating the purpose of the patch (to detect
excessively large indices), or mean not printing the value of the
out-of-bounds index in the warning in this case, which would in
turn require further changes to the rest of the logic.

Well, it would not handle the case of

a.b[offset-bringing-you-bitoffset-overflow].c[i]

but it would handle (compared to your version of the patch)

a.b[j].c[i]

with i being the unreasonably large index.  That's because we
look at the bit offset of a.b[j].c, thus the _start_ of the array.
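
A hypothetical test case of that shape (constructed here for
illustration, not taken from the patch) might look like:

  struct C { char c[4]; };
  struct B { struct C b[8]; };

  int f2 (struct B *p, unsigned long j, unsigned long i)
  {
    if (i < __PTRDIFF_MAX__)
      i = __PTRDIFF_MAX__;   // i is always out of bounds for c[4]

    // get_addr_base_and_unit_offset returns NULL here because of
    // the variable index j; get_ref_base_and_extent instead returns
    // the constant offset of p->b[0].c with max_size != size, which
    // is enough to diagnose the huge index i.
    return p->b[j].c[i];
  }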

AFAICS, using get_ref_base_and_extent only results in missing a set
of cases that get_addr_base_and_unit_offset doesn't miss.  It's
entirely possible I'm overlooking something, but I can find no other
difference, nor construct a test case where an out-of-bounds index
is detected with the former but not with the latter.  Here's an
example:

  #define MAX (__PTRDIFF_MAX__ - 256)

  struct A { char a[256]; char ax[]; };

  int f (struct A *p, unsigned long i)
  {
    if (i < MAX + 1)
      i = MAX + 1;     // bug: sizeof *p + i > PTRDIFF_MAX

    return p->ax[i];   // missing -Warray-bounds
  }

This is because the offset of the array returned by
get_ref_base_and_extent is reset to zero when the bit offset
computation overflows.  So the function seems strictly worse in
this context than the other.  Am I missing something that makes
this a worthwhile tradeoff?  (If so, can you please show a test
case?)
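
To make the overflow concrete, here is a standalone check
(constructed for illustration, assuming an LP64 target where
__PTRDIFF_MAX__ equals INT64_MAX and HOST_WIDE_INT is a signed
64-bit integer):

  #include <stdint.h>
  #include <stdio.h>

  int main (void)
  {
    // i as forced by the test case above: MAX + 1 == PTRDIFF_MAX - 255
    unsigned __int128 i = (unsigned __int128)INT64_MAX - 255;

    // byte offset of p->ax[i]: offsetof (struct A, ax) == 256
    unsigned __int128 byte_off = 256 + i;

    // get_ref_base_and_extent works in bits, so it would have to
    // represent 8 * byte_off, which no signed 64-bit HWI can hold;
    // it therefore reports offset == 0 and max_size == -1
    unsigned __int128 bit_off = 8 * byte_off;

    printf ("fits in a signed 64-bit HWI: %s\n",
            bit_off <= (unsigned __int128)INT64_MAX ? "yes" : "no");
    return 0;
  }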

I still find the half-address-space case totally pointless to warn
about (even more so to get "precise" here by subtracting the offset
of the array).  There's not a single testcase that looks motivating
to me (like an error with some signed->unsigned conversion and then
GCC magically figuring out a range you'd warn for).

-Warray-bounds is documented to

  ...warn about subscripts to arrays that are always out of bounds.

Excessively large indices that would imply an object in excess of
PTRDIFF_MAX bytes are always out of bounds, so expecting them to
be diagnosed consistently, even in corner cases, is in line with
the documentation (and presumably also the intent) of the warning.
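
The C language rule behind this claim can be shown with a small
hypothetical example: pointer subtraction within a single object
yields ptrdiff_t, so no valid object, and therefore no valid
subscript, can put an element more than PTRDIFF_MAX bytes from the
object's start:

  #include <stddef.h>

  // If p->ax[i] with sizeof *p + i > PTRDIFF_MAX were a valid
  // element, the distance between &p->ax[0] and &p->ax[i] could not
  // be represented as a ptrdiff_t, making the subtraction below
  // undefined.  That is why such subscripts are always out of bounds.
  ptrdiff_t distance (char *first, char *last)
  {
    return last - first;
  }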

Other compilers (Clang and Intel ICC) already detect more of these
kinds of problems than GCC does.  In addition, C++ requires them
to be diagnosed in constexpr contexts.  Improving GCC to match and
even exceed these other compilers is simple and inexpensive, and
makes GCC a better implementation.  I welcome suggestions to
improve the patch (or to submit a follow-up one) and I'm open to
arguments that one solution is superior to another, but I don't
understand your objection to a working solution on the grounds
that it handles corner cases.  All bugs are corner cases that
someone didn't get right.

Martin
