On 10/29/2017 10:15 AM, Martin Sebor wrote: > Ping -- please see my reply below. > > On 10/20/2017 09:57 AM, Richard Biener wrote: >>>>>> get_addr_base_and_unit_offset will return NULL if there's any >>> variable >>>>>> component in 'ref'. So as written it seems to be dead code (you >>>>>> want to pass 'arg'?) >>>>> >>>>> >>>>> Sorry, I'm not sure I understand what you mean. What do you think >>>>> is dead code? The call to get_addr_base_and_unit_offset() is also >>>>> made for an array of unspecified bound (up_bound is null) and for >>>>> an array at the end of a struct. For those the function returns >>>>> non-null, and for the others (arrays of runtime bound) it returns >>>>> null. (I passed arg instead of ref but I see no difference in >>>>> my tests.) >>>> >>>> If you pass a.b.c[i] it will return NULL, if you pass a.b.c ('arg') >>> it will >>>> return the offset of 'c'. If you pass a.b[j].c it will still return >>> NULL. >>>> You could use get_ref_base_and_extent which will return the offset >>>> of a.b[0].c in this case and sets max_size != size - but you are only >>>> interested in offset. The disadvantage of get_ref_base_and_extent >>>> is it returns offset in bits thus if the offset is too large for a >>> HWI >>>> you'll instead get offset == 0 and max_size == -1. >>>> >>>> Thus I'm saying this is dead code for variable array accesses >>>> (even for the array you are warning about). Yes, for constant index >>>> and at-struct-end you'll get sth, but the warning is in VRP because >>>> of variable indexes. >>>> >>>> So I suggest to pass 'arg' and use get_ref_base_and_extent >>>> for some extra precision (and possible lossage for very very large >>>> structures). >>> >>> Computing bit offsets defeats the out-of-bounds flexible array >>> index detection because the computation overflows (the function >>> sets the offset to zero). >> >> It shouldn't if you pass arg rather than ref. > > I suspect you missed my reply in IRC where I confirmed that this > approach doesn't work for the reason you yourself mentioned above: > >>>> The disadvantage of get_ref_base_and_extent >>>> is it returns offset in bits thus if the offset is too large >>>> for a HWI you'll instead get offset == 0 and max_size == -1. > > This means that using the function you suggest would either prevent > detecting the out-of-bounds indices that overflow due to the bit > offset, thus largely defeating the purpose of the patch (to detect > excessively large indices), or not printing the value of the out-of > bounds index in the warning in this case, which would in turn mean > further changes to the rest of the logic. I think Richi's point is that it's more important (in his opinion) to handle the varying objects within an access like a.b.c[i] than to handle cases that would overflow when converted to bits. Thus he'd prefer to use get_ref_base_and_extent over get_addr_base_and_unit_offset.
I wonder if a hybrid approach here would work. ie, use get_ref_base_and_extent per Richi's request. When that returns an max_size of -1, then call get_addr_base_and_unit_offset and if it returns that the offset is huge, then warn without any deeper analysis. Yea, this could trip a false positive if someone makes an array that is half the address space (give or take), but that's probably not at all common. Whereas the additional cases handled by get_ref_base_and_extent are perhaps more useful. Thoughts? jeff