[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

Donát Nagy via Phabricator via cfe-commits Mon, 28 Aug 2023 07:21:28 -0700

donat.nagy added a comment.

In D158707#4621135 <https://reviews.llvm.org/D158707#4621135>, @steakhal wrote:


> Oh, so we would canonicalize towards a signed extent type. I see. I think I'm 
> okay with that.
> However, I think this needs a little bit of polishing and testing when the 
> region does not point to the beginning of an array or object, but rather 
> inside an object, or like this:
>
>   int nums[] = {1,2,3};
>   char *p = (char*)&nums[1];
>   
>   clang_analyzer_dumpExtent(p);
>   clang_analyzer_dumpElementCount(p);
>   ++p;
>   clang_analyzer_dumpExtent(p);
>   clang_analyzer_dumpElementCount(p);
>   ++p;
>   int *q = (int*)p;
>   clang_analyzer_dumpExtent(q);
>   clang_analyzer_dumpElementCount(q);

Unfortunately the analyzer engine does not distinguish pointer arithmetic and 
element access, and we cannot handle these situations while that limitation 
exists. For example, if we have an array `int nums[3];`, then the element 
`nums[1]` and the incremented pointer `int *ptr = nums + 1;` are represented by 
the same `ElementRegion` object; so we cannot simultaneously say that (1) 
`nums[1]` is an int-sized memory region, and (2) it's valid to access the 
elements of the array as `ptr[-1]`, `ptr[0]` and `ptr[1]`.

The least bad heuristic is probably `RegionRawOffsetV2::computeOffset()` from 
ArrayBoundCheckerV2, which strips away all the `ElementRegion` layers, acting 
as if they are all coming from pointer arithmetic. This behavior is incorrect 
if we want to query the extent/elementcount of a "real" ElementRegion (e.g. a 
slice of a multidimensional array or `(char*)&nums[1]` in your example), but 
mostly leads to false negatives. On the other hand, if we process a pointer 
increment as if it were an element access, we can easily run into false 
positive reports -- this is why I had to abandon D150446 
<https://reviews.llvm.org/D150446>.

I'd suggest that we should ignore pointer arithmetic in this commit (or create 
a testcase that documents the current insufficient behavior) and increment the 
priority of a through rewrite of the memory model. This is related to the plan 
https://github.com/llvm/llvm-project/issues/42709 suggested by @NoQ, but 
perhaps it would be enough to do a less drastic change in that direction.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

Reply via email to