[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

Ding Fei via Phabricator via cfe-commits Tue, 29 Aug 2023 09:11:11 -0700

danix800 added a comment.

In D158707#4621377 <https://reviews.llvm.org/D158707#4621377>, @steakhal wrote:


> In D158707#4621270 <https://reviews.llvm.org/D158707#4621270>, @donat.nagy 
> wrote:
>
>> In D158707#4621135 <https://reviews.llvm.org/D158707#4621135>, @steakhal 
>> wrote:
>>
>>> Oh, so we would canonicalize towards a signed extent type. I see. I think 
>>> I'm okay with that.
>>> However, I think this needs a little bit of polishing and testing when the 
>>> region does not point to the beginning of an array or object, but rather 
>>> inside an object, or like this:
>>>
>>>   int nums[] = {1,2,3};
>>>   char *p = (char*)&nums[1];
>>>   
>>>   clang_analyzer_dumpExtent(p);
>>>   clang_analyzer_dumpElementCount(p);
>>>   ++p;
>>>   clang_analyzer_dumpExtent(p);
>>>   clang_analyzer_dumpElementCount(p);
>>>   ++p;
>>>   int *q = (int*)p;
>>>   clang_analyzer_dumpExtent(q);
>>>   clang_analyzer_dumpElementCount(q);
>>
>> Unfortunately the analyzer engine does not distinguish pointer arithmetic 
>> and element access, and we cannot handle these situations while that 
>> limitation exists. For example, if we have an array `int nums[3];`, then the 
>> element `nums[1]` and the incremented pointer `int *ptr = nums + 1;` are 
>> represented by the same `ElementRegion` object; so we cannot simultaneously 
>> say that (1) `nums[1]` is an int-sized memory region, and (2) it's valid to 
>> access the elements of the array as `ptr[-1]`, `ptr[0]` and `ptr[1]`.
>>
>> The least bad heuristic is probably `RegionRawOffsetV2::computeOffset()` 
>> from ArrayBoundCheckerV2, which strips away all the `ElementRegion` layers, 
>> acting as if they are all coming from pointer arithmetic. This behavior is 
>> incorrect if we want to query the extent/elementcount of a "real" 
>> ElementRegion (e.g. a slice of a multidimensional array or `(char*)&nums[1]` 
>> in your example), but mostly leads to false negatives. On the other hand, if 
>> we process a pointer increment as if it were an element access, we can 
>> easily run into false positive reports -- this is why I had to abandon 
>> D150446 <https://reviews.llvm.org/D150446>.
>>
>> I'd suggest that we should ignore pointer arithmetic in this commit (or 
>> create a testcase that documents the current insufficient behavior) and 
>> remember this as yet another reason to do through rewrite of the memory 
>> model. This is related to the plan 
>> https://github.com/llvm/llvm-project/issues/42709 suggested by @NoQ, but 
>> perhaps it would be enough to do a less drastic change in that direction.
>
> I'm aware of this, and I didn't mean to ask to fix this here.
> I just want to see how these ExprInspection APIs will be affected in these 
> edge-cases, where previously it returned "unknown", and now will return 
> something else.
> It's useful to demonstrate this if we ever come back to this commit.
> Pinning down some tests and having some FIXMEs there should bring enough 
> visibility.

Thank you all for the above valuable information. It's helpful for me to get 
better understanding of CSA!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

Reply via email to