https://github.com/NagyDonat commented:

The commit looks very promising :smile: I added several minor remarks, but 
overall I'm very satisfied with the quality of your changes and I'm relieved 
that I won't have to perform this refactoring.

> I have only 2 failing tests now:
>
>  Clang :: Analysis/NewDelete+MismatchedDeallocator_intersections.cpp
>  Clang :: Analysis/NewDelete-intersections.m
>
> The problem with those is that now CSA reports read of undefined value 
> obtained from malloc. I am not sure how to handle it and why it did work 
> before this. Maybe it was related to wrong usage of BindExpr?

"Why it did work before this?" is always an excellent question 
:upside_down_face: and in this particular case I think I can answer it: 
previously `MallocChecker` only did a `postCall()` callback, so before that 
`free()` (whose definition is not available) was evaluated by 
`conservativeEvalCall()` and that process invalidated the memory region which 
was passed to `free()`, which changed its contents from `UndefinedVal` to an 
unknown symbolic value. Now you provide an `evalCall`, so `free` does not 
invalidate the value stored in its argument and you can see the `UndefinedVal` 
which was placed there by malloc.

To handle this, I think you should add an explicit invalidation step within the 
`evalCall` of freeing functions to signify that the previous knowledge about 
the contents of the released memory region is no longer valid.

More precisely I can imagine three alternatives:
- (1) A "regular" invalidation would introduce fresh symbols to represent the 
data readable through the now-invalid freed pointer. (At first we don't know 
anything about these symbols, but if they participate in conditionals, they can 
become constrained.) This is equivalent to the old behavior of the checker.
- (2) We could fill the region with `UnknownVal`, which is a "we're confused, 
assume the best about it" fallback placeholder. This is very similar to plan 
(1) but if I understand it correctly the `UnknownVal`s won't become constrained.
- (3) We could fill the region with `UndefinedVal`, which is a "this is 
garbage, reading it is an error" aggressive placeholder. In the testcases that 
you highlighted, this wouldn't change anything (because `malloc()` already 
fills the uninitialized memory with `UnknownVal`), but it would ensure that 
after code like
  ```c
    int* p = (int*)malloc(sizeof(int));
    *p = 123;
    free(p);
  ```
  the analyzer knows that `*p` is an undefined possibly-overwritten value (and 
not the concrete value 123).

Among these (3) is the closest representation of the language standard, because 
reading `free`d memory regions can produce arbitrary garbage.

However, if the reporting of use-after-free errors is enabled (within 
MallocChecker.cpp), then the difference between (1)-(3) is irrelevant, because 
use-after-free is checked as a precondition and we never reach the point where 
we actually read the contents of the released memory.

Therefore I think (1) (or perhaps 2??) would be the most user-friendly 
solution, because if the user enables some subcheckers of MallocChecker, but 
disables the use-after-free check for some reason, then they presumably 
wouldn't want to see the "use of undefined value" errors that are essentially 
use-after-free errors.

https://github.com/llvm/llvm-project/pull/106081
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to