Charusso added a comment.

Thanks for the notes!

In D68725#1722136 <https://reviews.llvm.org/D68725#1722136>, @NoQ wrote:

> Generally, there's no need to add more information to `MemRegion` and 
> `SymExpr` objects [...]


Well, that was my first idea, then I saw we allow helper-methods inside 
regions. `getExtent()` has been deprecated on my side.

> Even if `AllocaRegion` didn't store the expression, any path-sensitive 
> checker for which such region is "interesting" would have to implement a bug 
> visitor to display the allocation site. Such visitor automatically knows the 
> location of the `alloca()` expression and can emit the necessary fixits.

I would create two type of `MemoryBlockRegions`, one for expressions: 
`malloc()`, one for declarations: `std::string str`. The rational behind the 
latter to make `CStringChecker` as simply `StringChecker`, with modelling the 
concept: `str.size()` is the dynamic size behind `VarRegion(str)`. My idea here 
to make it similar to `getDynamicTypeInfo` so I can obtain the allocated memory 
block's capacity, element count, used size easily in every checker. This 
information could simplify `MallocChecker` a lot, because the allocated size of 
memory could serve the allocation state. The "used size" is heuristically 
modeled in `CStringChecker`, so it could be simplified, and my new checker 
would be interested in the element-count only. Also I am thinking of the 
`InnerBufferChecker` could connect to a `MemoryBlockRegion` as all the views 
abstracts a block of memory.

This new model is basically a new middle-layer in the hierarchy, for example: 
`std::string str` -> `VarRegion(str)` has a memory block behind the scenes, 
which belongs to the generic heap. This means, I cannot introduce a new type of 
`MemSpaceRegion`, as a memory block is a subset of memory space. Both the 
`Decl` and `Expr` based regions produce dynamically changing stuff, that is why 
I would use the concept of `AllocaRegion` with an `Expr` or a `Decl` of the 
allocation as a known identifier.

I cannot write that down: `str.size() == getDynamicSize(str)` or `char *cstr; 
strlen(cstr) == getDynamicSize(cstr)`, where the VarRegion being a pointer to 
the data has a size of a pointer, but I am interested in the block of memory it 
points to. So, even I could remove the entire `AllocaRegion`, I think it is 
great to distinguish between a pointer and a block of memory. Also this entire 
dynamic-size business would be based on `MemoryBlockRegions`, which I believe a 
good way to isolate a new concept with creating new types of regions, and not 
to stress the current model.

I have mixed feelings how would we model a `string`, and connect it with the 
current model. I think that would be neat to express like that:
`malloc(str.size() + 1)` == 
`MemoryBlockExprRegion(getDynamicSize(MemoryBlockDeclRegion(VarRegion(str))) + 
1)`, I believe.

Here is an example why I am interested in strings. Here is a function from 
LevelDB which does not inject the null-terminator:
(https://github.com/google/leveldb/blob/master/db/c.cc)

  static char* CopyString(const std::string& str) {
    char* result = reinterpret_cast<char*>(malloc(sizeof(char) * str.size()));
    memcpy(result, str.data(), sizeof(char) * str.size());
    return result;
  }

The proper way to copy a string is:

  static char* CopyString(const std::string& str) {
    char* result = reinterpret_cast<char*>(malloc(str.size() + 1));
    strcpy(result, str.data());
    return result;
  }

Also, this would be the first step to handle everything in the `string.c` test 
file, which is not the goal. The goal is the size of a memory block.



================
Comment at: clang/test/Analysis/pr22954.c:628
   clang_analyzer_eval(m29[j].s3[k] == 1); // expected-warning{{TRUE}}
+  // expected-warning@-1 {{Potential leak of memory pointed to by field 's4'}}
   clang_analyzer_eval(l29->s1[m] == 2); // expected-warning{{UNKNOWN}}
----------------
NoQ wrote:
> See the other part of the FIXME: `But not on the previous line`! The symbol 
> is not yet dead on line 628; i specifically left this comment around because 
> this test often gets "fixed" incorrectly.
Well, this is not on the previous line, but I will mark this as unresolved 
then. In my mind with that new concept we see that we have an eternal block of 
memory, which we needs to explicitly `free()`.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68725/new/

https://reviews.llvm.org/D68725



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to