earnol planned changes to this revision.
earnol added inline comments.

================
Comment at: clang/lib/StaticAnalyzer/Core/RegionStore.cpp:564-565
 
-  SVal getBindingForElement(RegionBindingsConstRef B, const ElementRegion *R);
+  SVal getBindingForElement(RegionBindingsConstRef B, const ElementRegion *R,
+                            QualType T = QualType());
 
----------------
NoQ wrote:
> There's a bit of a confusing tradition here, mostly documented in my unsorted 
> long rants in earlier patches, let me try to explain.
> 
> `RegionStore` was built to recognize that type punning is a thing. You can 
> write a value while treating the target location/region R as if it's supposed 
> to hold a value of type T₁ and then load the value by treating the same 
> memory location as if it holds a value of type T₂≠T₁. You can achieve it by 
> having unions or by reinterpret-casting pointers. RegionStore needs to handle 
> this reasonably well, no matter what T₁ and T₂ are. Even though strict 
> aliasing rules put some restrictions on what T₁ and T₂ could be, the analyzer 
> is still required to work correctly when the programmer actively relies on 
> `-fno-strict-aliasing`.
> 
> On top of that, every `MemRegion` R can have a type T(R) associated with it 
> (accessed as `R->getValueType()`). This was probably a bad idea from the 
> start, for at least three reasons:
> - Again, type punning exists. Just because the region has a type, doesn't 
> mean it holds a value of that type.
> - In many cases we deal with regions with no associated type. For example, an 
> unknown (symbolic) pointer of type `void *` definitely points to a certain 
> memory region (we call it `SymbolicRegion`), but the type of the object in 
> that region is most certainly not `void`.
> - Finally, polymorphism: C++ pointers to base class type may actually point 
> to objects of any derived class type. Not only the specific derived type is 
> often unknown, but also the true derived type may not be necessarily 
> available in the current translation unit, so we often don't have the AST to 
> even describe it.
> 
> This makes the type of `MemRegion` largely immaterial and hard to rely upon. 
> Which is why `State->getSVal(R)` accepts an optional parameter T₂ which may 
> be different from both T₁ and T(R).
> 
> However, after some back-and-forth bugfixing, we've settled on agreeing that 
> when T(R) exists, it doesn't make sense to pass any T₂ into 
> `State->getSVal(R, T₂)` other than `T₂=T(R)`. Basically, if R already carries 
> a type, then you don't have a good reason to pass a different type as T₂, 
> given that you can always pass a different region R' instead, that represents 
> the same memory location as R but has the desired value type T₂. So T₂ is 
> only required when it's fundamentally impossible to discover the type from 
> the region. So in order to eliminate the confusion, `getSVal()` tries to get 
> rid of the extra parameter `T₂` as early as possible, and have a single 
> source of truth going further down the call stack.
> 
> I honestly no longer think this is a healthy strategy long-term. Ideally I 
> think we should get rid of region types entirely; they cause more problems 
> than they solve. But that's a much bigger change, and I'm worried that a 
> partial step in this direction, not supported by any strategic vision, will 
> only multiply confusion.
> 
> So, long story short, given that `ElementRegion`s are always typed, I'm 
> wondering if it's easier to simply recast the region in your case as well, 
> instead of reintroducing such duplication of types.
You are right.
Let me explain the situation. The initialization list does not always correct 
and full initialize variable in it's entirety. This is the issue i want to 
address. 
You may say having `T(R)` is good enough and in case of functions like `memcpy` 
it exists always, but the problem i want to address is whether the original 
memory region was fully and properly initialized or not. 
The proposed solution is neither full nor complete, it just slightly improve 
situation when we have structs and array intertwined.
So in some cases existence `T(R)` is not something what we want  as `T₂` is 
indeed different.
Please also allow me amend the statement: `ElementRegions` are always typed. 
While they are do typed they are not always correctly typed.
For example what type will have following initialization list: `{ {1, 2}, {3,4} 
}`?
It can be array of 2 structures, or just 1 structure with 2 structures inside 
or it can be 2-dimentsional array. For this purpose we need to know true `T₂` 
while constructing and checking relevant SVal.



================
Comment at: clang/lib/StaticAnalyzer/Core/RegionStore.cpp:1920
+      if (ElemExpr) {
+        return svalBuilder.makeCompoundVal(VarT, InitList);
+      }
----------------
NoQ wrote:
> The lifetime of `InitList` is the same as lifetime of `factory`. Because 
> `factory` is a local variable, this is going to be a use-after-free. You need 
> to use the factory that lives inside `svalBuilder` instead, accessible 
> through methods `getEmptySValList ()` and `prependSVal()` of 
> `svalBuilder.getBasicValueFactory()`.
Thank you for the great pointer. It makes sense to make changes this way.
From my perspective `makeCompoundVal `accepted second parameter by value which 
made me bold enough to use local `InitList`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144977/new/

https://reviews.llvm.org/D144977

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to