NoQ created this revision.
NoQ added reviewers: dcoughlin, xazax.hun, a.sidorin, george.karpenkov, szepet,
rnkovacs.
Herald added subscribers: cfe-commits, baloghadamsoftware.
As the old assertion says,
> "Thou shalt not initialize the same memory twice."
Namely, the `State->bindDefault()`/`StoreManager.BindDefault()` API has the
purpose of overriding the initial value with which memory is initialized. For
instance, a checker can express that loading from freshly malloc()ed memory
would yield Undefined, while loading from freshly calloc()ed memory would yield
zero, even though loading from a normal symbolic region would yield a
region-value symbol. `BindDefault` never was meant for binding arbitrary
default values to arbitrary locations (though it was never doxygen'ed
anywhere). For that reason it has the aforementioned assertion.
The assertion is actually a bit more relaxed than that; it simply says that the
new default binding should never overwrite an existing binding key, which means
that not only the base region but also the offset should coincide in order to
trigger the assertion.
Other sorts of default store bindings are created by different APIs. Namely,
there's also the invalidation case, which consists in default-binding a
conjured symbol, and it is performed by
`State->invalidateRegions()`/`StoreManager.invalidateRegions()` which never
calls `BindDefault` but instead manipulates bindings directly.
Now, when it comes to C++ zeroing constructors (i.e. non-user-supplied
constructors that boil down to zero initialization of the object), a lot of
weird things can happen, especially when such constructors are applied to
fields or base objects, especially with multiple inheritance, especially with
default inheritance from empty objects (objects with no fields, eg. a-la java
interfaces). The same base object would undergo multiple initializations for
every zeroing constructor of a field or a base. And due to empty fields or
bases, it may happen that such bindings will have the exact same offset. It may
also happen that the new default binding would conflict an old direct binding,
as i demonstrated on one of the newly added tests.
It seems that C++ zero-initialization is quite different from normal
`BindDefault`'s idea of setting the initial/default value. Therefore i propose
to split up the `BindDefault` API into two parts: `BindDefaultInitial()` for
setting initial/default values and `BindDefaultZero()` for performing C++-like
zero initialization specifically. This would leave us with 3 different APIs for
manipulating default bindings, but i guess it's not that bad because these
different APIs are quite good at expressing the coder's intent while hiding
internals of the `RegionStore`, and their implementations are significantly
different.
----
Now, recall that there was a previous attempt to relax that assertion in
https://reviews.llvm.org/rC190530, for the same reason. The separation of APIs
proposed here would allow us to restore the historical assertion in its
original strength, while accepting that the assertion doesn't cover C++
zero-initialization. I'm not seeing a good way to cover the zero-initialization
case with that, and i'm not convinced that it's necessary.
The interesting part here is, however, the special-case behavior that was
chosen on the path on which the assertion did not hold. It was pointed out by
Jordan that if a zeroing constructor follows a conservatively evaluated
empty-base constructor, it is more correct to preserve the default conjured
symbol produced by the empty-base constructor than to replace it with a zero
default binding that would completely overwrite the conjured symbol, since //we
don't model store binding extents//. This is more correct because
zero-initializer should only cover its respective sub-object, but other parts
of the big object must remain invalidated.
In the newly added tests i point out that keeping the original conjured symbol
in that case would also not be fully safe. The actually-safe behavior would be
to invalidate the object again, because it's no longer the same unknown value
that we previously had. So i'm adding a bunch of tests in which we take various
field values in different weird moments of time and see if we aren't
accidentally assuming that they're the same known or unknown values as the ones
taken in different moments of time.
These tests now also demonstrate another bug that would also make our store
bindings incorrectly, namely by looking at `ASTRecordLayout` it is not always
apparent what part of the object is going to be overwritten by making a binding
to a specific sub-region. See the new architecture-specific tests in which
field `z` misbehaves.
Long story short, `RegionStore` is modeling zero initializations imperfectly
for at least two reasons.
We could work around that by invalidating object contents every time we detect
something fishy. The only reasonable detection i can come up with would be to
detect if we have a coinciding binding (i.e. what the assertion was previously
protecting us from), eg.:
if (B.getDefaultBinding(R) || B.getDirectBinding(R))
V = UnknownVal();
This detection is imperfect - there are tests that would still fail. In
particular it won't fix the `ASTRecordLayout` problem. And i also suspect that
it really destroys a lot more good data than bad data. So i'll be experimenting
with that now. In the current version of the patch i remove the special case
handling and simply bind the default value blindly even if i know that
`RegionStore` can't handle it precisely.
Repository:
rC Clang
https://reviews.llvm.org/D46368
Files:
include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
lib/StaticAnalyzer/Checkers/MallocChecker.cpp
lib/StaticAnalyzer/Core/ExprEngine.cpp
lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
lib/StaticAnalyzer/Core/ProgramState.cpp
lib/StaticAnalyzer/Core/RegionStore.cpp
lib/StaticAnalyzer/Core/Store.cpp
test/Analysis/ctor.mm
Index: test/Analysis/ctor.mm
===================================================================
--- test/Analysis/ctor.mm
+++ test/Analysis/ctor.mm
@@ -1,5 +1,7 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify %s
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify -DTEST_INLINABLE_ALLOCATORS %s
+// RUN: %clang_analyze_cc1 -triple i386-apple-darwin10 -DI386 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify %s
+// RUN: %clang_analyze_cc1 -triple i386-apple-darwin10 -DI386 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify -DTEST_INLINABLE_ALLOCATORS %s
+// RUN: %clang_analyze_cc1 -triple x86_64-apple-darwin12 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify %s
+// RUN: %clang_analyze_cc1 -triple x86_64-apple-darwin12 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify -DTEST_INLINABLE_ALLOCATORS %s
#include "Inputs/system-header-simulator-cxx.h"
@@ -638,21 +640,68 @@
class Empty {
public:
- Empty();
+ static int glob;
+ Empty(); // No body.
+ Empty(int x); // Body below.
};
class PairContainer : public Empty {
- raw_pair p;
public:
+ raw_pair p;
+ int q;
PairContainer() : Empty(), p() {
// This previously caused a crash because the empty base class looked
// like an initialization of 'p'.
}
PairContainer(int) : Empty(), p() {
// Test inlining something else here.
}
+ PairContainer(double): Empty(1), p() {
+ clang_analyzer_eval(p.p1 == 0); // expected-warning{{TRUE}}
+ clang_analyzer_eval(p.p2 == 0); // expected-warning{{TRUE}}
+
+ clang_analyzer_eval(q == 1); // expected-warning{{TRUE}}
+
+ // This one's indeed UNKNOWN. Definitely not TRUE.
+ clang_analyzer_eval(p.p2 == glob); // expected-warning{{UNKNOWN}}
+ }
};
+ Empty::Empty(int x) {
+ static_cast<PairContainer *>(this)->p.p1 = x;
+ static_cast<PairContainer *>(this)->q = x;
+ // Our static member will store the old garbage values of fields that aren't
+ // yet initialized. It's not certainly garbage though (i.e. the constructor
+ // could have been called on an initialized piece of memory), so no
+ // uninitialized value warning here, and it should be a symbol, not
+ // undefined value, for later comparison.
+ glob = static_cast<PairContainer *>(this)->p.p2;
+ }
+
+ class Empty2 {
+ public:
+ static int glob_p1, glob_p2;
+ Empty2(); // Body below.
+ };
+
+ class PairDoubleEmptyContainer: public Empty, public Empty2 {
+ public:
+ raw_pair p;
+ PairDoubleEmptyContainer(): Empty(), Empty2(), p() {
+ clang_analyzer_eval(p.p1 == 0); // expected-warning{{TRUE}}
+ clang_analyzer_eval(p.p2 == 0); // expected-warning{{TRUE}}
+
+ // This is indeed UNKNOWN.
+ clang_analyzer_eval(p.p1 == glob_p1); // expected-warning{{UNKNOWN}}
+ clang_analyzer_eval(p.p2 == glob_p2); // expected-warning{{UNKNOWN}}
+ }
+ };
+
+ Empty2::Empty2() {
+ glob_p1 = static_cast<PairDoubleEmptyContainer *>(this)->p.p1;
+ glob_p2 = static_cast<PairDoubleEmptyContainer *>(this)->p.p2;
+ }
+
class PairContainerContainer {
int padding;
PairContainer pc;
@@ -749,3 +798,67 @@
clang_analyzer_eval(d2.x == 1); // expected-warning{{TRUE}}
}
}
+
+namespace vbase_zero_init {
+class A {
+ virtual void foo();
+};
+
+class B {
+ virtual void bar();
+public:
+ static int glob_y, glob_z, glob_w;
+ int x;
+ B(); // Body below.
+};
+
+class C : virtual public A {
+public:
+ int y;
+};
+
+class D : public B, public C {
+public:
+ // 'z', unlike 'w', resides in an area that would have been within padding of
+ // base class 'C' if it wasn't part of 'D', but only on 64-bit systems.
+ int z, w;
+ // Initialization order: A(), B(), C().
+ D() : A(), C() {
+ clang_analyzer_eval(x == 1); // expected-warning{{TRUE}}
+ clang_analyzer_eval(y == 0); // expected-warning{{TRUE}}
+#ifdef I386
+ clang_analyzer_eval(z == 3); // expected-warning{{TRUE}}
+#else
+ // FIXME: Should be TRUE. Initialized in B().
+ clang_analyzer_eval(z == 3); // expected-warning{{UNKNOWN}}
+#endif
+ clang_analyzer_eval(w == 4); // expected-warning{{TRUE}}
+
+ // FIXME: Should be UNKNOWN. Changed in B() since glob_y was assigned.
+ clang_analyzer_eval(y == glob_y); // expected-warning{{TRUE}}
+
+#ifdef I386
+ clang_analyzer_eval(z == glob_z); // expected-warning{{UNKNOWN}}
+#else
+ // FIXME: Should be UNKNOWN. Changed in B() since glob_z was assigned.
+ clang_analyzer_eval(z == glob_z); // expected-warning{{TRUE}}
+#endif
+
+ clang_analyzer_eval(w == glob_w); // expected-warning{{UNKNOWN}}
+ } // no-crash
+};
+
+B::B() : x(1) {
+ // Our static members will store the old garbage values of fields that aren't
+ // yet initialized. These aren't certainly garbage though (i.e. the
+ // constructor could have been called on an initialized piece of memory),
+ // so no uninitialized value warning here, and these should be symbols, not
+ // undefined values, for later comparison.
+ glob_y = static_cast<D *>(this)->y;
+ glob_z = static_cast<D *>(this)->z;
+ glob_w = static_cast<D *>(this)->w;
+ static_cast<D *>(this)->y = 2;
+ static_cast<D *>(this)->z = 3;
+ static_cast<D *>(this)->w = 4;
+}
+}
Index: lib/StaticAnalyzer/Core/Store.cpp
===================================================================
--- lib/StaticAnalyzer/Core/Store.cpp
+++ lib/StaticAnalyzer/Core/Store.cpp
@@ -65,10 +65,6 @@
return MRMgr.getElementRegion(EleTy, idx, Base, svalBuilder.getContext());
}
-StoreRef StoreManager::BindDefault(Store store, const MemRegion *R, SVal V) {
- return StoreRef(store, *this);
-}
-
const ElementRegion *StoreManager::GetElementZeroRegion(const SubRegion *R,
QualType T) {
NonLoc idx = svalBuilder.makeZeroArrayIndex();
Index: lib/StaticAnalyzer/Core/RegionStore.cpp
===================================================================
--- lib/StaticAnalyzer/Core/RegionStore.cpp
+++ lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -409,8 +409,22 @@
RegionBindingsRef bind(RegionBindingsConstRef B, Loc LV, SVal V);
- // BindDefault is only used to initialize a region with a default value.
- StoreRef BindDefault(Store store, const MemRegion *R, SVal V) override {
+ // BindDefaultInitial is only used to initialize a region with
+ // a default value.
+ StoreRef BindDefaultInitial(Store store, const MemRegion *R,
+ SVal V) override {
+ RegionBindingsRef B = getRegionBindings(store);
+ // Use other APIs when you have to wipe the region that was initialized
+ // earlier.
+ assert(!(B.getDefaultBinding(R) || B.getDirectBinding(R)) &&
+ "Double initialization!");
+ B = B.addBinding(BindingKey::Make(R, BindingKey::Default), V);
+ return StoreRef(B.asImmutableMap().getRootWithoutRetain(), *this);
+ }
+
+ // BindDefaultZero is used for zeroing constructors that may accidentally
+ // overwrite existing bindings.
+ StoreRef BindDefaultZero(Store store, const MemRegion *R) override {
// FIXME: The offsets of empty bases can be tricky because of
// of the so called "empty base class optimization".
// If a base class has been optimized out
@@ -420,24 +434,14 @@
// and trying to infer them from offsets/alignments
// seems to be error-prone and non-trivial because of the trailing padding.
// As a temporary mitigation we don't create bindings for empty bases.
- if (R->getKind() == MemRegion::CXXBaseObjectRegionKind &&
- cast<CXXBaseObjectRegion>(R)->getDecl()->isEmpty())
- return StoreRef(store, *this);
+ if (const auto *BR = dyn_cast<CXXBaseObjectRegion>(R))
+ if (BR->getDecl()->isEmpty())
+ return StoreRef(store, *this);
RegionBindingsRef B = getRegionBindings(store);
- assert(!B.lookup(R, BindingKey::Direct));
-
- BindingKey Key = BindingKey::Make(R, BindingKey::Default);
- if (B.lookup(Key)) {
- const SubRegion *SR = cast<SubRegion>(R);
- assert(SR->getAsOffset().getOffset() ==
- SR->getSuperRegion()->getAsOffset().getOffset() &&
- "A default value must come from a super-region");
- B = removeSubRegionBindings(B, SR);
- } else {
- B = B.addBinding(Key, V);
- }
-
+ SVal V = svalBuilder.makeZeroVal(Ctx.CharTy);
+ B = removeSubRegionBindings(B, cast<SubRegion>(R));
+ B = B.addBinding(BindingKey::Make(R, BindingKey::Default), V);
return StoreRef(B.asImmutableMap().getRootWithoutRetain(), *this);
}
Index: lib/StaticAnalyzer/Core/ProgramState.cpp
===================================================================
--- lib/StaticAnalyzer/Core/ProgramState.cpp
+++ lib/StaticAnalyzer/Core/ProgramState.cpp
@@ -126,16 +126,27 @@
return newState;
}
-ProgramStateRef ProgramState::bindDefault(SVal loc,
- SVal V,
- const LocationContext *LCtx) const {
+ProgramStateRef
+ProgramState::bindDefaultInitial(SVal loc, SVal V,
+ const LocationContext *LCtx) const {
+ ProgramStateManager &Mgr = getStateManager();
+ const MemRegion *R = loc.castAs<loc::MemRegionVal>().getRegion();
+ const StoreRef &newStore = Mgr.StoreMgr->BindDefaultInitial(getStore(), R, V);
+ ProgramStateRef new_state = makeWithStore(newStore);
+ return Mgr.getOwningEngine()
+ ? Mgr.getOwningEngine()->processRegionChange(new_state, R, LCtx)
+ : new_state;
+}
+
+ProgramStateRef
+ProgramState::bindDefaultZero(SVal loc, const LocationContext *LCtx) const {
ProgramStateManager &Mgr = getStateManager();
const MemRegion *R = loc.castAs<loc::MemRegionVal>().getRegion();
- const StoreRef &newStore = Mgr.StoreMgr->BindDefault(getStore(), R, V);
+ const StoreRef &newStore = Mgr.StoreMgr->BindDefaultZero(getStore(), R);
ProgramStateRef new_state = makeWithStore(newStore);
- return Mgr.getOwningEngine() ?
- Mgr.getOwningEngine()->processRegionChange(new_state, R, LCtx) :
- new_state;
+ return Mgr.getOwningEngine()
+ ? Mgr.getOwningEngine()->processRegionChange(new_state, R, LCtx)
+ : new_state;
}
typedef ArrayRef<const MemRegion *> RegionList;
Index: lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
===================================================================
--- lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
+++ lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
@@ -375,9 +375,6 @@
I != E; ++I) {
ProgramStateRef State = (*I)->getState();
if (CE->requiresZeroInitialization()) {
- // Type of the zero doesn't matter.
- SVal ZeroVal = svalBuilder.makeZeroVal(getContext().CharTy);
-
// FIXME: Once we properly handle constructors in new-expressions, we'll
// need to invalidate the region before setting a default value, to make
// sure there aren't any lingering bindings around. This probably needs
@@ -390,7 +387,7 @@
// actually make things worse. Placement new makes this tricky as well,
// since it's then possible to be initializing one part of a multi-
// dimensional array.
- State = State->bindDefault(loc::MemRegionVal(Target), ZeroVal, LCtx);
+ State = State->bindDefaultZero(loc::MemRegionVal(Target), LCtx);
}
State = addAllNecessaryTemporaryInfo(State, CC, LCtx, Target);
Index: lib/StaticAnalyzer/Core/ExprEngine.cpp
===================================================================
--- lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -348,7 +348,9 @@
break;
case SubobjectAdjustment::MemberPointerAdjustment:
// FIXME: Unimplemented.
- State = State->bindDefault(Reg, UnknownVal(), LC);
+ State = State->invalidateRegions(Reg, InitWithAdjustments,
+ currBldrCtx->blockCount(), LC, true,
+ nullptr, nullptr, nullptr);
return State;
}
}
Index: lib/StaticAnalyzer/Checkers/MallocChecker.cpp
===================================================================
--- lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -1273,7 +1273,7 @@
State = State->BindExpr(CE, C.getLocationContext(), RetVal);
// Fill the region with the initialization value.
- State = State->bindDefault(RetVal, Init, LCtx);
+ State = State->bindDefaultInitial(RetVal, Init, LCtx);
// Set the region's extent equal to the Size parameter.
const SymbolicRegion *R =
Index: include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
===================================================================
--- include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
+++ include/clang/StaticAnalyzer/Core/PathSensitive/Store.h
@@ -107,7 +107,15 @@
/// by \c val bound to the location given for \c loc.
virtual StoreRef Bind(Store store, Loc loc, SVal val) = 0;
- virtual StoreRef BindDefault(Store store, const MemRegion *R, SVal V);
+ /// Return a store with the specified value bound to all sub-regions of the
+ /// region. The region must not have previous bindings. If you need to
+ /// invalidate existing bindings, consider invalidateRegions().
+ virtual StoreRef BindDefaultInitial(Store store, const MemRegion *R,
+ SVal V) = 0;
+
+ /// Return a store with in which all values within the given region are
+ /// reset to zero. This method is allowed to overwrite previous bindings.
+ virtual StoreRef BindDefaultZero(Store store, const MemRegion *R) = 0;
/// \brief Create a new store with the specified binding removed.
/// \param ST the original store, that is the basis for the new store.
Index: include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
===================================================================
--- include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
+++ include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h
@@ -242,7 +242,18 @@
ProgramStateRef bindLoc(SVal location, SVal V, const LocationContext *LCtx) const;
- ProgramStateRef bindDefault(SVal loc, SVal V, const LocationContext *LCtx) const;
+ /// Initializes the region of memory represented by \p loc with an initial
+ /// value. Once initialized, all values loaded from any sub-regions of that
+ /// region will be equal to \p V, unless overwritten later by the program.
+ /// This method should not be used on regions that are already initialized.
+ /// If you need to indicate that memory contents have suddenly become unknown
+ /// within a certain region of memory, consider invalidateRegions().
+ ProgramStateRef bindDefaultInitial(SVal loc, SVal V,
+ const LocationContext *LCtx) const;
+
+ /// Performs C++ zero-initialization procedure on the region of memory
+ /// represented by \p loc.
+ ProgramStateRef bindDefaultZero(SVal loc, const LocationContext *LCtx) const;
ProgramStateRef killBinding(Loc LV) const;
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits