NoQ created this revision. NoQ added reviewers: dcoughlin, xazax.hun, a.sidorin, george.karpenkov, szepet, rnkovacs. Herald added subscribers: cfe-commits, baloghadamsoftware.
As the old assertion says, > "Thou shalt not initialize the same memory twice." Namely, the `State->bindDefault()`/`StoreManager.BindDefault()` API has the purpose of overriding the initial value with which memory is initialized. For instance, a checker can express that loading from freshly malloc()ed memory would yield Undefined, while loading from freshly calloc()ed memory would yield zero, even though loading from a normal symbolic region would yield a region-value symbol. `BindDefault` never was meant for binding arbitrary default values to arbitrary locations (though it was never doxygen'ed anywhere). For that reason it has the aforementioned assertion. The assertion is actually a bit more relaxed than that; it simply says that the new default binding should never overwrite an existing binding key, which means that not only the base region but also the offset should coincide in order to trigger the assertion. Other sorts of default store bindings are created by different APIs. Namely, there's also the invalidation case, which consists in default-binding a conjured symbol, and it is performed by `State->invalidateRegions()`/`StoreManager.invalidateRegions()` which never calls `BindDefault` but instead manipulates bindings directly. Now, when it comes to C++ zeroing constructors (i.e. non-user-supplied constructors that boil down to zero initialization of the object), a lot of weird things can happen, especially when such constructors are applied to fields or base objects, especially with multiple inheritance, especially with default inheritance from empty objects (objects with no fields, eg. a-la java interfaces). The same base object would undergo multiple initializations for every zeroing constructor of a field or a base. And due to empty fields or bases, it may happen that such bindings will have the exact same offset. It may also happen that the new default binding would conflict an old direct binding, as i demonstrated on one of the newly added tests. It seems that C++ zero-initialization is quite different from normal `BindDefault`'s idea of setting the initial/default value. Therefore i propose to split up the `BindDefault` API into two parts: `BindDefaultInitial()` for setting initial/default values and `BindDefaultZero()` for performing C++-like zero initialization specifically. This would leave us with 3 different APIs for manipulating default bindings, but i guess it's not that bad because these different APIs are quite good at expressing the coder's intent while hiding internals of the `RegionStore`, and their implementations are significantly different. ---- Now, recall that there was a previous attempt to relax that assertion in https://reviews.llvm.org/rC190530, for the same reason. The separation of APIs proposed here would allow us to restore the historical assertion in its original strength, while accepting that the assertion doesn't cover C++ zero-initialization. I'm not seeing a good way to cover the zero-initialization case with that, and i'm not convinced that it's necessary. The interesting part here is, however, the special-case behavior that was chosen on the path on which the assertion did not hold. It was pointed out by Jordan that if a zeroing constructor follows a conservatively evaluated empty-base constructor, it is more correct to preserve the default conjured symbol produced by the empty-base constructor than to replace it with a zero default binding that would completely overwrite the conjured symbol, since //we don't model store binding extents//. This is more correct because zero-initializer should only cover its respective sub-object, but other parts of the big object must remain invalidated. In the newly added tests i point out that keeping the original conjured symbol in that case would also not be fully safe. The actually-safe behavior would be to invalidate the object again, because it's no longer the same unknown value that we previously had. So i'm adding a bunch of tests in which we take various field values in different weird moments of time and see if we aren't accidentally assuming that they're the same known or unknown values as the ones taken in different moments of time. These tests now also demonstrate another bug that would also make our store bindings incorrectly, namely by looking at `ASTRecordLayout` it is not always apparent what part of the object is going to be overwritten by making a binding to a specific sub-region. See the new architecture-specific tests in which field `z` misbehaves. Long story short, `RegionStore` is modeling zero initializations imperfectly for at least two reasons. We could work around that by invalidating object contents every time we detect something fishy. The only reasonable detection i can come up with would be to detect if we have a coinciding binding (i.e. what the assertion was previously protecting us from), eg.: if (B.getDefaultBinding(R) || B.getDirectBinding(R)) V = UnknownVal(); This detection is imperfect - there are tests that would still fail. In particular it won't fix the `ASTRecordLayout` problem. And i also suspect that it really destroys a lot more good data than bad data. So i'll be experimenting with that now. In the current version of the patch i remove the special case handling and simply bind the default value blindly even if i know that `RegionStore` can't handle it precisely. Repository: rC Clang https://reviews.llvm.org/D46368 Files: include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h include/clang/StaticAnalyzer/Core/PathSensitive/Store.h lib/StaticAnalyzer/Checkers/MallocChecker.cpp lib/StaticAnalyzer/Core/ExprEngine.cpp lib/StaticAnalyzer/Core/ExprEngineCXX.cpp lib/StaticAnalyzer/Core/ProgramState.cpp lib/StaticAnalyzer/Core/RegionStore.cpp lib/StaticAnalyzer/Core/Store.cpp test/Analysis/ctor.mm
Index: test/Analysis/ctor.mm =================================================================== --- test/Analysis/ctor.mm +++ test/Analysis/ctor.mm @@ -1,5 +1,7 @@ -// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify %s -// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify -DTEST_INLINABLE_ALLOCATORS %s +// RUN: %clang_analyze_cc1 -triple i386-apple-darwin10 -DI386 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify %s +// RUN: %clang_analyze_cc1 -triple i386-apple-darwin10 -DI386 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify -DTEST_INLINABLE_ALLOCATORS %s +// RUN: %clang_analyze_cc1 -triple x86_64-apple-darwin12 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify %s +// RUN: %clang_analyze_cc1 -triple x86_64-apple-darwin12 -analyzer-checker=core,debug.ExprInspection -fobjc-arc -analyzer-config c++-inlining=constructors -Wno-null-dereference -std=c++11 -verify -DTEST_INLINABLE_ALLOCATORS %s #include "Inputs/system-header-simulator-cxx.h" @@ -638,21 +640,68 @@ class Empty { public: - Empty(); + static int glob; + Empty(); // No body. + Empty(int x); // Body below. }; class PairContainer : public Empty { - raw_pair p; public: + raw_pair p; + int q; PairContainer() : Empty(), p() { // This previously caused a crash because the empty base class looked // like an initialization of 'p'. } PairContainer(int) : Empty(), p() { // Test inlining something else here. } + PairContainer(double): Empty(1), p() { + clang_analyzer_eval(p.p1 == 0); // expected-warning{{TRUE}} + clang_analyzer_eval(p.p2 == 0); // expected-warning{{TRUE}} + + clang_analyzer_eval(q == 1); // expected-warning{{TRUE}} + + // This one's indeed UNKNOWN. Definitely not TRUE. + clang_analyzer_eval(p.p2 == glob); // expected-warning{{UNKNOWN}} + } }; + Empty::Empty(int x) { + static_cast<PairContainer *>(this)->p.p1 = x; + static_cast<PairContainer *>(this)->q = x; + // Our static member will store the old garbage values of fields that aren't + // yet initialized. It's not certainly garbage though (i.e. the constructor + // could have been called on an initialized piece of memory), so no + // uninitialized value warning here, and it should be a symbol, not + // undefined value, for later comparison. + glob = static_cast<PairContainer *>(this)->p.p2; + } + + class Empty2 { + public: + static int glob_p1, glob_p2; + Empty2(); // Body below. + }; + + class PairDoubleEmptyContainer: public Empty, public Empty2 { + public: + raw_pair p; + PairDoubleEmptyContainer(): Empty(), Empty2(), p() { + clang_analyzer_eval(p.p1 == 0); // expected-warning{{TRUE}} + clang_analyzer_eval(p.p2 == 0); // expected-warning{{TRUE}} + + // This is indeed UNKNOWN. + clang_analyzer_eval(p.p1 == glob_p1); // expected-warning{{UNKNOWN}} + clang_analyzer_eval(p.p2 == glob_p2); // expected-warning{{UNKNOWN}} + } + }; + + Empty2::Empty2() { + glob_p1 = static_cast<PairDoubleEmptyContainer *>(this)->p.p1; + glob_p2 = static_cast<PairDoubleEmptyContainer *>(this)->p.p2; + } + class PairContainerContainer { int padding; PairContainer pc; @@ -749,3 +798,67 @@ clang_analyzer_eval(d2.x == 1); // expected-warning{{TRUE}} } } + +namespace vbase_zero_init { +class A { + virtual void foo(); +}; + +class B { + virtual void bar(); +public: + static int glob_y, glob_z, glob_w; + int x; + B(); // Body below. +}; + +class C : virtual public A { +public: + int y; +}; + +class D : public B, public C { +public: + // 'z', unlike 'w', resides in an area that would have been within padding of + // base class 'C' if it wasn't part of 'D', but only on 64-bit systems. + int z, w; + // Initialization order: A(), B(), C(). + D() : A(), C() { + clang_analyzer_eval(x == 1); // expected-warning{{TRUE}} + clang_analyzer_eval(y == 0); // expected-warning{{TRUE}} +#ifdef I386 + clang_analyzer_eval(z == 3); // expected-warning{{TRUE}} +#else + // FIXME: Should be TRUE. Initialized in B(). + clang_analyzer_eval(z == 3); // expected-warning{{UNKNOWN}} +#endif + clang_analyzer_eval(w == 4); // expected-warning{{TRUE}} + + // FIXME: Should be UNKNOWN. Changed in B() since glob_y was assigned. + clang_analyzer_eval(y == glob_y); // expected-warning{{TRUE}} + +#ifdef I386 + clang_analyzer_eval(z == glob_z); // expected-warning{{UNKNOWN}} +#else + // FIXME: Should be UNKNOWN. Changed in B() since glob_z was assigned. + clang_analyzer_eval(z == glob_z); // expected-warning{{TRUE}} +#endif + + clang_analyzer_eval(w == glob_w); // expected-warning{{UNKNOWN}} + } // no-crash +}; + +B::B() : x(1) { + // Our static members will store the old garbage values of fields that aren't + // yet initialized. These aren't certainly garbage though (i.e. the + // constructor could have been called on an initialized piece of memory), + // so no uninitialized value warning here, and these should be symbols, not + // undefined values, for later comparison. + glob_y = static_cast<D *>(this)->y; + glob_z = static_cast<D *>(this)->z; + glob_w = static_cast<D *>(this)->w; + static_cast<D *>(this)->y = 2; + static_cast<D *>(this)->z = 3; + static_cast<D *>(this)->w = 4; +} +} Index: lib/StaticAnalyzer/Core/Store.cpp =================================================================== --- lib/StaticAnalyzer/Core/Store.cpp +++ lib/StaticAnalyzer/Core/Store.cpp @@ -65,10 +65,6 @@ return MRMgr.getElementRegion(EleTy, idx, Base, svalBuilder.getContext()); } -StoreRef StoreManager::BindDefault(Store store, const MemRegion *R, SVal V) { - return StoreRef(store, *this); -} - const ElementRegion *StoreManager::GetElementZeroRegion(const SubRegion *R, QualType T) { NonLoc idx = svalBuilder.makeZeroArrayIndex(); Index: lib/StaticAnalyzer/Core/RegionStore.cpp =================================================================== --- lib/StaticAnalyzer/Core/RegionStore.cpp +++ lib/StaticAnalyzer/Core/RegionStore.cpp @@ -409,8 +409,22 @@ RegionBindingsRef bind(RegionBindingsConstRef B, Loc LV, SVal V); - // BindDefault is only used to initialize a region with a default value. - StoreRef BindDefault(Store store, const MemRegion *R, SVal V) override { + // BindDefaultInitial is only used to initialize a region with + // a default value. + StoreRef BindDefaultInitial(Store store, const MemRegion *R, + SVal V) override { + RegionBindingsRef B = getRegionBindings(store); + // Use other APIs when you have to wipe the region that was initialized + // earlier. + assert(!(B.getDefaultBinding(R) || B.getDirectBinding(R)) && + "Double initialization!"); + B = B.addBinding(BindingKey::Make(R, BindingKey::Default), V); + return StoreRef(B.asImmutableMap().getRootWithoutRetain(), *this); + } + + // BindDefaultZero is used for zeroing constructors that may accidentally + // overwrite existing bindings. + StoreRef BindDefaultZero(Store store, const MemRegion *R) override { // FIXME: The offsets of empty bases can be tricky because of // of the so called "empty base class optimization". // If a base class has been optimized out @@ -420,24 +434,14 @@ // and trying to infer them from offsets/alignments // seems to be error-prone and non-trivial because of the trailing padding. // As a temporary mitigation we don't create bindings for empty bases. - if (R->getKind() == MemRegion::CXXBaseObjectRegionKind && - cast<CXXBaseObjectRegion>(R)->getDecl()->isEmpty()) - return StoreRef(store, *this); + if (const auto *BR = dyn_cast<CXXBaseObjectRegion>(R)) + if (BR->getDecl()->isEmpty()) + return StoreRef(store, *this); RegionBindingsRef B = getRegionBindings(store); - assert(!B.lookup(R, BindingKey::Direct)); - - BindingKey Key = BindingKey::Make(R, BindingKey::Default); - if (B.lookup(Key)) { - const SubRegion *SR = cast<SubRegion>(R); - assert(SR->getAsOffset().getOffset() == - SR->getSuperRegion()->getAsOffset().getOffset() && - "A default value must come from a super-region"); - B = removeSubRegionBindings(B, SR); - } else { - B = B.addBinding(Key, V); - } - + SVal V = svalBuilder.makeZeroVal(Ctx.CharTy); + B = removeSubRegionBindings(B, cast<SubRegion>(R)); + B = B.addBinding(BindingKey::Make(R, BindingKey::Default), V); return StoreRef(B.asImmutableMap().getRootWithoutRetain(), *this); } Index: lib/StaticAnalyzer/Core/ProgramState.cpp =================================================================== --- lib/StaticAnalyzer/Core/ProgramState.cpp +++ lib/StaticAnalyzer/Core/ProgramState.cpp @@ -126,16 +126,27 @@ return newState; } -ProgramStateRef ProgramState::bindDefault(SVal loc, - SVal V, - const LocationContext *LCtx) const { +ProgramStateRef +ProgramState::bindDefaultInitial(SVal loc, SVal V, + const LocationContext *LCtx) const { + ProgramStateManager &Mgr = getStateManager(); + const MemRegion *R = loc.castAs<loc::MemRegionVal>().getRegion(); + const StoreRef &newStore = Mgr.StoreMgr->BindDefaultInitial(getStore(), R, V); + ProgramStateRef new_state = makeWithStore(newStore); + return Mgr.getOwningEngine() + ? Mgr.getOwningEngine()->processRegionChange(new_state, R, LCtx) + : new_state; +} + +ProgramStateRef +ProgramState::bindDefaultZero(SVal loc, const LocationContext *LCtx) const { ProgramStateManager &Mgr = getStateManager(); const MemRegion *R = loc.castAs<loc::MemRegionVal>().getRegion(); - const StoreRef &newStore = Mgr.StoreMgr->BindDefault(getStore(), R, V); + const StoreRef &newStore = Mgr.StoreMgr->BindDefaultZero(getStore(), R); ProgramStateRef new_state = makeWithStore(newStore); - return Mgr.getOwningEngine() ? - Mgr.getOwningEngine()->processRegionChange(new_state, R, LCtx) : - new_state; + return Mgr.getOwningEngine() + ? Mgr.getOwningEngine()->processRegionChange(new_state, R, LCtx) + : new_state; } typedef ArrayRef<const MemRegion *> RegionList; Index: lib/StaticAnalyzer/Core/ExprEngineCXX.cpp =================================================================== --- lib/StaticAnalyzer/Core/ExprEngineCXX.cpp +++ lib/StaticAnalyzer/Core/ExprEngineCXX.cpp @@ -375,9 +375,6 @@ I != E; ++I) { ProgramStateRef State = (*I)->getState(); if (CE->requiresZeroInitialization()) { - // Type of the zero doesn't matter. - SVal ZeroVal = svalBuilder.makeZeroVal(getContext().CharTy); - // FIXME: Once we properly handle constructors in new-expressions, we'll // need to invalidate the region before setting a default value, to make // sure there aren't any lingering bindings around. This probably needs @@ -390,7 +387,7 @@ // actually make things worse. Placement new makes this tricky as well, // since it's then possible to be initializing one part of a multi- // dimensional array. - State = State->bindDefault(loc::MemRegionVal(Target), ZeroVal, LCtx); + State = State->bindDefaultZero(loc::MemRegionVal(Target), LCtx); } State = addAllNecessaryTemporaryInfo(State, CC, LCtx, Target); Index: lib/StaticAnalyzer/Core/ExprEngine.cpp =================================================================== --- lib/StaticAnalyzer/Core/ExprEngine.cpp +++ lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -348,7 +348,9 @@ break; case SubobjectAdjustment::MemberPointerAdjustment: // FIXME: Unimplemented. - State = State->bindDefault(Reg, UnknownVal(), LC); + State = State->invalidateRegions(Reg, InitWithAdjustments, + currBldrCtx->blockCount(), LC, true, + nullptr, nullptr, nullptr); return State; } } Index: lib/StaticAnalyzer/Checkers/MallocChecker.cpp =================================================================== --- lib/StaticAnalyzer/Checkers/MallocChecker.cpp +++ lib/StaticAnalyzer/Checkers/MallocChecker.cpp @@ -1273,7 +1273,7 @@ State = State->BindExpr(CE, C.getLocationContext(), RetVal); // Fill the region with the initialization value. - State = State->bindDefault(RetVal, Init, LCtx); + State = State->bindDefaultInitial(RetVal, Init, LCtx); // Set the region's extent equal to the Size parameter. const SymbolicRegion *R = Index: include/clang/StaticAnalyzer/Core/PathSensitive/Store.h =================================================================== --- include/clang/StaticAnalyzer/Core/PathSensitive/Store.h +++ include/clang/StaticAnalyzer/Core/PathSensitive/Store.h @@ -107,7 +107,15 @@ /// by \c val bound to the location given for \c loc. virtual StoreRef Bind(Store store, Loc loc, SVal val) = 0; - virtual StoreRef BindDefault(Store store, const MemRegion *R, SVal V); + /// Return a store with the specified value bound to all sub-regions of the + /// region. The region must not have previous bindings. If you need to + /// invalidate existing bindings, consider invalidateRegions(). + virtual StoreRef BindDefaultInitial(Store store, const MemRegion *R, + SVal V) = 0; + + /// Return a store with in which all values within the given region are + /// reset to zero. This method is allowed to overwrite previous bindings. + virtual StoreRef BindDefaultZero(Store store, const MemRegion *R) = 0; /// \brief Create a new store with the specified binding removed. /// \param ST the original store, that is the basis for the new store. Index: include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h =================================================================== --- include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h +++ include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h @@ -242,7 +242,18 @@ ProgramStateRef bindLoc(SVal location, SVal V, const LocationContext *LCtx) const; - ProgramStateRef bindDefault(SVal loc, SVal V, const LocationContext *LCtx) const; + /// Initializes the region of memory represented by \p loc with an initial + /// value. Once initialized, all values loaded from any sub-regions of that + /// region will be equal to \p V, unless overwritten later by the program. + /// This method should not be used on regions that are already initialized. + /// If you need to indicate that memory contents have suddenly become unknown + /// within a certain region of memory, consider invalidateRegions(). + ProgramStateRef bindDefaultInitial(SVal loc, SVal V, + const LocationContext *LCtx) const; + + /// Performs C++ zero-initialization procedure on the region of memory + /// represented by \p loc. + ProgramStateRef bindDefaultZero(SVal loc, const LocationContext *LCtx) const; ProgramStateRef killBinding(Loc LV) const;
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits