Szelethus created this revision. Szelethus added reviewers: NoQ, vsavchenko, dcoughlin, baloghadamsoftware, martong, balazske, xazax.hun, steakhal. Szelethus added a project: clang. Herald added subscribers: cfe-commits, ASDenysPetrov, Charusso, gamesh411, dkrupp, donat.nagy, mikhail.ramalho, a.sidorin, rnkovacs, szepet, whisperity. Szelethus requested review of this revision.
Based on the discussion in D82598#2171312 <https://reviews.llvm.org/D82598#2171312>. Thanks @NoQ! Let's talk about how we got here. D82598 <https://reviews.llvm.org/D82598> is titled "Get rid of statement liveness, because such a thing doesn't exist", and indeed, expressions //express// a value, non-expression statements don't. if (a && get() || []{ return true; }()) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ has a value ~ has a value ~~~~~~~~~~ has a value ~~~~~~~~~~~~~~~~~~~~ has a value ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ doesn't have a value That is simple enough, so it would only make sense if we only assigned symbolic values to expressions in the static analyzer. Yet the interface checkers can access presents, among other strange things, the following two methods: ProgramState::BindExpr(const Stmt *S, const LocationContext *LCtx, SVal V, bool Invalidate=true) <https://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html#a1c9e0f3d56ed5206fea62b1fa68c8e94> ProgramState::getSVal(const Stmt *S, const LocationContext *LCtx) <https://clang.llvm.org/doxygen/classclang_1_1ento_1_1ProgramState.html#aaa0ff169cc4a9f18f4dd626cf7201053> So, what gives? Turns out, we make an exception for `ReturnStmt` (which we'll leave for another time) and `ObjCForCollectionStmt`. For any other loops, in order to know whether we should analyze another iteration, among other things, we evaluate it's condition. Which is a problem for `ObjCForCollectionStmt`, because it simply doesn't have one (`CXXForRangeStmt` has <https://clang.llvm.org/doxygen/classclang_1_1CXXForRangeStmt.html#a06e5bf0adad48dddee1d99a9cd324cf4> an implicit one!). In its absence, we assigned the actual statement with a concrete 1 or 0 to indicate whether there are any more iterations left. However, this is wildly incorrect, its just simply not true that the `for` statement has a value of 1 or 0, we can't calculate its liveness because that doesn't make any sense either, so this patch turns it into a GDM trait. This patch isn't quite done (its nowhere near up to my standards in terms of documentation), but I wanted to publish it to avoid sinking too much work into a doomed patch. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D86736 Files: clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp clang/lib/StaticAnalyzer/Core/Environment.cpp clang/lib/StaticAnalyzer/Core/ExprEngine.cpp clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp clang/lib/StaticAnalyzer/Core/ProgramState.cpp clang/lib/StaticAnalyzer/Core/SymbolManager.cpp
Index: clang/lib/StaticAnalyzer/Core/SymbolManager.cpp =================================================================== --- clang/lib/StaticAnalyzer/Core/SymbolManager.cpp +++ clang/lib/StaticAnalyzer/Core/SymbolManager.cpp @@ -14,6 +14,7 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h" #include "clang/AST/ASTContext.h" #include "clang/AST/Expr.h" +#include "clang/AST/StmtObjC.h" #include "clang/Analysis/Analyses/LiveVariables.h" #include "clang/Analysis/AnalysisDeclContext.h" #include "clang/Basic/LLVM.h" @@ -494,7 +495,8 @@ return true; } - // If no statement is provided, everything is this and parent contexts is live. + // If no statement is provided, everything in this and parent contexts are + // live. if (!Loc) return true; Index: clang/lib/StaticAnalyzer/Core/ProgramState.cpp =================================================================== --- clang/lib/StaticAnalyzer/Core/ProgramState.cpp +++ clang/lib/StaticAnalyzer/Core/ProgramState.cpp @@ -11,6 +11,9 @@ //===----------------------------------------------------------------------===// #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h" +#include "clang/AST/ASTDumperUtils.h" +#include "clang/AST/StmtObjC.h" +#include "clang/Analysis/AnalysisDeclContext.h" #include "clang/Analysis/CFG.h" #include "clang/Basic/JsonSupport.h" #include "clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h" @@ -18,6 +21,7 @@ #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicType.h" #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h" #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h" +#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState_Fwd.h" #include "llvm/Support/raw_ostream.h" using namespace clang; @@ -41,7 +45,8 @@ Mgr.freeStates.push_back(s); } } -}} +} // namespace ento +} // namespace clang ProgramState::ProgramState(ProgramStateManager *mgr, const Environment& env, StoreRef st, GenericDataMap gdm) @@ -318,10 +323,36 @@ return getStateManager().getPersistentState(NewSt); } +using ObjCForLctxPair = + std::pair<const ObjCForCollectionStmt *, const LocationContext *>; + +REGISTER_MAP_WITH_PROGRAMSTATE(ObjCForHasMoreIterations, ObjCForLctxPair, bool) + +ProgramStateRef +ProgramState::setWhetherHasMoreIteration(const ObjCForCollectionStmt *O, + const LocationContext *LC, + bool HasMoreIteraton) const { + assert(!contains<ObjCForHasMoreIterations>({O, LC})); + return set<ObjCForHasMoreIterations>({O, LC}, HasMoreIteraton); +} + +ProgramStateRef +ProgramState::removeIterationState(const ObjCForCollectionStmt *O, + const LocationContext *LC) const { + assert(contains<ObjCForHasMoreIterations>({O, LC})); + return remove<ObjCForHasMoreIterations>({O, LC}); +} + +bool ProgramState::hasMoreIteration(const ObjCForCollectionStmt *O, + const LocationContext *LC) const { + assert(contains<ObjCForHasMoreIterations>({O, LC})); + return *get<ObjCForHasMoreIterations>({O, LC}); +} + ProgramStateRef ProgramState::assumeInBound(DefinedOrUnknownSVal Idx, - DefinedOrUnknownSVal UpperBound, - bool Assumption, - QualType indexTy) const { + DefinedOrUnknownSVal UpperBound, + bool Assumption, + QualType indexTy) const { if (Idx.isUnknown() || UpperBound.isUnknown()) return this; Index: clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp =================================================================== --- clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp +++ clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp @@ -53,10 +53,8 @@ ProgramStateRef state = Pred->getState(); const LocationContext *LCtx = Pred->getLocationContext(); - SVal hasElementsV = svalBuilder.makeTruthVal(hasElements); - - // FIXME: S is not an expression. We should not be binding values to it. - ProgramStateRef nextState = state->BindExpr(S, LCtx, hasElementsV); + ProgramStateRef nextState = + state->setWhetherHasMoreIteration(S, LCtx, hasElements); if (auto MV = elementV.getAs<loc::MemRegionVal>()) if (const auto *R = dyn_cast<TypedValueRegion>(MV->getRegion())) { @@ -93,9 +91,10 @@ // (1) binds the next container value to 'element'. This creates a new // node in the ExplodedGraph. // + // FIXME: This is no longer true. // (2) binds the value 0/1 to the ObjCForCollectionStmt* itself, indicating // whether or not the container has any more elements. This value - // will be tested in ProcessBranch. We need to explicitly bind + // will be tested in processBranch. We need to explicitly bind // this value because a container can contain nil elements. // // FIXME: Eventually this logic should actually do dispatches to Index: clang/lib/StaticAnalyzer/Core/ExprEngine.cpp =================================================================== --- clang/lib/StaticAnalyzer/Core/ExprEngine.cpp +++ clang/lib/StaticAnalyzer/Core/ExprEngine.cpp @@ -2129,6 +2129,51 @@ llvm_unreachable("could not resolve condition"); } +static Optional<std::pair<ProgramStateRef, ProgramStateRef>> +assumeCondition(const Stmt *Condition, ExplodedNode *N) { + ProgramStateRef State = N->getState(); + if (const auto *ObjCFor = dyn_cast<ObjCForCollectionStmt>(Condition)) { + bool HasMoreIteraton = State->hasMoreIteration(ObjCFor, N->getLocationContext()); + // Checkers have already ran on branch conditions, so the current + // information as to whether the loop has more iteration becomes outdated + // after this point. + State = State->removeIterationState(ObjCFor, N->getLocationContext()); + if (HasMoreIteraton) + return std::pair<ProgramStateRef, ProgramStateRef>{State, nullptr}; + else + return std::pair<ProgramStateRef, ProgramStateRef>{nullptr, State}; + } + SVal X = State->getSVal(Condition, N->getLocationContext()); + + if (X.isUnknownOrUndef()) { + // Give it a chance to recover from unknown. + if (const auto *Ex = dyn_cast<Expr>(Condition)) { + if (Ex->getType()->isIntegralOrEnumerationType()) { + // Try to recover some path-sensitivity. Right now casts of symbolic + // integers that promote their values are currently not tracked well. + // If 'Condition' is such an expression, try and recover the + // underlying value and use that instead. + SVal recovered = + RecoverCastedSymbol(State, Condition, N->getLocationContext(), + N->getState()->getStateManager().getContext()); + + if (!recovered.isUnknown()) { + X = recovered; + } + } + } + } + + // If the condition is still unknown, give up. + if (X.isUnknownOrUndef()) + return None; + + DefinedSVal V = X.castAs<DefinedSVal>(); + + ProgramStateRef StTrue, StFalse; + return State->assume(V); +} + void ExprEngine::processBranch(const Stmt *Condition, NodeBuilderContext& BldCtx, ExplodedNode *Pred, @@ -2165,48 +2210,28 @@ return; BranchNodeBuilder builder(CheckersOutSet, Dst, BldCtx, DstT, DstF); - for (const auto PredI : CheckersOutSet) { - if (PredI->isSink()) + for (ExplodedNode *PredN : CheckersOutSet) { + if (PredN->isSink()) continue; - ProgramStateRef PrevState = PredI->getState(); - SVal X = PrevState->getSVal(Condition, PredI->getLocationContext()); - - if (X.isUnknownOrUndef()) { - // Give it a chance to recover from unknown. - if (const auto *Ex = dyn_cast<Expr>(Condition)) { - if (Ex->getType()->isIntegralOrEnumerationType()) { - // Try to recover some path-sensitivity. Right now casts of symbolic - // integers that promote their values are currently not tracked well. - // If 'Condition' is such an expression, try and recover the - // underlying value and use that instead. - SVal recovered = RecoverCastedSymbol(PrevState, Condition, - PredI->getLocationContext(), - getContext()); - - if (!recovered.isUnknown()) { - X = recovered; - } - } - } - } + ProgramStateRef PrevState = PredN->getState(); - // If the condition is still unknown, give up. - if (X.isUnknownOrUndef()) { - builder.generateNode(PrevState, true, PredI); - builder.generateNode(PrevState, false, PredI); + ProgramStateRef StTrue, StFalse; + if (const auto KnownCondValueAssumption = assumeCondition(Condition, PredN)) + std::tie(StTrue, StFalse) = *KnownCondValueAssumption; + else { + assert(!isa<ObjCForCollectionStmt>(Condition)); + builder.generateNode(PrevState, true, PredN); + builder.generateNode(PrevState, false, PredN); continue; } - - DefinedSVal V = X.castAs<DefinedSVal>(); - - ProgramStateRef StTrue, StFalse; - std::tie(StTrue, StFalse) = PrevState->assume(V); + if (StTrue && StFalse) + assert(!isa<ObjCForCollectionStmt>(Condition));; // Process the true branch. if (builder.isFeasible(true)) { if (StTrue) - builder.generateNode(StTrue, true, PredI); + builder.generateNode(StTrue, true, PredN); else builder.markInfeasible(true); } @@ -2214,7 +2239,7 @@ // Process the false branch. if (builder.isFeasible(false)) { if (StFalse) - builder.generateNode(StFalse, false, PredI); + builder.generateNode(StFalse, false, PredN); else builder.markInfeasible(false); } Index: clang/lib/StaticAnalyzer/Core/Environment.cpp =================================================================== --- clang/lib/StaticAnalyzer/Core/Environment.cpp +++ clang/lib/StaticAnalyzer/Core/Environment.cpp @@ -15,6 +15,7 @@ #include "clang/AST/ExprCXX.h" #include "clang/AST/PrettyPrinter.h" #include "clang/AST/Stmt.h" +#include "clang/AST/StmtObjC.h" #include "clang/Analysis/AnalysisDeclContext.h" #include "clang/Basic/LLVM.h" #include "clang/Basic/LangOptions.h" @@ -85,6 +86,8 @@ SVal Environment::getSVal(const EnvironmentEntry &Entry, SValBuilder& svalBuilder) const { const Stmt *S = Entry.getStmt(); + assert(!isa<ObjCForCollectionStmt>(S) && + "Use ProgramState::hasMoreIteration()!"); const LocationContext *LCtx = Entry.getLocationContext(); switch (S->getStmtClass()) { Index: clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp =================================================================== --- clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp +++ clang/lib/StaticAnalyzer/Checkers/UndefBranchChecker.cpp @@ -11,6 +11,8 @@ // //===----------------------------------------------------------------------===// +#include "clang/AST/StmtObjC.h" +#include "clang/AST/Type.h" #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h" #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h" #include "clang/StaticAnalyzer/Core/Checker.h" @@ -54,10 +56,13 @@ void checkBranchCondition(const Stmt *Condition, CheckerContext &Ctx) const; }; -} +} // namespace void UndefBranchChecker::checkBranchCondition(const Stmt *Condition, CheckerContext &Ctx) const { + // ObjCForCollection is a loop, but has no actual condition. + if (isa<ObjCForCollectionStmt>(Condition)) + return; SVal X = Ctx.getSVal(Condition); if (X.isUndef()) { // Generate a sink node, which implicitly marks both outgoing branches as Index: clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp =================================================================== --- clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp +++ clang/lib/StaticAnalyzer/Checkers/BasicObjCFoundationChecks.cpp @@ -978,8 +978,7 @@ ProgramStateRef State = C.getState(); // Check if this is the branch for the end of the loop. - SVal CollectionSentinel = C.getSVal(FCS); - if (CollectionSentinel.isZeroConstant()) { + if (!State->hasMoreIteration(FCS, C.getLocationContext())) { if (!alreadyExecutedAtLeastOneLoopIteration(C.getPredecessor(), FCS)) State = assumeCollectionNonEmpty(C, State, FCS, /*Assumption*/false); Index: clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h =================================================================== --- clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h +++ clang/include/clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h @@ -13,6 +13,7 @@ #ifndef LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_PROGRAMSTATE_H #define LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_PROGRAMSTATE_H +#include "clang/Analysis/AnalysisDeclContext.h" #include "clang/Basic/LLVM.h" #include "clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h" #include "clang/StaticAnalyzer/Core/PathSensitive/DynamicTypeInfo.h" @@ -226,12 +227,22 @@ ConditionTruthVal areEqual(SVal Lhs, SVal Rhs) const; /// Utility method for getting regions. - const VarRegion* getRegion(const VarDecl *D, const LocationContext *LC) const; + const VarRegion *getRegion(const VarDecl *D, const LocationContext *LC) const; //==---------------------------------------------------------------------==// // Binding and retrieving values to/from the environment and symbolic store. //==---------------------------------------------------------------------==// + LLVM_NODISCARD ProgramStateRef setWhetherHasMoreIteration( + const ObjCForCollectionStmt *O, const LocationContext *LC, + bool HasMoreIteraton) const; + + LLVM_NODISCARD ProgramStateRef removeIterationState( + const ObjCForCollectionStmt *O, const LocationContext *LC) const; + + LLVM_NODISCARD bool hasMoreIteration(const ObjCForCollectionStmt *O, + const LocationContext *LC) const; + /// Create a new state by binding the value 'V' to the statement 'S' in the /// state's environment. LLVM_NODISCARD ProgramStateRef BindExpr(const Stmt *S,
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits