steakhal requested changes to this revision. steakhal added a comment. This revision now requires changes to proceed.
I didn't go through the whole revision this time, but I think the next steps are already clear for the next round. My impression was that I might not expressed my intent and expectations about the directions of the next step. I hope I managed this time. Let me know if you have questions. ================ Comment at: clang/include/clang/StaticAnalyzer/Checkers/Taint.h:82-103 +/// Returns the tainted Symbols for a given Statement and state. +std::vector<SymbolRef> getTaintedSymbols(ProgramStateRef State, const Stmt *S, + const LocationContext *LCtx, + TaintTagType Kind = TaintTagGeneric, + bool returnFirstOnly = false); + +/// Returns the tainted Symbols for a given SVal and state. ---------------- The overloads having the extra `ReturnFirstOnly` parameter shouldn't be visible here in the header. That is an implementation detail that no users should know about. Note that having a single default argument overload potentially doubles the variations the user might need to keep in mind when choosing the right one. So there is value in simplicity. ================ Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:248-254 + std::vector<SymbolRef> TaintedSyms = + getTaintedSymbols(errorState, TaintedSVal); + // Mark all tainted symbols interesting + // to track back the propagation of taintedness. + for (auto Sym : TaintedSyms) { + BR->markInteresting(Sym); + } ---------------- ================ Comment at: clang/lib/StaticAnalyzer/Checkers/DivZeroChecker.cpp:62 + CheckerContext &C, + std::vector<SymbolRef> TaintedSyms) const { + if (ExplodedNode *N = C.generateErrorNode(StateZero)) { ---------------- ================ Comment at: clang/lib/StaticAnalyzer/Checkers/DivZeroChecker.cpp:108-109 + if ((stateNotZero && stateZero)) { + std::vector<SymbolRef> taintedSyms = getTaintedSymbols(C.getState(), *DV); + if (!taintedSyms.empty()) { + reportTaintBug("Division by a tainted value, possibly zero", stateZero, C, ---------------- ================ Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:863-868 + std::vector<SymbolRef> TaintedSyms = + getTaintedSymbols(State, Call.getReturnValue()); + if (!TaintedSyms.empty()) { + TaintedSymbols.push_back(TaintedSyms[0]); + TaintedIndexes.push_back(ArgNum); + } ---------------- ================ Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:875-879 + std::vector<SymbolRef> TaintedSyms = getTaintedSymbols(State, *V); + if (!TaintedSyms.empty()) { + TaintedSymbols.push_back(TaintedSyms[0]); + TaintedIndexes.push_back(ArgNum); + } ---------------- In these cases, the code would acquire all the tainted subsymbols, which then we throw away and keep only the first one. This is why I suggested the approach I did I'm my last review. ================ Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:936-940 + IsMatching = + IsMatching || (PropSrcArgs.contains(I) && + isTaintedOrPointsToTainted(State, C.getSVal(E))); + std::optional<SVal> TaintedSVal = + getTaintedPointeeOrPointer(State, C.getSVal(E)); ---------------- Here `getTaintedPointeeOrPointer` would be called two times, unnecessarily. ================ Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:947-948 + if (!TaintedArgSyms.empty()) { + TaintedSymbols.insert(TaintedSymbols.begin(), TaintedArgSyms.begin(), + TaintedArgSyms.end()); + TaintedIndexes.push_back(I); ---------------- ================ Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:1009-1010 assert(E); - std::optional<SVal> TaintedSVal{getTaintedPointeeOrPointer(C, C.getSVal(E))}; + std::optional<SVal> TaintedSVal{ + getTaintedPointeeOrPointer(C.getState(), C.getSVal(E))}; ---------------- ================ Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:1019-1023 + std::vector<SymbolRef> TaintedSyms = + getTaintedSymbols(C.getState(), *TaintedSVal); + for (auto TaintedSym : TaintedSyms) { + report->markInteresting(TaintedSym); + } ---------------- ================ Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:149 const LocationContext *LCtx, TaintTagType Kind) { - SVal val = State->getSVal(S, LCtx); - return isTainted(State, val, Kind); + return !getTaintedSymbols(State, S, LCtx, Kind, true).empty(); } ---------------- We usually pass booleans by "name". ================ Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:294 + std::vector<SymbolRef> TaintedCasts = + getTaintedSymbols(State, SC->getOperand(), Kind); + TaintedSymbols.insert(TaintedSymbols.begin(), TaintedCasts.begin(), ---------------- If `returnFirstOnly` is `true`, this `getTaintedSymbols()` call would still eagerly (needlessly) collect all of the symbols. I'd recommend propagating the `returnFirstOnly` parameter to the recursive calls to avoid this problem. I also encourage you to make use of the `llvm::append_range()` whenever makes sense. ================ Comment at: clang/lib/StaticAnalyzer/Checkers/VLASizeChecker.cpp:230-232 + std::vector<SymbolRef> TaintedSyms = getTaintedSymbols(State, TaintedSVal); + for (auto Sym : TaintedSyms) + report->markInteresting(Sym); ---------------- Ah, how awesome it would be to have a `markInteresting(llvm::ArrayRef<SymbolRef>)` overload. ================ Comment at: clang/test/Analysis/taint-diagnostic-visitor.c:53-67 +//Tests if the originated note is correctly placed even if the path is +//propagating through variables and expressions +char* taintDiagnosticPropagation(){ + char *pathbuf; + char *pathlist=getenv("PATH"); // expected-note {{Taint originated here}} + // expected-note@-1 {{Taint propagated to the return value}} + if (pathlist){ // expected-note {{Assuming 'pathlist' is non-null}} ---------------- steakhal wrote: > I know this is subjective, but I'd suggest to reformat the tests to match > LLVM style guidelines, unless the formatting is important for the test. > Consistency helps the reader and reviewer, as code and tests are read many > more times than written. > > This applies to the rest of the touched tests. Originally I meant this to the rest of the test cases you change or add part of this patch. I hope it clarifies. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D144269/new/ https://reviews.llvm.org/D144269 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits