vlad.tsyrklevich added inline comments.
================ Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:656-659 + // If the SVal is a LazyCompoundVal it might only cover sub-region of a given + // symbol. For example, the LCV might represent a field in an uninitialized + // struct. In this case, the LCV encapsulates a conjured symbol and reference + // to the sub-region representing the struct field. ---------------- NoQ wrote: > vlad.tsyrklevich wrote: > > NoQ wrote: > > > I still feel bad about producing API with very tricky pre-conditions. The > > > LCV may have various forms - some may have empty store with no symbols at > > > all, and others may be full of direct bindings that make the symbol > > > essentially irrelevant. However, because the taint API is designed to be > > > defensive to cases when taint cannot be added, and it sounds like a good > > > thing, i guess we've taken the right approach here :) > > > > > > I suggest commenting this more thoroughly though, something like: > > > > > > > If the SVal represents a structure, try to mass-taint all values within > > > > the structure. For now it only works efficiently on lazy compound > > > > values that were conjured during a conservative evaluation of a > > > > function - either as return values of functions that return structures > > > > or arrays by value, or as values of structures or arrays passed into > > > > the function by reference, directly or through pointer aliasing. Such > > > > lazy compound values are characterized by having exactly one binding in > > > > their captured store within their parent region, which is a conjured > > > > symbol default-bound to the base region of the parent region. > > > > > > Then inside `if (Sym)`: > > > > > > > If the parent region is a base region, we add taint to the whole > > > > conjured symbol. > > > > > > > Otherwise, when the value represents a record-typed field within the > > > > conjured structure, so we add partial taint for that symbol to that > > > > field. > > The pre-conditions for using the API are actually a bit simpler than what's > > exposed here. I made it explicit to make the logic for tainting LCVs > > explicit, but the following simpler logic works: > > ``` > > if (auto LCV = V.getAs<nonloc::LazyCompoundVal>()) { > > if (Optional<SVal> binding = > > getStateManager().StoreMgr->getDefaultBinding(*LCV)) { > > if (SymbolRef Sym = binding->getAsSymbol()) { > > return addPartialTaint(Sym, LCV->getRegion(), Kind); > > } > > } > > } > > ``` > > > > This works because `addPartialTaint()` actually already performs the > > 'getRegion() == getRegion()->getBaseRegion()` check already and taints the > > parent symbol if the region is over the base region already. I chose to > > replicate it here to make the logic more explicit, but now that I've > > written this down the overhead of duplicating the logic seems unnecessary. > > Do you agree? > > The pre-conditions for using the API are actually a bit simpler than what's > > exposed here. > > I'm talking about the situation when we add the partial taint to the > default-bound symbol but it has no effect because there's a direct binding in > the lazy compound value on top of it. The user should ideally understand why > it doesn't work that way. > > > I chose to replicate it here to make the logic more explicit, but now that > > I've written this down the overhead of duplicating the logic seems > > unnecessary. Do you agree? > > The variant without the check looks easier to read, and it is kind of obvious > that full taint is a special case of partial taint, so i'm for removing the > check. > when we add the partial taint to the default-bound symbol but it has no > effect because there's a direct binding in the lazy compound value on top of > it Ah, so you're talking about the case where the LCV encompasses a value with both direct & default bindings, e.g. `int foo[1024]; foo[123] = rand(); taint(foo);`? In that case we will miss tainting the `rand` SymbolConjured. I suppose we could scan the region store for matching entries? In practice I think we're really only interested in tainting default bindings anyway for some unknown network/user input anyway. Anyways, your comment makes more sense now. I've added it. ================ Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:788 + (R == I->first || R->isSubRegionOf(I->first))) + return true; + } ---------------- NoQ wrote: > I actually have no idea why does this function accumulate things in the > `Tainted` variable, instead of returning :) It made a little more stylistic sense before this change, but not so much now. I've updated them to return immediately. https://reviews.llvm.org/D30909 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits