vlad.tsyrklevich added inline comments.

================
Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:656-659
+  // If the SVal is a LazyCompoundVal it might only cover sub-region of a given
+  // symbol. For example, the LCV might represent a field in an uninitialized
+  // struct. In this case, the LCV encapsulates a conjured symbol and reference
+  // to the sub-region representing the struct field.
----------------
NoQ wrote:
> vlad.tsyrklevich wrote:
> > NoQ wrote:
> > > I still feel bad about producing API with very tricky pre-conditions. The 
> > > LCV may have various forms - some may have empty store with no symbols at 
> > > all, and others may be full of direct bindings that make the symbol 
> > > essentially irrelevant. However, because the taint API is designed to be 
> > > defensive to cases when taint cannot be added, and it sounds like a good 
> > > thing, i guess we've taken the right approach here :)
> > > 
> > > I suggest commenting this more thoroughly though, something like:
> > > 
> > > > If the SVal represents a structure, try to mass-taint all values within 
> > > > the structure. For now it only works efficiently on lazy compound 
> > > > values that were conjured during a conservative evaluation of a 
> > > > function - either as return values of functions that return structures 
> > > > or arrays by value, or as values of structures or arrays passed into 
> > > > the function by reference, directly or through pointer aliasing. Such 
> > > > lazy compound values are characterized by having exactly one binding in 
> > > > their captured store within their parent region, which is a conjured 
> > > > symbol default-bound to the base region of the parent region.
> > > 
> > > Then inside `if (Sym)`:
> > > 
> > > > If the parent region is a base region, we add taint to the whole 
> > > > conjured symbol.
> > > 
> > > > Otherwise, when the value represents a record-typed field within the 
> > > > conjured structure, so we add partial taint for that symbol to that 
> > > > field.
> > The pre-conditions for using the API are actually a bit simpler than what's 
> > exposed here. I made it explicit to make the logic for tainting LCVs 
> > explicit, but the following simpler logic works:
> > ```
> >   if (auto LCV = V.getAs<nonloc::LazyCompoundVal>()) {
> >     if (Optional<SVal> binding = 
> > getStateManager().StoreMgr->getDefaultBinding(*LCV)) {
> >       if (SymbolRef Sym = binding->getAsSymbol()) {
> >         return addPartialTaint(Sym, LCV->getRegion(), Kind);
> >       }
> >     }
> >   }
> > ```
> > 
> > This works because `addPartialTaint()` actually already performs the 
> > 'getRegion() == getRegion()->getBaseRegion()` check already and taints the 
> > parent symbol if the region is over the base region already. I chose to 
> > replicate it here to make the logic more explicit, but now that I've 
> > written this down the overhead of duplicating the logic seems unnecessary. 
> > Do you agree?
> > The pre-conditions for using the API are actually a bit simpler than what's 
> > exposed here.
> 
> I'm talking about the situation when we add the partial taint to the 
> default-bound symbol but it has no effect because there's a direct binding in 
> the lazy compound value on top of it. The user should ideally understand why 
> it doesn't work that way.
> 
> > I chose to replicate it here to make the logic more explicit, but now that 
> > I've written this down the overhead of duplicating the logic seems 
> > unnecessary. Do you agree?
> 
> The variant without the check looks easier to read, and it is kind of obvious 
> that full taint is a special case of partial taint, so i'm for removing the 
> check.
> when we add the partial taint to the default-bound symbol but it has no 
> effect because there's a direct binding in the lazy compound value on top of 
> it

Ah, so you're talking about the case where the LCV encompasses a value with 
both direct & default bindings, e.g. `int foo[1024]; foo[123] = rand(); 
taint(foo);`? In that case we will miss tainting the `rand` SymbolConjured. I 
suppose we could scan the region store for matching entries? In practice I 
think we're really only interested in tainting default bindings anyway for some 
unknown network/user input anyway.

Anyways, your comment makes more sense now. I've added it.


================
Comment at: lib/StaticAnalyzer/Core/ProgramState.cpp:788
+              (R == I->first || R->isSubRegionOf(I->first)))
+            return true;
+        }
----------------
NoQ wrote:
> I actually have no idea why does this function accumulate things in the 
> `Tainted` variable, instead of returning :)
It made a little more stylistic sense before this change, but not so much now. 
I've updated them to return immediately.


https://reviews.llvm.org/D30909



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to