[PATCH] D122838: [clang][dataflow] Add support for correlation of boolean (tracked) values

Gábor Horváth via Phabricator via cfe-commits Fri, 01 Apr 2022 08:00:13 -0700

xazax.hun accepted this revision.
xazax.hun added inline comments.


================
Comment at: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:79
+    for (BoolValue *Constraint : Env1.getFlowConditionConstraints()) {
+      Expr1 = &Env1.makeAnd(*Expr1, *Constraint);
+    }
----------------
ymandel wrote:
> sgatev wrote:
> > xazax.hun wrote:
> > > Hmm, interesting.
> > > I think we view every boolean formula at a certain program point 
> > > implicitly as `FlowConditionAtThatPoint && Formula`. And the flow 
> > > condition at a program point should already be a disjunction of its 
> > > predecessors.
> > > 
> > > So it would be interpreted as: `(FlowConditionPred1 || 
> > > FlowConditionPred2) && (FormulaAtPred1 || FormulaAtPred2)`.
> > > While this is great, this is not the strongest condition we could derive. 
> > > `(FlowConditionPred1 && FormulaAtPred1)  || (FormulaAtPred2 && 
> > > FlowConditionPred2)` created by this code snippet is stronger which is 
> > > great.
> > > 
> > > My main concern is whether we would end up seeing an exponential 
> > > explosion in the size of these formulas in the number of branches 
> > > following each other in a sequence.
> > > 
> > Yeap, I agree this is suboptimal and I believe I'm the one to blame for 
> > introducing it downstream.
> > 
> > I wonder if we can represent the flow condition of each environment using a 
> > bool atom and have a mapping of bi-conditionals between flow condition 
> > atoms and flow condition constraints. Something like:
> > 
> > ```
> > FC1 <=> C1 ^ C2
> > FC2 <=> C2 ^ C3 ^ C4
> > FC3 <=> (FC1 v FC2) ^ C5
> > ... 
> > ```
> > 
> > We can use that to simplify the formulas here and in `joinConstraints`. The 
> > mapping can be stored in `DataflowAnalysisContext`. We can track 
> > dependencies between flow conditions (e.g. above `FC3` depends on `FC1` and 
> > `FC2`) and modify `flowConditionImplies` to construct a formula that 
> > includes the bi-conditionals for all flow condition atoms in the transitive 
> > set before invoking the solver.
> > 
> > I suggest putting the optimization in its own patch. I'd love to look into 
> > it right after this patch is submitted if both of you think it makes sense 
> > on a high level.
> This sounds good to me. That said, I'm not sure how often we'd expect this to 
> be an issue in practice, since, IIUC, this specialized merge only occurs when 
> the value is handled differently in the two branches. So, a series of 
> branches alone won't trigger the exponential behavior.
Fair point. It might never cause a problem in practice. Since we already have 
one proposal how to try to fix this if a problem ever surfaces, I'd love to 
have a comment about this in the code. But I think this is something that 
probably does not need to be addressed in the near future. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122838/new/

https://reviews.llvm.org/D122838

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D122838: [clang][dataflow] Add support for correlation of boolean (tracked) values

Reply via email to