ddcc added a comment.

> - That said, I think there are still significant optimization opportunities. 
> It looks like when performing a query you create a new solver for each set of 
> constraints. My understanding (and perhaps @nlopes can correct me here) is 
> that these days Z3 is quite good at incremental solves so you may be leaving 
> a lot of performance on the table. For example, in `getSymVal()` you create 
> one solver to query for a satisfying assignment and then later create a 
> completely new one to determine whether it is the only satisfying assignment. 
> Since those two queries share all of their conjuncts but one it might be 
> faster to reuse the first solver and add the new assertion for the second. 
> Similarly, since analyzer exploration is a DFS, you could imagine building up 
> the path condition incrementally. Z3 has solver APIs for pushing and popping 
> assertions. (Again, @nlopes may have a better idea of whether this would pay 
> off in practice.)

I agree that the performance is the main problem, and that there are still a 
lot of performance optimizations available. I've separated the Z3 solver and 
model into separate classes, and reuse the solver now in `getSymVal()` and 
`checkNull()`. I looked at using the push/pop approach briefly when I started 
writing the solver, but since the static analyzer itself seems to have an 
internal worklist, I wasn't sure if the state traversal order is deterministic 
and predictable, so I didn't try to reuse a single solver across all program 
states. This is the new timing:

Z3ConstraintManager (built-in, previous changes and solver state reuse): 202.37 
sec

> - It would be good to measure the increased peak memory usage with the Z3 
> constraint solver. The analyzer is often used as part of an IDE and so it 
> needs to be able to run in memory constrained environments (such as laptops). 
> Further, since multiple invocations of clang are often run simultaneously, 
> memory is often a more precious resource than CPU time.

On the following testcases, the memory usage increase is around 18%. Using GNU 
time, measuring the max RSS in kbytes:
edges-new.mm (RangeConstraintManager): 45184
edges-new.mm (Z3ConstraintManager): 53568
malloc-plist.c (RangeConstraintManager): 40264
malloc-plist.c (Z3ConstraintManager): 47888

> - As @a.sidorin noted, there is a significant test and maintenance cost to 
> keeping two constraint managers around. Since the testing matrix would 
> double, anything we can do to limit the need to modify/duplicate tests would 
> be a huge savings. Is it possible to use lit substitution to call the 
> analyzer twice for each run line: once with the range constraint manager and 
> once with z3 (for builds where z3 is requested)? This, combined with 
> automatically adding the #defines that @NoQ suggested would provide a 
> mechanism to share most tests between the constraint managers.

I think this is reasonable, the only remaining question is how to identify when 
lit should call the analyzer twice, e.g. match on patch against 
`test/Analysis`? Also, note that the argument passed to clang (`-Xclang 
-analyzer-constraints=z3` vs. `-analyzer-constraints=z3`) need to change 
depending on whether the actual compiler or just the compiler driver is invoked.

> - Right now the CMake uses find_package and builds with Z3 if it is found to 
> be installed on the building system. I think this is too aggressive. It would 
> be better to have the build explicitly opt in to using Z3. Without this, a 
> user could accidentally build a clang that dynamically links to their local 
> Z3 but then fails to launch with a load error when distributed. Similarly, it 
> would be good to be able to explicitly set the the location of the Z3 to be 
> used at build time for systems with multiple Z3s installed.

Done.


https://reviews.llvm.org/D28952



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to