NoQ added a comment.
> In addition 2018 Bay Area LLVM Developers' Meetings may bring some new open
> projects :)
Actually, let's commit this before the conference, even if it's not perfect, so
that people who suddenly get inspired to work on Static Analyzer already had an
updated list :)
================
Comment at: www/analyzer/open_projects.html:29
+ <li>Implement a dataflow flamework.
+ <p><!-- TODO: Explain. -->
+ <i> (Difficulty: Hard) </i></p>
----------------
@george.karpenkov your turn here.
================
Comment at: www/analyzer/open_projects.html:33
+
+ <li>Handle aggregate construction.
+ <p>Aggregates are object that can be brace-initialized without calling a
----------------
Actually, maybe let's turn this into a single project with sub-bullets and list
all problems instead of just one.
Ok if i just dump my thoughts with markdown? >_<
Something like:
* Handle more ways of constructing C++ objects by identifying
`ConstructionContext`s in the `CFG` and using them to identify and track the
object until the permanent storage for the object is evaluated.
* Constructors of fields and [[since C++17] base classes] of aggregates. When
aggregates are brace-initialized, their own trivial constructor does not
happen, but the constructor for sub-objects does. It would be necessary to see
beyond the `InitListExpr` in order to find out which field of which aggregate
is actually constructed. Because brace initializers can be nested,
`ConstructionContext` for this constructor would potentially contain an
indefinite amount of intermediate `InitListExpr`s. Additionally, if the field
is of reference type, lifetime extension needs to be modeled.
* Constructors within `new[]`. Once an array of objects is allocated via
`operator new[]`, constructors for all elements of the array are called.
Arguably, we should evaluate at least a few of them, as if it was some sort of
loop. The same applies to destructors within `delete[]`. `ConstructionContext`
is already available here, so it is likely that there's no CFG work required,
though indicating the presence of a loopy control flow in the CFG might be
helpful.
* Constructors that can be elided due to Named Return Value Optimization
(NRVO). If a local variable within a function is of object type and is returned
by value on all return statements, then the compiler is allowed (but not
required to, even in C++17) to immediately store this variable at the address
into which the function would put the return value, thereby skipping
("eliding") copy/move constructor call (even if it has user-visible side
effects). Variables eligible for NRVO can be easily identified in the AST via
`VarDecl::isNRVOVariable()`, so no CFG work is necessary here, but Static
Analyzer needs to realize that the respecitive `VarRegion` should be entirely
replaced with the return region of the function for the whole duration of the
respective stack frame.
* Constructors of virtual method arguments. Constructors of function
arguments are already supported, but Static Analyzer has problems finding the
correct parameter variable declaration and the correct stack frame for the
object under construction if it doesn't know how exactly a virtual call is
devirtualized. It might help to simplify the identity of the parameter region
to exclude the `Decl` of the callee and its parameters from its identity.
* Constructors of default arguments. In C++ functions can have default values
for parameters that are re-computed at run-time every time the the function is
called without specifying the argument explicitly. There isn't much difference
between default arguments and normal arguments when it comes to
`ConstructionContext`-related problems. But the actual problem here is that the
expression that initializes the parameter is not part of the current stack
frame, but instead lives in the middle of nowhere within the parameter
declaration. This means that if two calls of the same function with defaulted
arguments are present within the same full-expression, we may accidentally
assign two different values (corresponding to two different default argument
objects) to the same expression in the Environment, which is incorrect. One of
the possible solutions is to define an additional `LocationContext` to
discriminate between different default argument evaluations, similarly to how
`StackFrameContext` allows discriminating between values of the same expression
on different layers of recursion.
* Constructors that perform lambda captures. If a lambda captures a variable
of object type by value, the object needs to be copied into the lambda, which
implies calling a copy-constructor. A new kind of `ConstructionContext` would
need to be defined in order to identify the memory region occupied by the
lambda object (most likely a `MaterializeTemporaryExpr` that surrounds the
`LambdaExpr`) and the sub-region that corresponds to the implicit field of the
lambda object that contains the capture.
(Difficulty: Medium)
================
Comment at: www/analyzer/open_projects.html:69
+ analyzer that calling <code>.length()</code> on an empty <code>std::string
+ </code> will yield 0, and vice versa, but supporting all of them is a huge
+ amount of work. One good thing to start with here would be to notice that
----------------
"supporting all of them" -> "modeling all methods in all classes of the
standard library"
================
Comment at: www/analyzer/open_projects.html:70
+ </code> will yield 0, and vice versa, but supporting all of them is a huge
+ amount of work. One good thing to start with here would be to notice that
+ inlining methods of C++ "containers" is currently outright forbidden in
----------------
Maybe move "One good thing to start..." into a sub-bullet?
================
Comment at: www/analyzer/open_projects.html:86-87
+ the opposite direction - integers to pointers - are completely unsupported.
+ Pointer-to-pointer casts are a mess; modeling them with <code><a
href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1ElementRegion.html">
+ ElementRegion </a></code> is a impractical, and we are suffering a lot from
+ this hack, but coming up with a significantly better solution is very hard,
----------------
"Pointer-to-pointer casts are also causing a surprising amount of problems. The
decision to re-use `ElementRegion` to track the type of the object behind the
pointer value seems to create more problems than it solves, but..."
================
Comment at: www/analyzer/open_projects.html:109-116
<li>Explicitly model standard library functions with <tt>BodyFarm</tt>.
<p><tt><a
href="http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html">BodyFarm</a></tt>
allows the analyzer to explicitly model functions whose definitions are
not available during analysis. Modeling more of the widely used functions
(such as the members of <tt>std::string</tt>) will improve precision of the
analysis.
+ <i>(Difficulty: Easy)</i><p>
----------------
Let's move this into a sub-bullet of "Enhance the modeling of the standard
library." and remove the part about `std::string`, but instead say "This is
often effective for simple C functions such as atomic builtins and certain
related threading functions (such as `call_once`) because atomicity doesn't
have any meaning for Static Analyzer and can be completely ignored. This
approach is not recommended for functions dealing with complicated C++ code -
temporaries, templates - unless you have unbelievably good AST handwriting and
recite all supported versions of the C++ standard by heart."
================
Comment at: www/analyzer/open_projects.html:144
<li>Enhance CFG to model C++ <code>delete</code> more precisely.
<p>Similarly, the representation of <code>delete</code> does not include
----------------
Let's move this into the construction context bullet. I think we do call the
destructor now but it might be that we don't call the custom delete operator.
================
Comment at: www/analyzer/open_projects.html:177
+
+ <li>Make the values in <code><a
href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1Environment.html">
+ Environment</a></code> immutable
----------------
Let's mark this with "Refactoring: ".
================
Comment at: www/analyzer/open_projects.html:308
<li>Implement a BitwiseMaskingChecker to handle <a
href="http://llvm.org/bugs/show_bug.cgi?id=16615">PR16615</a>.
+ <p>Symbolic expressions of the form <code>$sym & CONSTANT</code> can
----------------
This should be removed because we already have a bullet for this.
https://reviews.llvm.org/D53024
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits