Hi David!

On 2019-11-15T20:22:47-0500, David Malcolm <dmalc...@redhat.com> wrote:
> This patch kit

(I have not looked at the patches.)  ;-)

> introduces a static analysis pass for GCC that can diagnose
> various kinds of problems in C code at compile-time (e.g. double-free,
> use-after-free, etc).

Sounds great from the description!


Would it make sense to add to the wiki page
<https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer> a (high-level)
comparison to other static analyzers (Coverity, cppcheck,
clang-static-analyzer, others?), in terms of how they work, what their
respective benefits are, what their design goals are, etc.  (Of course
understanding that yours is much less mature at this point; talking about
high-level design rather than current implementation status.)

For example, why do we want that in GCC instead of an external tool -- in
part covered in your Rationale.  Can a compiler-side implementation
benefit from having more information available than an external tool?
GCC-side implementation is readily available (modulo GCC plugin
installation?) vs. external ones need to be installed/set up first.
GCC-side one only works with GCC-supported languages.  GCC-side one
analyzes actual code being compiled -- thinking about preprocessor-level
'#if' etc., which surely are problematic for external tools that are not
actually replicating a real build.  And so on.  (If you don't want to
spell out Coverity, cppcheck, clang-static-analyzer, etc., maybe just
compare yours to external tools.)

Just an idea, because I wondered about these things.


> The analyzer runs as an IPA pass on the gimple SSA representation.
> It associates state machines with data, with transitions at certain
> statements and edges.  It finds "interesting" interprocedural paths
> through the user's code, in which bogus state transitions happen.
>
> For example, given:
>
>    free (ptr);
>    free (ptr);
>
> at the first call, "ptr" transitions to the "freed" state, and
> at the second call the analyzer complains, since "ptr" is already in
> the "freed" state (unless "ptr" is NULL, in which case it stays in
> the NULL state for both calls).
>
> Specific state machines include:
> - a checker for malloc/free, for detecting double-free, resource leaks,
>   use-after-free, etc (sm-malloc.cc), and

I can immediately see how this can be useful for a bunch of
'malloc'/'free'-like etc. OpenACC Runtime Library calls as well as source
code directives.  ..., and this would've flagged existing code in the
libgomp OpenACC tests, which recently has given me some grief. Short
summary/examples:

In addition to host-side 'malloc'/'free', there is device-side (separate
memory space) 'acc_malloc'/'acc_free'.  Static checking example: don't
mix up host-side and device-side pointers.  (Both are normal C/C++
pointers.  Hmm, maybe such checking could easily be implemented even
outside of your checker by annotating the respective function
declarations with an attribute describing which in/out arguments are
host-side vs. device-side pointers.)

Then, there are functions to "map" host-side and device-side memory:
'acc_map_data'/'acc_unmap_data'.  Static checking example: you must not
'acc_free' memory spaces that are still mapped.

Then, there are functions like 'acc_create' (or equivalent directives
like '#pragma acc create') doing both 'acc_malloc', 'acc_map_data'
(plus/depending on internal reference counting).  Static checking
example: for a pointer returned by 'acc_create" etc., you must use
'acc_delete' etc. instead of 'acc_free', which first does
'acc_unmap_data' before interal 'acc_free' (..., and again all that
depending on reference counting).  (Might be "interesting" to teach your
checker about the reference counting -- if that is actually necessary;
needs further thought.)


> The checker is implemented as a GCC plugin.
>
> The patch kit adds support for "in-tree" plugins i.e. GCC plugins that
> would live in the GCC source tree and be shipped as part of the GCC tarball,
> with a new:
>   --enable-plugins=[LIST OF PLUGIN NAMES]
> configure option, analogous to --enable-languages (the Makefile/configure
> machinery for handling in-tree GCC plugins is adapted from how we support
> frontends).

I like that.  Implementing this as a plugin surely must help to either
document the GCC plugin interface as powerful/mature for such a task.  Or
make it so, if it isn't yet.  ;-)

> The default is for no such plugins to be enabled, so the default would
> be that the checker isn't built - you'd have to opt-in to building it,
> with --enable-plugins=analyzer

I'd favor a default of '--enable-plugins=default' which enables the
"usable" plugins.


> It's not clear to me whether I should focus on:
>
> (a) pruning the scope of the checker so that it works well on
> *intra*procedural C examples (and bail on anything more complex), perhaps
> targetting GCC 10 as an optional extra hidden behind
> --enable-plugins=analyzer, or
>
> (b) work on deeper interprocedural analysis (and fixing efficiency issues
> with this).

I personally would be happy to see (a) happen now (without "optional
extra hidden behind" configure-time flag but rather hidden behind GCC
command-line flag, '-fanalyze'?), and then (b) later on.  As
always, doing the incremental thing, (a) first, then (b) later, would
give it more exposure in the wild, which should help to identify design
issues etc. now instead of much later, for example.


> Thoughts?

One very high-level item: you're using the very generic name "analyzer"
('-Wanalyzer', 'gcc/analyzer/' filenames, for example, or the '-fanalyze'
I just proposed).  Might there be potential for confusion (now, or in the
future) what kind of analyzer that is, or is it safe to assume that in
context of a compiler, an analyzer is always a compile-time, static code
analyzer?  After all, the existing run-time ones are known as "checkers":
'-fstack-check', or "sanitizers": '--fsanitize, or "verifiers":
'-fvtable-verify'.


Grüße
 Thomas

Reply via email to