Re: GCC static analysis branch now available on Compiler Explorer

Jeff Law Tue, 10 Dec 2019 12:43:21 -0800

On Tue, 2019-12-10 at 12:51 -0700, Martin Sebor wrote:
> On 12/10/19 10:56 AM, David Malcolm wrote:
> > On Tue, 2019-12-10 at 09:36 -0700, Martin Sebor wrote:
> > > On 12/10/19 8:46 AM, David Malcolm wrote:
> > > > For the adventurous/curious, my static analyzer branch of GCC
> > > > [1]
> > > > is
> > > > now available on Compiler Explorer (aka godbolt.org) so you can
> > > > try
> > > > it
> > > > out without building it yourself.  [Thanks to Matt Godbolt,
> > > > Patrick
> > > > Quist and others at the Compiler Explorer project]
> > > > 
> > > > On https://godbolt.org/ within the C and C++ languages, select
> > > >     "x86-64 gcc (static analysis)"
> > > > as the compiler (though strictly speaking only C is in-scope
> > > > right
> > > > now).  It's configured to automatically inject -fanalyzer (just
> > > > on
> > > > this
> > > > site; it isn't the default in the branch).
> > > > 
> > > > Some precanned examples:
> > > >     * various malloc issues: https://godbolt.org/z/tnx65n
> > > >     * stdio issues: https://godbolt.org/z/4BP-Tj
> > > >     * fprintf in signal handler: https://godbolt.org/z/ew7mW6
> > > >     * tainted data affecting control flow:
> > > > https://godbolt.org/z/3v8vSj
> > > >     * password-leakage: https://godbolt.org/z/pRPYiv
> > > > (the non-malloc examples are much more in "proof-of-concept"
> > > > territory)
> > > > 
> > > > Would it make sense to add an "analyzer" component to our
> > > > bugzilla,
> > > > even though this is still on a branch? (with me as default
> > > > assignee)
> > > 
> > > Yes, that would make sense to me for bugs/enhancements specific
> > > to
> > > just the analyzer.
> > > 
> > > I think it's important to have a shared understanding how
> > > requests
> > > for new warnings (either completely new features or enhancements
> > > to
> > > existing ones) should be triaged in Bugzilla.  For instance,
> > > pr82520
> > > is a request to enhance -Wreturn-local-addr to detect other forms
> > > of
> > > escaping dangling pointers besides by returning them from
> > > functions.
> > > I think the request (and a number of others) would be a good fit
> > > for
> > > the analyzer as well, but assigning it to the analyzer component
> > > would suggest that it's suitable only for it.  How do we track
> > > requests that we'd like handled in both?
> > 
> > It feels to me that the question of BZ component is a proxy for a
> > deeper scope question about what we want to handle in each
> > warning.  If
> > it's clear that we want an issue handled by both the core code
> > *and* by
> > the analyzer, then we should have two PRs.  Perhaps discuss in the
> > PRs
> > which things we expect to be handled by which warning (e.g. some
> > cases
> > might be capable of being caught by -Wreturn-local-addr, others
> > might
> > require the deeper analysis of -fanalyzer).
> 
> You're quite right that the two questions are related, and it is
> the latter that prompted my comment above.
> 
> My concern is that I'm not sure we have a good rule of thumb to
> decide whether a given warning is better suited to just the analyzer
> or just the middle-end, or that it should be implemented in both.
> Or even a good basis for making such decisions.  We have not
> discussed
> yet if we should even worry about overlap (duplicate warnings issued
> by both the analyzer and the middle-end).  I asked a few questions to
> help get more clarity here in my comments on the initial patch but
> only
> heard from Jeff.
> 
> So here are some of my thoughts.  I'd be very interested in yours (or
> anyone else's who cares about this subject).
> 
> I can think of no warning that's currently implemented in the middle-
> end (or that could be implemented there) that would not also be
> appropriate for the analyzer.  IIUC, it has the potential for finding
> more problems but may be more noisy than a middle-end implementation,
> and take longer to analyze code.  Are there any kinds of problems
> that
> you think are not appropriate for it?  Any inherent limitations?
> 
> At the same time I also think there's room for and value in providing
> at least some support in the middle-end for pretty much every warning
> that the analyzer already exposes options for.  They may not find as
> many problems as the analyzer eventually will but they don't have as
> much overhead (either to implement, or in terms of compile time --
> the analysis often benefits optimizations as well).  In addition, by
> being tied to the optimizers the detection provides an opportunity
> to also prevent the bugs (by folding invalid code away, or inserting
> traps or branches around it, perhaps under the control of users as we
> have discussed).
> 
> In my mind the only reason to have to choose between one and other
> is the duplication of effort and resource constraints: we don't have
> enough contributors into either area.  There isn't a lot we can about
> the latter but we should make an effort to minimize the duplication
> (starting with bug triage).
> 
> But maybe I'm overanalyzing it and playing it by ear will work just
> fine.
I think one of the key components will be whether or not one needs
intra-procedural state tracking to do a reasonable job.  THe double-
free checking is a great example.  You can build one with what's in the
tree today, but the result isn't really useful except for toy examples.


To do a reasonable job you've got to explode the graphs and track state
intra-procedurally.  Thus it's a good fit for David's analyzer.

THere will almost certainly be cases where some additional bit of
intra-procedural data tracking will help the existing middle end
analysis/warnings.    But what's in there today works and works
reasonably well.  So yes, some additional data might make
-Wuninitialized better, but it does a pretty good job with the existing
analysis without having to track data across function calls.

jeff

Re: GCC static analysis branch now available on Compiler Explorer

Reply via email to