On Fri, 2021-01-22 at 20:46 +0530, Adharsh Kamath wrote:
> Hi David. Thank you for the reply.
> On Tue, Jan 19, 2021 at 2:12 AM David Malcolm <dmalc...@redhat.com>
> wrote:
> > On Thu, 2021-01-14 at 10:45 +0530, Adharsh Kamath wrote:
> > > Hello,
> > > I came across the list of possible project ideas for GSoC 2021 and
> > > I'd like to contribute to the project regarding the static analysis
> > > pass in GCC.
> > > How can I get started with this project?
> > 
> > Hi Adharsh
> > 
> > Sorry about the delay in responding to your email.
> > 
> > Thanks for your interest in the static analysis pass.
> > 
> > Some ideas on getting started with GCC are here:
> >   https://gcc.gnu.org/wiki/SummerOfCode#Before_you_apply
> > 
> > The analyzer has its own wiki page here:
> >   https://gcc.gnu.org/wiki/DavidMalcolm/StaticAnalyzer
> 
> I examined the analyzer dumps for a few programs. I also read the
> documentation on the internals of the static analyzer and I've
> understood the basics of how the analyzer works.

Excellent.  Building GCC from source and stepping through it in the
debugger would be good next steps.  You'll need plenty of disk space.
"run_checkers" is a good breakpoint to set if you're looking for the
entrypoint to the analyzer.

> > I've actually already implemented some of the ideas that were on
> > the GSoC wiki page myself since last summer, so I've updated that
> > page accordingly:
> >   https://gcc.gnu.org/wiki/SummerOfCode?action=diff&rev2=187&rev1=184
> > I've added the idea of SARIF (https://sarifweb.azurewebsites.net/)
> > as an output format for the static analyzer (and indeed, for the GCC
> > diagnostics subsystem as a whole).
> > 
> > Do any of the ideas on the page look appealing to you?  I'm open to
> > other ideas you may have relating to the analyzer, or indeed to gcc
> > diagnostics.
> 
> Yes. Making a plugin for the Linux kernel seems very interesting to
> me. I'd also like to extend support for C++ but I'm not sure if both
> ideas would be possible, given the time constraints.

I think that picking just one would be better than trying to do both.

> How do I start with the plugin for the Linux kernel?

I added plugin support to the analyzer in:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=66dde7bc64b75d4a338266333c9c490b12d49825

There's an example plugin in that patch.  The kernel source tree
already has some plugins, so hopefully the two together give some
pointers on how to write a "hello world" analyzer plugin that runs as
part of the kernel build.  Getting that working would be another good
next step.

Unfortunately I'm not a Linux kernel developer, so I don't have deep
knowledge of what checks would be useful and the subtle details that
are likely to be necessary.  I'll try to reach out internally within
Red Hat - we have plenty of kernel developers here.

Some ideas:
* detecting code paths that acquire a lock but then fail to release it
* detecting code paths that disable interrupts and then fail to
  re-enable them
* detecting mixups between user-space pointers and kernel-space
  pointers
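
To make the first idea concrete, here's a rough user-space sketch of
the kind of error-handling path such a check might flag.  The names
(demo_lock, demo_lock_acquire, demo_lock_release, update_buggy,
update_fixed) are made up for illustration; they are not kernel APIs,
and real kernel locking primitives behave differently.

```c
/* Hypothetical stand-in for a kernel lock, so this builds in user space. */
struct demo_lock { int held; };

static void demo_lock_acquire(struct demo_lock *l) { l->held = 1; }
static void demo_lock_release(struct demo_lock *l) { l->held = 0; }

/* Buggy: the early return on the error path leaves the lock held. */
static int update_buggy(struct demo_lock *l, int *dst, int val)
{
    demo_lock_acquire(l);
    if (val < 0)
        return -1;              /* lock is still held on this path */
    *dst = val;
    demo_lock_release(l);
    return 0;
}

/* Fixed: every exit from the function goes through the release. */
static int update_fixed(struct demo_lock *l, int *dst, int val)
{
    int ret = 0;

    demo_lock_acquire(l);
    if (val < 0) {
        ret = -1;
        goto out;
    }
    *dst = val;
out:
    demo_lock_release(l);
    return ret;
}
```

A warning here would ideally point at the "return -1" in update_buggy
and show the path from the acquire to that return.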

The kernel has its own checker called "smatch" which may give other
ideas for warnings.

The state machine checker in the analyzer takes its inspiration from
the Stanford "MC" checker (among other places, such as typestate).
The MC checker has been used to implement warnings for the Linux
kernel, albeit for some very old versions of the kernel.

See:
  * "How to write system-specific, static checkers in Metal" (Benjamin
    Chelf, Dawson R Engler, Seth Hallem), from 2002
  * "Checking system rules using system-specific, programmer-written
    compiler extensions" (D. Engler, B. Chelf, A. Chou, and S. Hallem),
    Proceedings of Operating Systems Design and Implementation (OSDI),
    September 2000
  * "Using Programmer-Written Compiler Extensions to Catch Security
    Holes" (Ken Ashcraft, Dawson Engler), from 2002

These papers target in-kernel APIs that are about 20 years old and
might be obsolete now, but they have examples of interrupt checking,
and of user-space vs kernel-space pointer checking.
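
As a toy illustration of the state-machine style of check for the
interrupt case: the sketch below walks one linear sequence of events
and reports whether the path ends with interrupts still disabled.  The
enum names and check_path are invented for this example.  This is not
how the analyzer or the MC checker is implemented, and a real checker
would explore many paths through a function rather than just one.

```c
#include <stddef.h>

/* Hypothetical abstract state and events for a single code path. */
enum irq_state { IRQ_ENABLED, IRQ_DISABLED };
enum irq_event { EV_DISABLE, EV_ENABLE };
enum irq_verdict { PATH_OK, PATH_LEFT_DISABLED };

/* Walk the events along one path, tracking the abstract state, and
   report whether the path exits with interrupts still disabled. */
static enum irq_verdict check_path(const enum irq_event *events, size_t n)
{
    enum irq_state state = IRQ_ENABLED;

    for (size_t i = 0; i < n; i++) {
        switch (events[i]) {
        case EV_DISABLE:
            state = IRQ_DISABLED;
            break;
        case EV_ENABLE:
            state = IRQ_ENABLED;
            break;
        }
    }
    return state == IRQ_DISABLED ? PATH_LEFT_DISABLED : PATH_OK;
}
```

A path consisting only of EV_DISABLE would be reported as
PATH_LEFT_DISABLED, which is the sort of thing an early "return" in an
error-handling path can cause.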

Focusing on error-handling paths in driver code might be best.

Does this answer your questions?

Hope this sounds interesting as a project.
Dave

