On Sun, 2022-04-03 at 17:56 +0200, Tim Lange wrote:
> Hi everyone,
> Hi David,

Hi Tim

> 
> I'm interested in extending the static analysis pass as a GSoC
> project.

Excellent.

> Short introduction of me: I'm Tim, currently doing my master in 
> computer science with focus on IT security at TU Darmstadt. I already
> worked with IFDS as part of my bachelor thesis and took both program 
> analysis courses in my masters.

I suspect you know much more about the theory than I do :)

(my bachelor's degree only had an optional course on compilers, let
alone static analysis)

> 
> Specifically, I thought about extending the analyzer to check new 
> things i.e. the POSIX file-descriptor project. I would prefer a 
> medium-sized project, do you think this is doable?

I think so.

> 
> Also, I've read a bit through the internal documentation and got a 
> question. The documentation mostly mentions the Reps paper as the 
> source for the exploded supergraph. But the paper suggests more such
> as 
> having extensional summaries that lead to the same context
> sensitivity 
> as unbounded call-strings. In contrast, the documentation also talks 
> about being call-strings limited and searching for complex-enough 
> methods to be worth summarizing.
> Do you only use the exploded supergraph idea but not the rest? 
> Otherwise why do you use call-strings and search for methods worth 
> summarizing?
> 

I think I'm abusing the "exploded graph" terminology from the paper. 
As I understand the Reps et al paper, they're tracking quite fine-
grained dataflow facts at their nodes, whereas the state objects in
GCC's -fanalyzer are very coarse-grained, containing a blob of state:
the store, tracking memory, a set of constraints, and a set of state-
machine info.

So I'm merely using "exploded graph" to express that I'm combining
program points and program states into one graph-based representation,
and that I'm using graph reachability - but I'm not doing the IFDS
approach (or the "IDE" approach from the followup paper).

As I understand things, the analyzer is using symbolic execution to try
to explore a subset of the paths through the code, with some heuristics
to try to explore the most "interesting" paths, eventually giving up
when we hit complexity limits.  My hope is that by focusing on specific
states, that I can generate meaningful descriptions about what's going
on to the end-user.

The call summarization logic in the analyzer is very much just a
placeholder right now; there's plenty of scope for improving it (which
could itself be a GSoC project, but I think that one would be a large
one).

Hope this all makes sense, and that the analyzer sounds interesting;
let me know if you have other questions.

Dave


> Best regards
> Tim
> 


Reply via email to