On Sun, 2022-04-03 at 17:56 +0200, Tim Lange wrote: > Hi everyone, > Hi David,
Hi Tim > > I'm interested in extending the static analysis pass as a GSoC > project. Excellent. > Short introduction of me: I'm Tim, currently doing my master in > computer science with focus on IT security at TU Darmstadt. I already > worked with IFDS as part of my bachelor thesis and took both program > analysis courses in my masters. I suspect you know much more about the theory than I do :) (my bachelor's degree only had an optional course on compilers, let alone static analysis) > > Specifically, I thought about extending the analyzer to check new > things i.e. the POSIX file-descriptor project. I would prefer a > medium-sized project, do you think this is doable? I think so. > > Also, I've read a bit through the internal documentation and got a > question. The documentation mostly mentions the Reps paper as the > source for the exploded supergraph. But the paper suggests more such > as > having extensional summaries that lead to the same context > sensitivity > as unbounded call-strings. In contrast, the documentation also talks > about being call-strings limited and searching for complex-enough > methods to be worth summarizing. > Do you only use the exploded supergraph idea but not the rest? > Otherwise why do you use call-strings and search for methods worth > summarizing? > I think I'm abusing the "exploded graph" terminology from the paper. As I understand the Reps et al paper, they're tracking quite fine- grained dataflow facts at their nodes, whereas the state objects in GCC's -fanalyzer are very coarse-grained, containing a blob of state: the store, tracking memory, a set of constraints, and a set of state- machine info. So I'm merely using "exploded graph" to express that I'm combining program points and program states into one graph-based representation, and that I'm using graph reachability - but I'm not doing the IFDS approach (or the "IDE" approach from the followup paper). As I understand things, the analyzer is using symbolic execution to try to explore a subset of the paths through the code, with some heuristics to try to explore the most "interesting" paths, eventually giving up when we hit complexity limits. My hope is that by focusing on specific states, that I can generate meaningful descriptions about what's going on to the end-user. The call summarization logic in the analyzer is very much just a placeholder right now; there's plenty of scope for improving it (which could itself be a GSoC project, but I think that one would be a large one). Hope this all makes sense, and that the analyzer sounds interesting; let me know if you have other questions. Dave > Best regards > Tim >