Since nobody answered Richard, and I find this discussion useful for understanding what the future of GCC might be:

> Our high-level AST is language specific. In case of C++ its GENERIC plus
> some C++ specific tree codes. There is no framework for building a CFG
> on top of that (not sure if you need that), but the cgraph is built over that
> representation.

C/C++ GENERIC does not accurately represent the original source code, and I understand that this is on purpose (or at least, it is not a goal). This is one of the major criticisms of GCC that (supposedly) led to the development of Clang (see the first 20 minutes of
http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Clang-Defending-C-from-Murphy-s-Million-Monkeys
).

> Of course non-optimizing ASTs will limit static analysis to TU scope, even
> with clang? Or does clang support a "LTO" source AST?

It seems it does:

http://clang.llvm.org/doxygen/Index_8h.html
http://clang.llvm.org/doxygen/dir_e9b826b1b01168f6fc5ffb2b00be9311.html

And even if it didn't, it is a clearly expressed goal of Clang to support such uses. Quoting from
http://clang.llvm.org/features.html#diverseclients

"The problem with this goal is that different clients have very different requirements. Consider code generation, for example: a simple front-end that parses for code generation must analyze the code for validity and emit code in some intermediate form to pass off to a optimizer or backend. Because validity analysis and code generation can largely be done on the fly, there is not hard requirement that the front-end actually build up a full AST for all the expressions and statements in the code. TCC and GCC are examples of compilers that either build no real AST (in the former case) or build a stripped down and simplified AST (in the later case) because they focus primarily on codegen.

On the opposite side of the spectrum, some clients (like refactoring) want highly detailed information about the original source code and want a complete AST to describe it with. Refactoring wants to have information about macro expansions, the location of every paren expression '(((x)))' vs 'x', full position information, and much more. Further, refactoring wants to look across the whole program to ensure that it is making transformations that are safe. Making this efficient and getting this right requires a significant amount of engineering and algorithmic work that simply are unnecessary for a simple static compiler."

Cheers,

Manuel.