We have created a new branch for the incremental parsing work that Lawrence and I described at the last GCC Summit (http://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=IncrementalCompiler.pdf).
To get the branch:

$ svn co svn+ssh://gcc.gnu.org/svn/gcc/branches/pph

The purpose of the branch is to explore ways of speeding up C++ compilation. The approach is to convert the compiler into a server with a memory cache that will allow it to short-circuit preprocessing and parsing for files in a translation unit that have been seen more than once. The server functionality is not in the branch yet. I have taken the code from Tom Tromey's incremental branch and will be committing it in the next few days.

The code currently implements a token cache on disk, enabled with -fpth (for Pre-Tokenized Headers). Each file in a translation unit gets its own .pth image. When a file is found unchanged with respect to its .pth image, its tokens are instantiated from the image instead of the text stream. This saves on average ~15% of compilation time on C++.

PTH images are factored, so a change in one file does not require rebuilding the complete PTH image for the whole TU. Additionally, each PTH file is segmented into token hunks, each of which can be validated and applied separately. This allows reusing the same PTH file in different translation units. The implementation is very primitive, so it breaks easily.

On the parser side, we have only added some instrumentation to cp/parser.c to determine how effective a parser caching scheme can be, given the parsing dependencies in C++ applications (that's the bulk of the results we presented at the Summit). We call this Pre-Parsed Headers (PPH). Most of the changes that we currently have in the parser are slated to disappear. We have included them in the initial branch in case anyone is interested in reproducing the results on their own code.

There is a companion Python script that builds the transitive closure of these dependencies to produce coverage results. If anyone is interested, I can produce a cleaned-up copy of that script (it contains many internal references to our codebase, so I need to purify it).
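For anyone unfamiliar with the computation, a transitive closure over a dependency graph can be sketched as below. This is a minimal illustration only, not the companion script: the function name `transitive_closure`, the dictionary layout, and the file names are all hypothetical, and the real script works on the dependency data emitted by the parser instrumentation rather than on a hand-written graph.

```python
from collections import deque


def transitive_closure(deps):
    """Map each node to the set of all nodes reachable from it.

    `deps` maps a node to the nodes it depends on directly.
    (Hypothetical layout, chosen for illustration.)
    """
    closure = {}
    for start in deps:
        seen = set()
        queue = deque(deps.get(start, ()))
        while queue:
            node = queue.popleft()
            if node in seen:
                continue  # already visited; also guards against cycles
            seen.add(node)
            queue.extend(deps.get(node, ()))
        closure[start] = seen
    return closure


# Hypothetical direct dependencies between files in one translation unit.
direct = {
    "main.cc": ["a.h", "b.h"],
    "a.h": ["common.h"],
    "b.h": ["common.h"],
    "common.h": [],
}

closure = transitive_closure(direct)
for name in sorted(closure):
    print(name, "->", sorted(closure[name]))
```

A breadth-first walk per node is quadratic in the worst case, which is usually acceptable for header-sized graphs; a larger codebase might prefer memoizing the closure of shared dependencies.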
I will post 3 patches, one for each of the major areas we changed: libcpp, common gcc files and the C++ parser. Although the code is still rough and lacks some comments and documentation, we would appreciate feedback on the patches.

As we discussed at the Summit, the plan is to experiment with an implementation of pre-parsed headers to see if the benefits we expect are realizable. We only plan to support the C++ front end, since that is where we are currently experiencing the biggest slowdowns. C parsing is barely on the radar for us. However, if anyone is interested in porting Tom's C parsing changes to the branch, we will welcome it.

Tom agreed to let us use the incremental compiler wiki page to host this work. We will be updating it soon.