On Jun 4, 2008, at 8:27 AM, Kenneth Zadeck wrote:
> It is certainly not going to be possible to do this for all IPA
> passes; in particular, any pass that requires the function body to be
> reanalyzed as part of the analysis pass will not be done, or will be
> degraded so that it does not use this mechanism. But for a large
> number of passes this will work.
>
> How this scales to Google-sized applications remains to be seen.
> The point is that there is a rich space with a complex set of
> tradeoffs to be explored with LTO. The decision to farm off the
> function bodies to other processors because we "cannot" have all of
> the function bodies in memory will have a dramatic effect on what
> gcc/lto/whopr compilation will be able to achieve.
I agree with a lot of the sentiment that you express here, Kenny. In
LLVM, we've intentionally taken a very incremental approach:

1) start with all code in memory and see how far you can get. It
seems that on a reasonable developer machine (e.g. 2GB of memory) we
can handle C programs on the order of a million lines of code, or C++
code on the order of 400K lines of code, without a problem with LLVM.
2) start leaving function bodies on disk, lazily load them when they
are accessed, and use a cache manager to keep the hot ones in memory.
I think this will let us scale to code bases of tens or hundreds of
millions of lines. I see no reason to take a whopr approach just to
be able to handle large programs.
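To make the cache manager idea concrete, here is a rough sketch of
what I have in mind (purely illustrative; BodyCache, loadFromDisk and
friends are made-up names, not LLVM APIs): keep an LRU list of
deserialized function bodies, page them in from the on-disk IR file
on demand, and evict the coldest body when memory gets tight.

#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical in-memory form of one function's IR.
struct FunctionBody {
  std::string ir;  // a real system would hold deserialized IR here
};

// Keeps at most maxResident bodies in memory, evicting LRU ones.
class BodyCache {
public:
  explicit BodyCache(std::size_t maxResident) : maxResident_(maxResident) {}

  // Return the body for `name`, loading it from disk if it is not
  // already resident and evicting the coldest body if the cache is full.
  FunctionBody &get(const std::string &name) {
    auto it = table_.find(name);
    if (it != table_.end()) {
      lru_.splice(lru_.begin(), lru_, it->second);  // mark most recently used
      return it->second->second;
    }
    if (table_.size() >= maxResident_)
      evictColdest();
    lru_.emplace_front(name, loadFromDisk(name));
    table_[name] = lru_.begin();
    return lru_.front().second;
  }

private:
  void evictColdest() {
    table_.erase(lru_.back().first);
    lru_.pop_back();
  }

  // Placeholder: a real implementation would seek to this function's
  // offset in the IR file and deserialize only that body.
  static FunctionBody loadFromDisk(const std::string &name) {
    return FunctionBody{"<body of " + name + ">"};
  }

  std::size_t maxResident_;
  std::list<std::pair<std::string, FunctionBody>> lru_;
  std::unordered_map<
      std::string,
      std::list<std::pair<std::string, FunctionBody>>::iterator> table_;
};

int main() {
  BodyCache cache(/*maxResident=*/2);
  cache.get("foo");
  cache.get("bar");
  cache.get("baz");  // "foo" is now the coldest body, so it is evicted
  return 0;
}

In a real system loadFromDisk would seek to the function's offset in
the bitcode file and deserialize only that body; the point is that
the working set, not the whole program, determines memory use.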
Independent of program size is the efficiency of LTO. To me, making
lto scale and work well on 2- to 16-way shared memory machines is the
first interesting order of business, just because that is what many
developers have on their desks. Once that issue is nailed, going
across a cluster is an interesting next step.
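As a rough illustration of the shared-memory direction (again just a
sketch under assumptions, not how anything is actually wired up;
optimizeFunction and the work list are hypothetical): run the
cross-module analysis serially, then fan the per-function
optimization and codegen work out to a pool of threads, one per core.

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Stand-in for optimizing and code-generating one function after the
// whole-program analysis has already run.
static void optimizeFunction(const std::string &name) {
  std::printf("codegen %s\n", name.c_str());
}

int main() {
  // Hypothetical work list produced by the serial cross-module analysis.
  std::vector<std::string> workList = {"main", "foo", "bar", "baz"};

  std::mutex workMutex;
  std::size_t next = 0;

  // Each worker repeatedly claims the next function and processes it.
  auto worker = [&] {
    for (;;) {
      std::size_t i;
      {
        std::lock_guard<std::mutex> lock(workMutex);
        if (next == workList.size())
          return;
        i = next++;
      }
      optimizeFunction(workList[i]);
    }
  };

  // One worker per hardware thread, e.g. 2-16 on a typical desktop.
  unsigned n = std::max(1u, std::thread::hardware_concurrency());
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < n; ++t)
    pool.emplace_back(worker);
  for (auto &th : pool)
    th.join();
  return 0;
}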
In the world I deal with, most code is built out of a large number of
moderate-sized libraries/plugins, not as a gigantic monolithic a.out
file. I admit that this shifts our emphasis toward making the
integration transparent, supporting LTO across code bases with pieces
missing, etc., and away from supporting ridiculously huge code bases.
I guess one difference between the LLVM and GCC approaches stems from
the "constant factor" order-of-magnitude efficiency difference
between llvm and gcc. If you can't reasonably hold a few hundred
thousand lines of code in memory, then you need more advanced
techniques just to be generally usable for moderate-sized code bases.
-Chris