> It was.  Unfortunately, work on it stopped last year and it is unlikely
> that I will be assigned to this again.  I still have some personal
> interest on the feature, but given time restrictions, we should make
> contingency plans.
>
> Perhaps the easiest option is to remove the feature.  WHOPR does not
> represent a lot of code over the basic LTO framework, so this should
> be relatively easy and non-intrusive.

I would prefer completing this feature rather than removing it.  I plan
to work on cleaning up the basic infrastructure (i.e. making the LTRANS
stage work correctly and really read in what the optimizations at the
WPA stage decided) during this stage1.  (My thesis really converged this
week and I should submit in two weeks or so, so this seems a realistic
plan ;)

Here is a brief outline of what I would like to do:

The main problem with the current WHOPR implementation is that
WPA->LTRANS streaming is quite broken.  We don't do anything but
inlining at WPA, and we don't stream the results of IP propagation out
to LTRANS; instead LTRANS redoes everything locally.

So we need:

 1) Update the visibility handling in the callgraph so it works on
    partial units in LTRANS (i.e. we should know that a function body
    was available, that it is in another LTRANS unit, and what summary
    knowledge we have about it (such as IPA-PTA sets) that is useful
    for compiling the current unit).
 2) Fix WPA partitioning with respect to clones; it is completely
    broken.
 3) Make the callgraph streamer stream the callgraph together with the
    results of IPA optimization.  For example, we need to stream
    inline_failed reasons from WPA->LTRANS, but we don't need to stream
    them in LTO.  At the moment we always stream them.  On the other
    hand, we never stream info about clones, which we must also stream
    in WPA->LTRANS.
 4) Fix the pass manager.  Things should work as follows:

      Compilation:
        Early optimization, analysis of all IPA passes
      WPA
        Propagation of all IPA passes
      Ltrans
        Transformation of all IPA passes
        rest of compilation

    Currently we do something like this:

      Compilation:
        Early optimization, analysis of all IPA passes
      WPA
        Propagation for inliner alone
      Ltrans
        Throw away inlining decisions made
        Do analysis, propagation and transformation of all IPA passes
        rest of compilation

    Just as we have read/write methods for summaries, we need
    read/write methods for optimization summaries, and we need some PM
    cleanups to integrate this better with the compilation process.
 5) To get some benefits we need to get back the parallel compilation
    of ltrans etc. etc.

So as time allows, I would like to work on these items pretty much in
this order.  Of course I would be happy if someone beats me to it.

>
> If we do keep it, my expectation was to convert WHOPR into a
> profile-guided feature, much like what LIPO does.  Instead of trying
> to statically decide what TUs to compile together, we generate gimple
> bytecode for all TUs, then link normally (without combining them) and
> run the binary with profile generation enabled.
>
> During the second compilation, we use the profile generated to decide
> what TUs to link together.  This avoids the complexity we currently
> have with WHOPR, which needs to re-exec itself and make static
> decisions.

Well, I think this is independent.  It makes a lot of sense to make
profiling work so that instrumentation happens at link time with LTO
and we can read the data back.  This is relatively easy to do: we need
to rewrite the profiling pass to work on SSA (which is easy, desirable
anyway, and has been on my TODO list for a while).  Then we need to
split the gcov and -fprofile-generate profiling passes.  Gcov
instrumentation needs to happen early, since early optimization will
optimize out dead code and we want to count it.  Profile generation
needs to be done late, since we want it at link time.  Then we will be
able to do profiling at LTO.

With WHOPR I expect the situation to be a bit funnier, since one does
not have the function bodies and thus it is difficult to read in the
profile, get a callgraph profile and do something about it.  I guess
one solution is to add callgraph profiling to our instrumentation, or
to make it possible to read the CFG of each function at the WHOPR stage
and do the profile read.  I haven't had much time to think this over.

>
> Additionally, removing the current WHOPR code will not affect that plan.
>
> The first target I would shoot for, however, is to replace -combine with
> -flto.

:)  We need debug info and to hammer out all the bugs, of course!
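
Going back to item 4 in the list above, the intended split of one IPA
pass across the three stages can be modelled roughly as below.  This is
only a toy sketch in Python, not GCC code: the pass, its method names
and the JSON "streaming" are all made up for illustration, and the real
pass manager hooks may look quite different.

```python
import json


class ConstArgPass:
    """Hypothetical IPA pass: finds callees that always get one constant."""

    def analyze(self, unit):
        # Compilation stage: build a per-unit summary, no cross-unit knowledge.
        # `unit` maps callee name -> constants passed at this unit's call sites.
        return {callee: sorted(set(args)) for callee, args in unit.items()}

    def propagate(self, summaries):
        # WPA stage: merge all unit summaries and take the decisions once.
        merged = {}
        for summary in summaries:
            for callee, values in summary.items():
                merged.setdefault(callee, set()).update(values)
        # The "optimization summary": callee -> the single constant it receives.
        return {c: next(iter(v)) for c, v in merged.items() if len(v) == 1}

    def transform(self, functions, decisions):
        # LTRANS stage: only apply decisions read from WPA; no re-analysis.
        return {f: decisions.get(f) for f in functions}


ipa_pass = ConstArgPass()

# Compilation: each unit is analyzed on its own and writes its summary.
units = [{"f": [1, 1], "g": [2]}, {"f": [1], "g": [3]}]
summaries = [json.dumps(ipa_pass.analyze(u)) for u in units]

# WPA: read summaries, propagate, write the optimization summary.
decisions = ipa_pass.propagate([json.loads(s) for s in summaries])
streamed = json.dumps(decisions)

# LTRANS: read the optimization summary and transform, nothing else.
ltrans_result = ipa_pass.transform(["f", "g"], json.loads(streamed))
```

The point of the model is that LTRANS never calls analyze() or
propagate(): everything it needs arrives through the streamed
optimization summary, which is what the current implementation is
missing.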

I would also like to see the possibility of an LTO bootstrap without
gold, and the possibility of not generating assembly into LTO .o files.
In the typical use where one builds an app with LTO (such as bootstrap),
generating it just slows everything down by approximately a factor of
two, which is IMO silly.

Honza
>
>
> Diego.