> On Fri, Jun 25, 2010 at 8:15 AM, Jonathan Adamczewski
> <jadam...@utas.edu.au> wrote:
> > On 25/06/10 06:39, Richard Guenther wrote:
> >> There are btw. some bugs wrt accounting of functions called once
> >> being inlined in 4.5 which were fixed on trunk which allow extra
> >> inlining.
> >>
> >
> > Are these changes likely to make it onto the 4.5 branch and into (say)
> > 4.5.1?
>
> Well, I'm always a bit nervous when backporting inline heuristic
> changes as that may trigger latent problems on code where they
> weren't seen before.
>
> We are talking about revs 158278 and 159931.  And at this point
> I'd leave it to Honza to consider their safety and do and test a
> backport.

The main change in the GCC 4.5 heuristic is that it is no longer driven by
somewhat fuzzy cost estimates that were a mixture of size, speed and some
legacy (such as a bug that completely ignored the existence of loads and
stores). It now uses a code size estimate and a speedup estimate to drive
inlining (basically a greedy algorithm trying to maximize speedup within the
code size growth constraints).

When you compile with -Os, inlining happens only when it reduces code size.
Thus we pretty much care about the code size metrics only. I suspect the
problem here might be that normal C++ code needs some inlining to make the
abstraction penalty go away. The GCC -Os implementation is generally tuned for
CSiBE and it is somewhat C centric (which makes sense for the embedded world).
As a result we might get quite noticeable slowdowns on C++ apps compiled with
-Os (and code size growth too, since the abstraction is never eliminated). This
can also be seen on tramp3d (the Pooma testcase), where -Os produces a lot
bigger and a lot slower code.

I would be very interested to know the most obvious cases where we miss
inlining and should not. It would be most helpful to have the
-fdump-tree-inline_param-details output for those, or a self-contained
testcase.

It might benefit both projects if we managed to set up regular Mozilla
benchmarking (similar to what we do for C++ benchmarks at
http://gcc.opensuse.org/c++bench-frescobaldi/ ). I have been thinking about
setting this up for a while, but was somewhat discouraged by the overall
complexity of Mozilla, and we also currently lack hardware for all the testing
we would like to do. Mozilla is a wonderful example of a complex real-world C++
app with a benchmark suite, which makes it a really good target for tuning IPA.

I would also be very interested to know how profile feedback works in this
case (and why it does not work in previous releases). I am maintaining both
areas of the compiler and would definitely be happy to do some work to help
make it useful for you.
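
To make the greedy scheme described above a bit more concrete, here is a toy
sketch in C++. It is not the actual GCC implementation: the Candidate struct,
the greedy_inline function, the growth budget and all the numbers are
hypothetical, and unlike the real inliner it does not re-estimate the
remaining candidates after each inlining decision. It only illustrates picking
call edges by estimated speedup per unit of code growth, and the -Os-like rule
of inlining only when the code shrinks.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical model of one call edge that could be inlined.
struct Candidate {
    std::string name;  // label for the call site (illustrative only)
    int size_growth;   // estimated code size change if inlined (negative = shrinks)
    int speedup;       // estimated time saved if inlined (arbitrary units)
};

// Returns true when `a` is a worse candidate than `b`, so that the
// priority queue pops the best candidate first.
struct WorseThan {
    bool operator()(const Candidate& a, const Candidate& b) const {
        const bool a_no_growth = a.size_growth <= 0;
        const bool b_no_growth = b.size_growth <= 0;
        // Edges that do not grow the code always beat edges that do.
        if (a_no_growth != b_no_growth) return b_no_growth;
        // Among non-growing edges, prefer the one that shrinks more.
        if (a_no_growth) return a.size_growth > b.size_growth;
        // Among growing edges, compare speedup / growth without dividing.
        return static_cast<long long>(a.speedup) * b.size_growth <
               static_cast<long long>(b.speedup) * a.size_growth;
    }
};

// Greedily accept candidates while the total growth stays within the budget.
// With optimize_for_size set (an -Os-like mode), only shrinking edges are taken.
std::vector<std::string> greedy_inline(const std::vector<Candidate>& edges,
                                       int growth_budget,
                                       bool optimize_for_size) {
    assert(growth_budget >= 0);
    std::priority_queue<Candidate, std::vector<Candidate>, WorseThan> queue(
        edges.begin(), edges.end());
    std::vector<std::string> inlined;
    int total_growth = 0;
    while (!queue.empty()) {
        Candidate c = queue.top();
        queue.pop();
        if (optimize_for_size && c.size_growth >= 0) continue;  // must shrink
        if (total_growth + c.size_growth > growth_budget) continue;  // over budget
        total_growth += c.size_growth;
        inlined.push_back(c.name);
    }
    return inlined;
}
```

In this model, a C++ accessor whose call overhead is larger than its body
shows up as a candidate with negative size_growth and is still inlined in the
-Os-like mode, while anything that needs temporary code growth before the
abstraction disappears is rejected.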

GCC 4.6 has several changes in the inlining heuristics that might be
considered for backporting if they are found to be _really_ important. The
most noticeable are probably:

1) It fixes miscounting of variadic functions (this had quite a large effect
   on GCC itself, since it prevents inlining parts of fatal_error).
2) It fixes accounting of static functions (previously the overall unit
   growth was decreased twice for every offline copy eliminated; that
   accidentally improved codegen for some C++ testcases but caused code size
   growth elsewhere).
3) The priority queue was fixed, so it now correctly accounts for cost
   changes after inlining (this caused the best improvements in C).
4) There were speedups in the inlining heuristics when dealing with functions
   having really many (say over 50000) callers.

2) and 3) need to go together or we get slowdowns on our current C++ suite.

I am however concerned that the problem might be a clash between -Os and the
fact that C++ code generally needs speculative, code-growing inlining to get
rid of abstraction. Whether we can easily get around this problem depends on
what your abstraction looks like. GCC can detect certain forms of constructs
that will go away after inlining, and I was also thinking about adding a small
code growth buffer for -Os inlining too, if it helps on average.

Honza
>
> Richard.
>
> > j.
> >