On Sat, 29 Oct 2011, Maxim Kuvyrkov wrote:
I like this variant a lot better than the last one - still it lacks any
analysis-based justification for iteration (see my reply to Matt on
what I discussed with Honza).
Yes, having a way to tell whether a function has significantly changed
would be awesome. My approach here would be to make inline_parameters
output feedback on how much the size/time metrics have changed for a
function since the previous run. If the change is above X%, then queue the
function's callers for more optimization. Similarly, Martin's
rebuild_cgraph_edges_and_devirt (when that goes into trunk) could queue
new direct callees and the current function for another iteration if new
direct edges were resolved.
Tuning the heuristic will need decent testing on a few projects to find
the "sweet spot" (smallest binary for the time/passes spent) for a given
codebase. With a few data points, a reasonable stab can be made at the
metrics you mention that would not terminate the iterations before the
known optimal number of passes. Without those data points, it seems
difficult to ensure the metrics allow those "sweet spots" to be attained.
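To make that concrete, here's a rough sketch (in C++, with entirely
hypothetical names; this is not the actual cgraph/inline-summary API) of
the change-driven re-queueing Maxim describes:

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical per-function summary, as an inline_parameters-style pass
// might produce it.
struct function_summary
{
  double size, time;            // estimates after the current run
  double prev_size, prev_time;  // estimates from the previous run
  std::vector<function_summary *> callers;  // direct callers in the call graph
  bool queued;                  // assume cleared when the summary is built
};

// Fractional change above which a function's callers get another look
// (the "X%" above; 10% is only a placeholder).
static const double change_threshold = 0.10;

static double
relative_change (double prev, double now)
{
  return prev == 0.0 ? 0.0 : std::fabs (now - prev) / prev;
}

// After re-optimizing FN, queue its callers if its metrics moved enough.
static void
maybe_requeue_callers (function_summary *fn,
                       std::queue<function_summary *> &worklist)
{
  double delta = std::max (relative_change (fn->prev_size, fn->size),
                           relative_change (fn->prev_time, fn->time));
  if (delta < change_threshold)
    return;  // FN barely changed; iterating on its callers is unlikely to pay off.
  for (std::size_t i = 0; i < fn->callers.size (); ++i)
    if (!fn->callers[i]->queued)
      {
        fn->callers[i]->queued = true;
        worklist.push (fn->callers[i]);
      }
}

The same worklist discipline would cover the devirtualization case: when
new direct edges are resolved, push the affected functions and iterate
until the worklist drains or the pass limit (the knob) is hit.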
Thus, I don't think we want to
merge this in its current form or in this stage1.
What is the benefit of pushing this to a later release? If anything,
merging the support for iterative optimizations now will allow us to
consider adding the wonderful smartness to it later. In the meantime,
substituting that smartness with a knob is still a great alternative.
I agree (of course). Having the knob will be very useful for testing and
determining the acceptance criteria for the later "smartness". While
terminating early would be a nice optimization, the feature is still
intrinsically useful and deployable without it. In addition, with LTO,
3+ passes were always productive on nearly all the projects/modules I
tested. To be fair, without LTO, going beyond 2-3 passes rarely produced
improvements unless individual compilation units were enormous.
There was also the question of whether some of the improvements seen with
multiple passes were indicative of deficiencies in early inlining, CFG,
SRA, etc. If the knob is available, I'm happy to continue testing on the
same projects I've filed recent LTO/graphite bugs against (glib, zlib,
openssl, scummvm, binutils, etc) and write a report on what I observe as
"suspicious" improvements that perhaps should be caught/made in a single
pass.
It's worth noting again that while this is a useful feature in and of
itself (especially when combined with LTO), it's *extremely* useful when
coupled with the de-virtualization improvements submitted in other
threads. The examples submitted for inclusion in the test suite aren't
academic -- they are reductions of real-world performance issues from a
mature (and shipping) C++-based networking product. Any C++ codebase that
employs physical separation in their designs via Factory patterns,
Interface Segregation, and/or Dependency Inversion will likely see
improvements. To me, these enhancements combine to form one of the biggest
leaps I've seen in C++ code optimization -- code that can be clean, OO,
*and* fast.
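As a purely illustrative example (not one of the submitted test-suite
reductions), the shape is roughly:

#include <memory>

// Callers only ever see the interface; the concrete type hides behind a factory.
struct Codec
{
  virtual ~Codec () {}
  virtual int encode (int x) const = 0;
};

struct FastCodec : Codec
{
  virtual int encode (int x) const { return x * 2 + 1; }
};

std::unique_ptr<Codec>
make_codec ()
{
  return std::unique_ptr<Codec> (new FastCodec ());
}

int
run (int x)
{
  // On a single pass, the call below stays an indirect (virtual) call.  Once
  // make_codec() has been inlined, the dynamic type is known to be FastCodec,
  // so a later pass can devirtualize and then inline encode(); discovering
  // that can itself take another iteration of inlining and analysis.
  std::unique_ptr<Codec> c = make_codec ();
  return c->encode (x);
}

Factory-heavy designs repeat this shape throughout the codebase, which is
why the iterated passes pay off so consistently on that kind of code.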
Richard: If there's any additional testing or information I can reasonably
provide to help get this in for this stage1, let me know.
Thanks!
--
tangled strands of DNA explain the way that I behave.
http://www.clock.org/~matt