Re: [PATCH] Add capability to run several iterations of early optimizations

Richard Guenther Wed, 12 Oct 2011 04:58:35 -0700

On Wed, Oct 12, 2011 at 8:50 AM, Maxim Kuvyrkov <ma...@codesourcery.com> wrote:
> The following patch adds a new knob to make GCC perform several iterations of
> early optimizations and inlining.
>
> This is for dont-care-about-compile-time-optimize-all-you-can scenarios.  
> Performing several iterations of optimizations does significantly improve 
> code speed on a certain proprietary source base.  Some hand-tuning of the 
> parameter value is required to get optimum performance.  Another good use for 
> this option is for search and ad-hoc analysis of cases where GCC misses 
> optimization opportunities.
>
> With the default setting of '1', nothing is changed from the current status 
> quo.
>
> The patch was bootstrapped and regtested with 3 iterations set by default on 
> i686-linux-gnu.  The only failures in regression testsuite were due to latent 
> bugs in handling of EH information, which are being discussed in a different 
> thread.
>
> Performance impact on the standard benchmarks is not conclusive: there are 
> improvements in SPEC2000 of up to 4% and regressions down to -2%, see [*].  
> SPEC2006 benchmarks will take another day or two to complete and I will 
> update the spreadsheet then.  The benchmarks were run on a Core2 system for 
> all combinations of {-m32/-m64}{-O2/-O3}.
>
> Effect on compilation time is fairly predictable, about 10% compile time 
> increase with 3 iterations.
>
> OK for trunk?


I don't think this is a good idea, especially in the form you implemented it.

If we wanted to iterate early optimizations, we'd want to do it by
iterating an IPA pass so that we benefit from more precise size
estimates when trying to inline a function the second time.  Also,
statically scheduling the passes will mess up dump files, and you
have no chance of, say, noticing that nothing changed for function f
and its callees in iteration N and thus skipping them in
iteration N + 1.
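
A minimal sketch of that skipping idea, to make it concrete.  All names
here (struct func, optimize_function, iterate_early_opts) are hypothetical
stand-ins, not GCC's call graph or pass manager API, and "callees" is
simplified to direct callees only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CALLEES 4
#define MAX_FUNCS 64

/* Hypothetical stand-in for a call-graph node; not GCC's cgraph_node. */
struct func {
    int callees[MAX_CALLEES]; /* indices of direct callees in the array */
    size_t n_callees;
    int work_left;            /* pretend optimization opportunities left */
    bool changed;             /* did the previous iteration change it? */
};

/* Pretend to optimize F; returns true if anything changed. */
static bool optimize_function(struct func *f)
{
    if (f->work_left == 0)
        return false;
    f->work_left--;
    return true;
}

/* F needs another round only if it or a direct callee changed last time. */
static bool needs_rerun(const struct func *fns, size_t i)
{
    if (fns[i].changed)
        return true;
    for (size_t c = 0; c < fns[i].n_callees; c++)
        if (fns[fns[i].callees[c]].changed)
            return true;
    return false;
}

/* Iterate up to MAX_ITERS times; returns how many function visits ran. */
static int iterate_early_opts(struct func *fns, size_t n, int max_iters)
{
    assert(max_iters >= 1);
    assert(n <= MAX_FUNCS);

    int visits = 0;
    for (size_t i = 0; i < n; i++)
        fns[i].changed = true;      /* the first iteration visits everything */

    for (int iter = 0; iter < max_iters; iter++) {
        bool rerun[MAX_FUNCS];
        bool any = false;

        /* Decide from the previous iteration's flags before updating them. */
        for (size_t i = 0; i < n; i++) {
            rerun[i] = needs_rerun(fns, i);
            if (rerun[i])
                any = true;
        }
        if (!any)
            break;                  /* nothing changed anywhere: stop early */

        for (size_t i = 0; i < n; i++) {
            if (rerun[i]) {
                fns[i].changed = optimize_function(&fns[i]);
                visits++;
            } else {
                fns[i].changed = false;
            }
        }
    }
    return visits;
}
```

A statically scheduled pipeline has no place to keep the per-function
"changed" flag between iterations, so it would visit every function in
every iteration.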

So, at the least, you should split the pass_early_local_passes IPA pass
into three: you'd iterate over the 2nd (definitely not over
pass_split_functions, though), and the third would be pass_profile and
pass_split_functions only.  And you'd iterate from the place the 2nd
IPA pass is executed, not by scheduling them N times.
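
Roughly, the shape would be as below.  This is only a sketch under
assumptions: the group_* functions and struct unit are made up for
illustration and are not GCC's pass manager interface, and what goes into
the first group is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-unit state; none of these names exist in GCC. */
struct unit {
    int opportunities;  /* pretend work left for the iterated group */
    char trace[64];     /* order in which the groups ran: 'A', 'B', 'C' */
};

/* Append one character to the trace, if there is room. */
static void note(struct unit *u, char c)
{
    size_t len = strlen(u->trace);
    if (len + 1 < sizeof u->trace) {
        u->trace[len] = c;
        u->trace[len + 1] = '\0';
    }
}

/* 1st group: whatever has to run exactly once up front. */
static void group_a_once(struct unit *u)
{
    note(u, 'A');
}

/* 2nd group: the iterated local optimizations; true if anything changed. */
static bool group_b_local_opts(struct unit *u)
{
    note(u, 'B');
    if (u->opportunities > 0) {
        u->opportunities--;
        return true;
    }
    return false;
}

/* 3rd group: profile and split functions, run once and never iterated. */
static void group_c_profile_and_split(struct unit *u)
{
    note(u, 'C');
}

/* The loop lives where the 2nd group executes, not in the pass schedule. */
static void run_early_pipeline(struct unit *u, int max_iters)
{
    assert(max_iters >= 1);
    group_a_once(u);
    for (int i = 0; i < max_iters; i++)
        if (!group_b_local_opts(u))
            break;                  /* stop as soon as nothing changed */
    group_c_profile_and_split(u);
}
```

With the loop in the driver, the number of iterations is decided at run
time, and the first and third groups run exactly once whatever the
iteration count is.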

Then you'd have to analyze the compile-time impact of the IPA
splitting on its own when not iterating.  Then you should look
at which optimizations were actually performed and led to the
improvement (I can see some indirect inlining happening, but
everything else would be a bug in the present optimizers in the
early pipeline - they are all designed to be roughly independent
of each other and _not_ to expose new opportunities by
iteration).  Thus - testcases?

Thanks,
Richard.

> [*] 
> https://docs.google.com/spreadsheet/ccc?key=0AvK0Y-Pgj7bNdFBQMEJ6d3laeFdvdk9lQ1p0LUFkVFE&hl=en_US
>
> Thank you,
>
> --
> Maxim Kuvyrkov
> CodeSourcery / Mentor Graphics
>
>
>
