https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531

--- Comment #20 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #18)
> I am trying to understand how useful this is.  I am basically worried
> about two things
>  1) we have other optimization passes that behave differently at -O2 and
>     -O3 (vectorizer, unrolling etc.) and I think we may want to have
>     more. We also have -Os and -O1.
> 
>     So perhaps we want a more systematic solution. We already have
>     -fvect-cost-model, which is kind of the vectorizer version of the
>     proposed inliner option.
Yeah, like you say, -fvect-cost-model solves this problem for the vectoriser.
There we have an intermediate setting (cheap) that isn't the default for any
-O<n>.  But the proposal isn't to do something like that for inlining.

From a quick grep, it looked like there are three places that distinguish
directly between -O2 and -O3:

(1) a couple of expansions in builtins.cc

(2) cunrolli, which uses it to decide whether unrolling can increase size

(3) number_of_iterations_exit_assumptions, where it decides whether outer
    evolutions should be taken into account.  (Was surprised by this one.)

If someone does want to control those things separately in future, I think
the defined way of doing that would be to add a -f flag.  So it feels like
this patch is moving in the same direction that we'd expect for the rest.

>  2) the inliner is already quite painful to tune. Especially since
>      one really needs to benchmark packages significantly bigger than
>      SPECs, which tends to be a bit hard to set up and benchmark
>      meaningfully. I usually do at least Firefox and clang, where the
>      first is always quite some work to get working well with the latest
>      GCC. With SUSE's LNT we also run "C++ benchmarks", which were
>      initially collected as a kind of inliner test with higher
>      abstraction penalty (tramp3d etc.).
> 
>      For many years I benchmarked primarily -O3 and -O3 + profile
>      feedback on x86-64 only, with an occasional look at -O2 and -Os
>      behaviour, which were generally more stable.
>      I also tested other targets (power and aarch64) but just
>      sporadically, which is not good.
> 
>      After GCC5 I doubled testing to include both lto/non-lto variants.
>      Since GCC10 -O2 started to evolve and needed re-testing too
>      (lto/nonlto). One metric I know I ought to tune is -O2 -flto and
>      FDO, which used to be essentially -O3 before the optimization level
>      --params were introduced, but now -O2 + FDO inlining is more
>      conservative, which hurts, for example, profiledbootstrapped GCC.
> 
>      So naturally I am a bit worried about introducing even more
>      combinations that need testing and maintenance.  If we add a
>      user-friendly way to tweak this, we also make a promise to keep it sane.
Yeah, I agree we don't want to increase the number of supported settings.
My understanding of the new option is that it should, by definition,
do exactly what -O3 does, now and in the future.  So there would be
no extra tuning burden.  Whatever is right for -O3 in future is also
right for the new option.
