https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #20 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #18)
> I am trying to understand how useful this is. I am basically worried
> about two things
> 1) we have other optimization passes that behave differently at -O2 and
>    -O3 (vectorizer, unrolling etc.) and I think we may want to have
>    more. We also have -Os and -O1.
>
>    So perhaps we want kind of more systematic solution. We already have
>    -fvect-cost-model that is kind of vectorizer version of the proposed
>    inliner option.

Yeah, like you say, -fvect-cost-model solves this problem for the
vectoriser. There we have an intermediate setting (cheap) that isn't
the default for any -O<n>. But the proposal isn't to do something like
that for inlining.

From a quick grep, it looked like there are three places that
distinguish directly between -O2 and -O3:

(1) a couple of expansions in builtins.cc
(2) cunrolli, which uses it to decide whether unrolling can increase size
(3) number_of_iterations_exit_assumptions, where it decides whether
    outer evolutions should be taken into account. (Was surprised by
    this one.)

If someone does want to control those things separately in future,
I think the defined way of doing that would be to add a -f flag.
So it feels like this patch is moving in the same direction that
we'd expect for the rest.

> 2) inliner is already quite painful to tune. Especially since
>    one really needs to benchmark packages significantly bigger than
>    SPECs which tends to be bit hard to set up and benchmark
>    meaningfully. I usually do at least Firefox and clang where the
>    first is always quite some work to get working well with latest
>    GCC. On SUSE's LNT we also run "C++ benchmarks" which were
>    initially collected as kind of inliner tests with higher
>    abstraction penalty (tramp3d etc.).
>
>    For many years I benchmarked primarily -O3 and -O3 + profile
>    feedback on x86-64 only with occasional look at -O2 and -Os
>    behaviour which were generally more stable.
>    I also tested other targets (power and aarch64) but just
>    sporadically, which is not good.
>
>    After GCC5 I doubled testing to include both lto/non-lto variant.
>    Since GCC10 -O2 started to evolve and needed re-testing too
>    (lto/nonlto). One metric I know I ought to tune is -O2 -flto and
>    FDO which used to be essentially -O3 before the optimization level
>    --params were introduced, but now -O2 + FDO inlining is more
>    conservative which hurts, for example, profiledbootstrapped GCC.
>
>    So naturally I am bit worried to introduce even more combinations
>    that need testing and maintenance. If we add user friendly way to
>    tweak this, we also make a promise to keep it sane.

Yeah, I agree we don't want to increase the number of supported
settings. My understanding of the new option is that it should,
by definition, do exactly what -O3 does, now and in the future.
So there would be no extra tuning burden. Whatever is right for
-O3 in future is also right for the new option.