On Thu, Feb 25, 2021 at 1:33 PM Borislav Petkov <b...@alien8.de> wrote:
>
> On Thu, Feb 25, 2021 at 12:31:33PM -0800, Nick Desaulniers wrote:
> > (full thread:
> > https://lore.kernel.org/lkml/20210225112247.2240389-1-a...@kernel.org/)
> > I suspect in this specific case, "Interprocedural Sparse Conditional
> > Constant Propagation" sees the calls to the same fn with different
> > constants, propagates those down creating two specialized versions of
> > the callee (so they are distinct functions now), inlines those into
> > get_smp_config()/early_get_smp_config(), then there's too many callers
> > of those in a single TU where inlining would cause excessive code
> > bloat.
>
> Well, there's exactly one caller of get_smp_config - that's setup_arch().
> early_get_smp_config() gets called also exactly once in amd_numa_init().
>
> Now, with my simplistic approach, I can replace the lines at those call
> sites by hand with the
>
>         x86_init.mpparse.get_smp_config(<arg>);
>
> call. So those become exactly one function call. I still don't see how
> that can be done any differently, frankly.
>
> But apparently the cost model has decided that this is not inlineable.
> Maybe because that function ptr is assigned at boot time and that
> somehow gets the cost model to give it a very high (or low) value. Or
> maybe because the wrappers are calling through a variable - the x86_init
> thing - which is in a different section and that confuses the inliner.
> Or whatever - totally speculating here.
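
For anyone following along, here is a minimal userspace sketch of the
wrapper pattern being discussed. It is not the kernel source: every
demo_* name is hypothetical, the constant arguments 0 and 1 are an
assumption standing in for <arg>, and demo_always_inline is a local
spelling of the GCC/Clang always_inline attribute that the kernel
exposes as __always_inline.

```c
#include <assert.h>

/* Hypothetical stand-in for an ops table such as x86_init.mpparse. */
struct demo_mpparse_ops {
	void (*get_smp_config)(unsigned int early);
};

/* Records the argument seen by the most recent call (demo only). */
static unsigned int demo_last_early = 99;

static void demo_default_get_smp_config(unsigned int early)
{
	demo_last_early = early;
}

/*
 * The pointer lives in a writable variable and could be reassigned at
 * run time, so the wrappers below make an indirect call.
 */
static struct demo_mpparse_ops demo_ops = {
	.get_smp_config = demo_default_get_smp_config,
};

/* Force inlining of the wrapper regardless of the inliner's cost model. */
#define demo_always_inline inline __attribute__((__always_inline__))

/* Two thin wrappers: same callee, different constant argument (assumed 0/1). */
static demo_always_inline void demo_get_smp_config(void)
{
	demo_ops.get_smp_config(0);
}

static demo_always_inline void demo_early_get_smp_config(void)
{
	demo_ops.get_smp_config(1);
}

/* Self-check: each wrapper forwards its own constant to the callee. */
static int demo_self_check(void)
{
	demo_get_smp_config();
	assert(demo_last_early == 0);
	demo_early_get_smp_config();
	assert(demo_last_early == 1);
	return 0;
}
```

With the attribute, each wrapper's body is expanded at its call site, so
only the indirect call through demo_ops remains there. Without it, the
compiler is free to emit the wrapper out of line.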

The config that reproduces it wasn't shared here; I wouldn't be
surprised if this was found via randconfig that enabled some config
that led to excessive code bloat somewhere somehow.

>
> And this brings me to my point - you can't expect people to do all that
> crazy dance of compiler introspection and understand cost models and
> compiler optimization just to fix stuff like that.

Oh, I don't expect everyone to; just leaving breadcrumbs showing other
people on thread how to fish. ;)

>
> Now, imagine we "fix" this to clang-13's inliner's satisfaction. Now
> imagine too that gcc Version Next changes their inliner and that inliner
> says that that "fix" is wrong, for whatever reason, bottom up, top down,
> whatever. Do you feel the annoyance all around?

Yes, mutually unsatisfiable cases are painful, but I don't think
that's what's going on here.

>
> And since, as you say, there are no silver bullets here, I think for
> cases like that we'll need a "I know what I'm doing Mr. Compiler, TYVM,
> even if your cost model says otherwise" facility. And in this case I
> still think __always_inline is correct.

Sure, it doesn't really matter to me which way this is fixed. I
personally prefer placing functions in the correct sections and
letting the compiler be flexible, since if all of this is to satisfy
some randconfig then __always_inline is making a decision for all
configs, but perhaps it doesn't matter.

--
Thanks,
~Nick Desaulniers