Hello Christophe,

On Fri, 8 Jul 2022 at 19:32, Christophe Leroy <christophe.le...@csgroup.eu> wrote:
>
> This series applies on top of the series v3 "objtool: Enable and
> implement --mcount option on powerpc" [1] rebased on powerpc-next branch
>
> A few modifications are done to core parts to enable powerpc
> implementation:
> - R_X86_64_PC32 is abstracted to R_REL32 so that it can then be
>   redefined as R_PPC_REL32.
> - A call to static_call_init() is added to start_kernel() to avoid
>   every architecture to have to call it
> - Trampoline address is provided to arch_static_call_transform() even
>   when setting a site to fallback on a call to the trampoline when the
>   target is too far.
>
> [1]
> https://lore.kernel.org/lkml/70b6d08d-aced-7f4e-b958-a3c7ae1a9...@csgroup.eu/T/#rb3a073c54aba563a135fba891e0c34c46e47beef
>
> Christophe Leroy (7):
>   powerpc: Add missing asm/asm.h for objtool
>   objtool/powerpc: Activate objtool on PPC32
>   objtool: Add architecture specific R_REL32 macro
>   objtool/powerpc: Add necessary support for inline static calls
>   init: Call static_call_init() from start_kernel()
>   static_call_inline: Provide trampoline address when updating sites
>   powerpc/static_call: Implement inline static calls
>

Could you quantify the performance gains of moving from out-of-line, patched tail-call branch instructions to full-fledged inline static calls? On x86, the retpoline problem makes this glaringly obvious, but on other architectures, the complexity of supporting this model may outweigh the performance advantages.
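
To make sure I am reading the trampoline fallback correctly, here is a rough userspace sketch of the decision I assume a patched call site has to make. It is not taken from the series: in_direct_range() and pick_destination() are hypothetical names, and the range constant assumes that a powerpc I-form branch (b/bl) carries a signed 26-bit, word-aligned displacement, i.e. about +/- 32 MB from the site.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: a powerpc "bl" encodes a signed 26-bit, word-aligned
 * displacement, so a site reaches targets in [-32 MB, +32 MB).
 */
#define BRANCH_RANGE (INT64_C(1) << 25)

/* Hypothetical helper: can `target` be reached from `site` with one bl? */
static bool in_direct_range(uint64_t site, uint64_t target)
{
	int64_t offset = (int64_t)(target - site);

	return offset >= -BRANCH_RANGE && offset < BRANCH_RANGE &&
	       (offset & 3) == 0;
}

/*
 * Hypothetical helper: address the inline call site should branch to.
 * A target that is too far away falls back to the trampoline.
 */
static uint64_t pick_destination(uint64_t site, uint64_t target,
				 uint64_t tramp)
{
	return in_direct_range(site, target) ? target : tramp;
}
```

If that matches the intent, then sites whose target is out of range still go through the trampoline, and the gain I am asking about is limited to the sites that can be patched with a direct branch.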