On Wed, Apr 23, 2025 at 05:17:31PM +0200, Benjamin Berg wrote: > Hi, > > On Wed, 2025-01-08 at 10:04 +0100, Sebastian Andrzej Siewior wrote: > > __module_text_address() can be invoked within a RCU section, there is no > > requirement to have preemption disabled. > > > > Replace the preempt_disable() section around __module_text_address() > > with RCU. > > Unfortunately, this patch causes a performance regression for us. The > trouble is that we enable kmemleak and run trace-cmd so a lot of stack > traces need to be collected. Obviously, we also have lockdep enabled. > > Now, combine this with the UML stack dumping code calling into > __kernel_text_address a lot[1] and it really has a relevant performance > impact. I saw the kernel spending 40% of its own CPU time just on the > lock in is_module_text_address. > > Maybe kernel_text_address should leave the RCU handling to the caller > and assume that the RCU read lock is already taken? > > Benjamin > > [1] The UM arch dump_stack function reads every "unsigned long" on the > stack and tests it using __kernel_text_address.
Use of a single guard(rcu)() is regressing performance? Interesting and quite unexpected. That said, tiven the amount of debug you have enabled, I am not so sure that people are going to be all that excited about a further performance regression. But is this regression due to the cleanup hook that guard(rcu)() registers? If so, please feel free to try using rcu_read_lock() and rcu_read_unlock() instead. I would be surprised if this makes a difference, but then again, your initial regression report also comes as a surprise, so... Another way to reduce guard(rcu)() overhead is to build your kernel with CONFIG_PREEMPT_NONE=y. Not so good for real-time response, but then again, neither are your debug options. Thanx, Paul > > Acked-by: Peter Zijlstra (Intel) <pet...@infradead.org> > > Signed-off-by: Sebastian Andrzej Siewior <bige...@linutronix.de> > > --- > > kernel/module/main.c | 16 +++++----------- > > 1 file changed, 5 insertions(+), 11 deletions(-) > > > > diff --git a/kernel/module/main.c b/kernel/module/main.c > > index 80877741ac7e5..6a99076146cbc 100644 > > --- a/kernel/module/main.c > > +++ b/kernel/module/main.c > > @@ -823,13 +823,12 @@ void symbol_put_addr(void *addr) > > > > /* > > * Even though we hold a reference on the module; we still > > need to > > - * disable preemption in order to safely traverse the data > > structure. > > + * RCU read section in order to safely traverse the data > > structure. > > */ > > - preempt_disable(); > > + guard(rcu)(); > > modaddr = __module_text_address(a); > > BUG_ON(!modaddr); > > module_put(modaddr); > > - preempt_enable(); > > } > > EXPORT_SYMBOL_GPL(symbol_put_addr); > > > > @@ -3694,20 +3693,15 @@ struct module *__module_address(unsigned long > > addr) > > */ > > bool is_module_text_address(unsigned long addr) > > { > > - bool ret; > > - > > - preempt_disable(); > > - ret = __module_text_address(addr) != NULL; > > - preempt_enable(); > > - > > - return ret; > > + guard(rcu)(); > > + return __module_text_address(addr) != NULL; > > } > > > > /** > > * __module_text_address() - get the module whose code contains an > > address. > > * @addr: the address. > > * > > - * Must be called with preempt disabled or module mutex held so that > > + * Must be called within RCU read section or module mutex held so > > that > > * module doesn't get freed during this. > > */ > > struct module *__module_text_address(unsigned long addr) >