On Fri, Mar 6, 2026 at 6:00 PM David Laight <[email protected]> wrote:
>
> On Fri, 6 Mar 2026 10:22:11 +0800
> Yafang Shao <[email protected]> wrote:
>
> > On Thu, Mar 5, 2026 at 9:20 PM Steven Rostedt <[email protected]> wrote:
> > >
> > > On Thu, 5 Mar 2026 13:40:27 +0800
> > > Yafang Shao <[email protected]> wrote:
> > >
> > > > Exactly. ftrace is intended for debugging and should not significantly
> > > > impact real workloads. Therefore, it's reasonable to make it sleep if
> > > > it cannot acquire the lock immediately, rather than spinning and
> > > > consuming CPU cycles.
> > >
> > > Actually, ftrace is more than just debugging. It is the infrastructure for
> > > live kernel patching as well.
> >
> > good to know.
> >
> > >
> > > >
> > > > >
> > > > >
> > > > > BTW, you should expand the commit log of patch 1 to include the
> > > > > rationale of why we should add this feature to mutex as the
> > > > > information in the cover letter won't get included in the git log
> > > > > if this patch series is merged. You should also elaborate in comment
> > > > > on under what conditions should this new mutex API be used.
> > > >
> > > > Sure. I will update it.
> > > >
> > > > BTW, these issues are notably hard to find. I suspect there are other
> > > > locks out there with the same problem.
> > >
> > > As I mentioned, I'm not against the change. I just want to make sure the
> > > rationale is strong enough to make the change.
> > >
> > > One thing that should be modified with your patch is the name. "nospin"
> > > references the implementation of the mutex. Instead it should be called
> > > something like: "noncritical" or "slowpath" stating that the grabbing of
> > > this mutex is not of a critical section.
> > >
> > > Maybe an entirely new interface should be defined:
> > >
> > >
> > > struct slow_mutex;
> >
> > Is it necessary to define a new structure for this slow mutex? We
> > could simply reuse the existing struct mutex instead. Alternatively,
> > should we add some new flags to this slow_mutex for debugging
> > purposes?
> >
> > >
> > >
> > > slow_mutex_lock()
> > > slow_mutex_unlock()
> >
> > These two APIs appear sufficient to handle this use case.
>
> Don't semaphores still exist?

While semaphores may present similar challenges, I'm not currently
aware of specific instances that share this exact issue. Should we
encounter any problematic semaphores in production workloads, we can
address them at that time.

> IIRC they always sleep.
>
> Although I wonder if the mutex needs to be held for as long as it is.
> ISTR one of the tracebacks was on the 'address to name' lookup,
> that code will be slow.
> Since the mutex can't be held across the multiple reads that are done
> to read the full list of tracepoints it must surely be possible to
> release it across the name lookup?

That's a great point, though I'm not entirely certain at the moment.
Perhaps Steven can provide further insight.

--
Regards
Yafang
