On Thu, 22 Aug 2024 10:32:02 -0400 Steven Rostedt <rost...@goodmis.org> wrote:
> > Yeah, it seems there might be multiple bugs in the user workload
> > handling, the other NULL pointer dereference and refcount warning
> > above might be related (but I have yet to reproduce it on an upstream
> > kernel). I'm also going to look at the code and will post any findings
> > here.
> >

Yes, that is the second bug, and it is related to the one that this patch
addresses. There's nothing protecting the clearing of the kthreads and
calling put_task_struct().

Here's the fix to the second bug:

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 66a871553d4a..53de719f35cb 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -2106,7 +2106,9 @@ static int osnoise_cpu_init(unsigned int cpu)
  */
 static int osnoise_cpu_die(unsigned int cpu)
 {
+	mutex_lock(&interface_lock);
 	stop_kthread(cpu);
+	mutex_unlock(&interface_lock);
 	return 0;
 }
 
@@ -2239,8 +2241,11 @@ static ssize_t osnoise_options_write(struct file *filp, const char __user *ubuf,
 	 */
 	mutex_lock(&trace_types_lock);
 	running = osnoise_has_registered_instances();
-	if (running)
+	if (running) {
+		mutex_lock(&interface_lock);
 		stop_per_cpu_kthreads();
+		mutex_unlock(&interface_lock);
+	}
 
 	mutex_lock(&interface_lock);
 	/*
@@ -2355,8 +2360,11 @@ osnoise_cpus_write(struct file *filp, const char __user *ubuf, size_t count,
 	 */
 	mutex_lock(&trace_types_lock);
 	running = osnoise_has_registered_instances();
-	if (running)
+	if (running) {
+		mutex_lock(&interface_lock);
 		stop_per_cpu_kthreads();
+		mutex_unlock(&interface_lock);
+	}
 
 	mutex_lock(&interface_lock);
 	/*
@@ -2951,7 +2960,9 @@ static void osnoise_workload_stop(void)
 	 */
 	barrier();
 
+	mutex_lock(&interface_lock);
 	stop_per_cpu_kthreads();
+	mutex_unlock(&interface_lock);
 
 	osnoise_unhook_events();
 }

With both of these fixes, the bug goes away. I'll add this fix (after
enabling lockdep and making sure I didn't screw up the locking).

Can you resend this patch so that it just skips the cancel when kthread is
NULL? No need to exit early. I'd still like to make sure the cleanup
happens, rather than assume it has already been done.

-- Steve