On Thu, 2026-05-21 at 18:48 -0500, Crystal Wood wrote:
> On Wed, 2026-05-20 at 16:28 -0400, Steven Rostedt wrote:
> > [ Replying to Sashiko:
> > https://sashiko.dev/?list=org.kernel.vger.linux-trace-kernel#/patchset/20260511223035.1475676-1-crwood%40redhat.com
> > ]
> >
> > > > diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> > > > index 75678053b21c5..2be188768ab42 100644
> > > > --- a/kernel/trace/trace_osnoise.c
> > > > +++ b/kernel/trace/trace_osnoise.c
> > > > @@ -83,6 +83,22 @@ struct osnoise_instance {
> > > >
> > > > static struct list_head osnoise_instances;
> > > >
> > > > +static void osnoise_print(const char *fmt, ...)
> > > > +{
> > > > + struct osnoise_instance *inst;
> > > > + struct trace_array *tr;
> > > > + va_list ap;
> > > > +
> > > > + rcu_read_lock();
> > > > + list_for_each_entry_rcu(inst, &osnoise_instances, list) {
> > > > + tr = inst->tr;
> > > > + va_start(ap, fmt);
> > > > + trace_array_vprintk(tr, _RET_IP_, fmt, ap);
> >
> > > Does this code create a use-after-free on the trace array if an instance
> > > is
> > > removed concurrently?
>
> If so, it was already an issue with osnoise_taint(),
> osnoise_stop_tracing(), osnoise_stop_exception(), etc. Wouldn't be
> surprising, as this file has a number of other synchronization issues as
> well.
It looks like it's actually also an issue in a bunch more places such as
record_osnoise_sample(), timerlat_dump_stack(), etc.
> > > When a user deletes a trace instance via rmdir, the unregister function
> > > removes the instance from the list using list_del_rcu(). However, the
> > > removal
> > > routine does not appear to wait for an RCU grace period before freeing the
> > > trace array itself.
> > > Could a concurrent execution of this loop inside the rcu_read_lock()
> > > section
> > > still access the unlinked instance, read the freed inst->tr, and pass it
> > > to
> > > trace_array_vprintk()? This appears to be an existing issue, but it still
> > > affects the loop here.
> >
> > Hmm, this is interesting. osnoise keeps track of its own instances via a
> > osnoise_instances list. But it only use kfree_rcu() to free the list
> > descriptor but doesn't take care of the tr being freed before hand!
> >
> > Something like this could work [not even compiled]
> >
> > diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> > index 75678053b21c..bda1e0e0d2e1 100644
> > --- a/kernel/trace/trace_osnoise.c
> > +++ b/kernel/trace/trace_osnoise.c
> > @@ -476,8 +476,11 @@ static void print_osnoise_headers(struct seq_file *s)
> >
> > \
> > rcu_read_lock();
> > \
> > list_for_each_entry_rcu(inst, &osnoise_instances, list) {
> > \
> > + if (trace_array_get(inst->tr) < 0)
> > \
> > + continue;
> > \
> > buffer = inst->tr->array_buffer.buffer;
> > \
> > trace_array_printk_buf(buffer, _THIS_IP_, msg);
> > \
> > + trace_array_put(inst->tr);
> > \
> > }
> > \
> > rcu_read_unlock();
> > \
> > osnoise_data.tainted = true;
> > \
>
> OK, I'll prepare a v3.
Many osnoise_taint() callers, as well as timerlat_dump_stack(), can have
preemption disabled, so the mutex in trace_array_get() won't work.
What is the intended way for a tracer to record to all of its
instances? I tried looking at other tracers that allow instances, but
it seems that most of them only allow one instance, apart from
trace_function/trace_function_graph that are driven by a callback
mechanism that doesn't fit here, and that made my brain hurt when I
dove into the code to try to figure out how it ensures a valid tr.
We could have osnoise_unregister_instance() set inst->tr = NULL
under a raw lock, and then require users to hold the raw lock when
manipulating inst->tr (which can't go away until ->stop() completes),
skipping any that are NULL.
It's not great that we lose the ability to do things with tr that are
incompatible with a raw lock, but I don't know how to fix that without
something like changing the tr refcount mechanism to allow an atomic
refcount increase on a known-valid-for-now tr. It looks like all the
current users of this list are OK with a raw lock.
We could go even further and simplify by un-RCUing the list, requiring
that the raw lock be held over traversal -- I'm not thrilled at the
idea of letting userspace create unbounded instances that are traversed
with preemption disabled, but as noted, we already do that in some places.
In any case, this patch is just moving the code around, not introducing
the problem, so I hope that whatever synchronization overhaul this file
requires (which goes well beyond this one issue) can wait for followup
patches.
-Crystal