On Wed, 20 Dec 2023 13:06:06 +0000
Vincent Donnefort <vdonnef...@google.com> wrote:

> > @@ -771,10 +772,20 @@ static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
> >  static void rb_wake_up_waiters(struct irq_work *work)
> >  {
> >     struct rb_irq_work *rbwork = container_of(work, struct rb_irq_work, work);
> > -   struct ring_buffer_per_cpu *cpu_buffer =
> > -           container_of(rbwork, struct ring_buffer_per_cpu, irq_work);
> > +   struct ring_buffer_per_cpu *cpu_buffer;
> > +   struct trace_buffer *buffer;
> > +   int cpu;
> >  
> > -   rb_update_meta_page(cpu_buffer);
> > +   if (rbwork->is_cpu_buffer) {
> > +           cpu_buffer = container_of(rbwork, struct ring_buffer_per_cpu, irq_work);
> > +           rb_update_meta_page(cpu_buffer);
> > +   } else {
> > +           buffer = container_of(rbwork, struct trace_buffer, irq_work);
> > +           for_each_buffer_cpu(buffer, cpu) {
> > +                   cpu_buffer = buffer->buffers[cpu];
> > +                   rb_update_meta_page(cpu_buffer);
> > +           }
> > +   }  
> 
> Arg, somehow never reproduced the problem :-\. I suppose you need to cat
> trace/trace_pipe and mmap(trace/cpuX/trace_pipe) at the same time?

It triggered as soon as I ran "trace-cmd start -e sched_switch"

In other words, it broke the non-mmap case. This function gets called for
both the buffer and cpu_buffer irq_work entries. You added the
container_of() to get access to cpu_buffer, but the rbwork could just as
well belong to the main buffer. The main buffer has no meta page, and it
triggered a NULL pointer dereference, as "cpu_buffer->mapped" evaluated to
true (because that offset landed on some field of the buffer structure that
wasn't zero), and then here:

        if (cpu_buffer->mapped) {
                WRITE_ONCE(cpu_buffer->meta_page->reader.read, 0);

It dereferenced cpu_buffer->meta_page->reader

which is only God knows what!
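
To make the failure mode and the workaround concrete, here is a rough
user-space sketch. The struct layouts, SKETCH_NR_CPUS and sketch_demo() are
simplified and hypothetical, not the real kernel definitions; only the
is_cpu_buffer check mirrors the hunk quoted above. The point is that one
callback serves two different containing structures, so it has to know
which container_of() is valid before it touches anything:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define SKETCH_NR_CPUS 2	/* hypothetical fixed CPU count */

struct irq_work { int pending; };	/* placeholder, not the kernel layout */

struct rb_irq_work {
	struct irq_work work;
	bool is_cpu_buffer;	/* tells the callback which struct embeds us */
};

struct meta_page { unsigned long read; };	/* simplified */

struct ring_buffer_per_cpu {
	struct rb_irq_work irq_work;
	bool mapped;
	struct meta_page *meta_page;
};

struct trace_buffer {
	struct ring_buffer_per_cpu *buffers[SKETCH_NR_CPUS];
	struct rb_irq_work irq_work;	/* main buffer: no meta page at all */
};

static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
{
	if (cpu_buffer->mapped)
		cpu_buffer->meta_page->read = 0;
}

/* Shared callback: the rbwork may live in either structure. */
static void rb_wake_up_waiters(struct irq_work *work)
{
	struct rb_irq_work *rbwork = container_of(work, struct rb_irq_work, work);
	struct ring_buffer_per_cpu *cpu_buffer;
	struct trace_buffer *buffer;
	int cpu;

	if (rbwork->is_cpu_buffer) {
		cpu_buffer = container_of(rbwork, struct ring_buffer_per_cpu, irq_work);
		rb_update_meta_page(cpu_buffer);
	} else {
		/* Casting to ring_buffer_per_cpu here would read garbage. */
		buffer = container_of(rbwork, struct trace_buffer, irq_work);
		for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
			rb_update_meta_page(buffer->buffers[cpu]);
	}
}

/* Returns the sum of the read counters after both kinds of wake-up. */
static unsigned long sketch_demo(void)
{
	struct meta_page m0 = { .read = 5 }, m1 = { .read = 7 };
	struct ring_buffer_per_cpu c0 = {
		.irq_work.is_cpu_buffer = true, .mapped = true, .meta_page = &m0,
	};
	struct ring_buffer_per_cpu c1 = {
		.irq_work.is_cpu_buffer = true, .mapped = true, .meta_page = &m1,
	};
	/* is_cpu_buffer stays false for the main buffer's entry. */
	struct trace_buffer tb = { .buffers = { &c0, &c1 } };

	rb_wake_up_waiters(&c0.irq_work.work);	/* per-CPU entry: only m0 reset */
	assert(m0.read == 0 && m1.read == 7);

	rb_wake_up_waiters(&tb.irq_work.work);	/* main entry: every CPU reset */
	return m0.read + m1.read;
}
```

Without the is_cpu_buffer check, the second call in sketch_demo() would
treat the trace_buffer as a ring_buffer_per_cpu and read "mapped" and
"meta_page" from unrelated memory, which is the crash described above.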

> 
> Updating the meta-page is only useful if the reader we are waking up is a
> user-space one, which would only happen with the cpu_buffer version of this
> function. We could limit the update of the meta_page only to this case?

I'd rather not add another irq_work entry. This workaround should be good
enough.

Thanks,

-- Steve
