On 25.07.24 21:05, Steven Rostedt wrote:
> On Thu, 25 Jul 2024 20:12:33 +0200
> Mathias Krause <mini...@grsecurity.net> wrote:
>>>> @@ -973,6 +975,11 @@ size_t copy_nofault(void *addr, size_t bytes, struct
>>>> iov_iter *i)
>>>>  static struct list_head *user_event_get_fields(struct trace_event_call
>>>> *call)
>>>>  {
>>>>      struct user_event *user = (struct user_event *)call->data;
>>
>> Dereferencing a potentially free'd object, so 'user' is now "random" data.
>
> This is the callback function of user->call.get_fields.
>
> That is, we have:
>
>     user->call.get_fields = user_event_get_fields;
>
> And the f_start() code eventually calls trace_get_fields() that has (from a
> previous email in this thread).
>
>  trace_get_fields(struct trace_event_call *event_call)
>  {
>      if (!event_call->class->get_fields)
>          return &event_call->class->fields;
>      return event_call->class->get_fields(event_call);
>  }

Right. But the point is that 'event_call' is really some '&user->call'.
With 'user' being free'd memory, what gives? Dereferencing 'event_call'
is UB, so this function is doomed to fail because it cannot know whether
its only argument still points to valid memory. And that's the core
issue -- calling that function for an object that's long gone -- the
missing refcounting I hinted at in my first email.

> Where it calls the ->class->get_fields(event_call);
>
> that calls this function. By setting:
>
>     user->call.get_fields = NULL;
>
> this will never get called and no random data will be accessed.

As 'user' is free'd or soon-to-be-free'd memory, that's a non-starter.

> That said, I was talking with Beau, we concluded that this shouldn't be the
> responsibility of the user of event call, and should be cleaned up by the
> event system.
>
> Here's the proper fix:
>
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 6ef29eba90ce..3a2d2ff1625b 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -3140,8 +3140,10 @@ EXPORT_SYMBOL_GPL(trace_add_event_call);
>   */
>  static void __trace_remove_event_call(struct trace_event_call *call)
>  {
> +    lockdep_assert_held(&event_mutex);
>      event_remove(call);
>      trace_destroy_fields(call);
> +    call->get_fields = NULL;
>      free_event_filter(call->filter);
>      call->filter = NULL;
>  }
>
> Can you try it out?

I can try, but I don't think that's the proper fix for the above reasons,
I'm sorry.

Thanks,
Mathias
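
For illustration only, here is a minimal userspace sketch of the
refcounting idea referred to above. It is not kernel code: all names
(demo_event, demo_event_get(), demo_event_put(), demo_event_remove()) are
hypothetical, and it is single-threaded, so a plain int stands in for what
would have to be an atomic counter. It shows a reader pinning the object so
that removal cannot free it while the callback is still reachable:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an event object; NOT the kernel's struct. */
struct demo_event {
	int refcnt;      /* live references (plain int: single-threaded sketch) */
	int registered;  /* 1 while the event is visible to new lookups */
	int nr_fields;   /* dummy payload returned by the callback */
	int (*get_fields)(struct demo_event *ev);
};

/* Counts objects that were actually released, for inspection. */
static int demo_freed;

/* Callback that dereferences the object, like a get_fields() hook would. */
static int demo_get_fields(struct demo_event *ev)
{
	return ev->nr_fields;
}

/* Allocate an event; the creator holds the initial reference. */
static struct demo_event *demo_event_new(int nr_fields)
{
	struct demo_event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	ev->refcnt = 1;
	ev->registered = 1;
	ev->nr_fields = nr_fields;
	ev->get_fields = demo_get_fields;
	return ev;
}

/* Take an additional reference; the caller must already hold a valid one. */
static struct demo_event *demo_event_get(struct demo_event *ev)
{
	assert(ev->refcnt > 0);
	ev->refcnt++;
	return ev;
}

/* Drop a reference; the memory is released only by the last holder. */
static void demo_event_put(struct demo_event *ev)
{
	assert(ev->refcnt > 0);
	if (--ev->refcnt == 0) {
		demo_freed++;
		free(ev);
	}
}

/* Unregister: hide from new lookups and drop the creator's reference. */
static void demo_event_remove(struct demo_event *ev)
{
	ev->registered = 0;
	demo_event_put(ev);
}
```

With this scheme, a reader that took a reference before
demo_event_remove() ran can still call get_fields() on valid memory; the
object goes away only on the last demo_event_put(). Clearing a function
pointer inside the object gives no such guarantee once the object itself
has been freed.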