On Tue, Aug 25, 2020 at 10:59:54PM +0900, Masami Hiramatsu wrote:
> On Tue, 25 Aug 2020 15:30:05 +0200
> pet...@infradead.org wrote:
>
> > On Tue, Aug 25, 2020 at 10:15:55PM +0900, Masami Hiramatsu wrote:
> >
> > > > damn... one last problem is dangling instances.. so close.
> > > > We can apparently unregister a kretprobe while there's still active
> > > > kretprobe_instance's out referencing it.
> > >
> > > Yeah, kretprobe already provided the per-instance data (as far as
> > > I know, only systemtap depends on it). We need to provide it for
> > > such users.
> > > But if we only have one lock, we can avoid checking NMI because
> > > we can check the recursion with trylock. It is needed only if the
> > > kretprobe uses per-instance data. Or we can just pass a dummy
> > > instance on the stack.
> >
> > I think it is true in general, you can unregister a rp while tasks are
> > preempted.
>
> Would you mean the kretprobe handler (or trampoline handler) will be
> preempted? All kprobes (including kretprobe) handler is running in
> non-preemptive state, so it shouldn't happen...

I was thinking about something like:

	for_each_process_thread(p, t) {
		if (!t->kretprobe_instances.first)
			continue;

	again:
		if (try_invoke_on_locked_down_task(t, unhook_rp_inst, tp))
			continue;

		smp_function_call(...);

		if (!done)
			goto again;
	}

So then for each task that has a kretprobe stack, we iterate the stack
and set ri->rp = NULL, remotely when the task isn't running, locally if
the task is running.

I just need to change the semantics of try_invoke_on_locked_down_task()
a bit -- they're a tad weird atm.

> > Anyway,. I think I have a solution, just need to talk to paulmck for a
> > bit.
>
> Ah, you mentioned that the removing the kfree() from the trampline
> handler? I think we can make an rcu callback which will kfree() the
> given instances. (If it works in NMI)

Yes, calling kfree() from the trampoline seems dodgy at best. When
!ri->rp rcu_free() is a good option.
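
To make the unhook idea above a bit more concrete, here is a rough
user-space model of it. This is only a sketch under loud assumptions:
struct rp, struct rp_inst, struct task and every helper below are
hypothetical stand-ins, not the kernel's kretprobe structures, and there
is no locking, no IPI and no RCU here. It only shows the bookkeeping:
walk each task's instance stack, clear ri->rp for the probe being
unregistered, and have the return path skip the handler when ri->rp is
NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct rp {
	const char *name;
};

struct rp_inst {
	struct rp_inst *next;
	struct rp *rp;			/* NULL once the owning probe is gone */
};

struct task {
	struct rp_inst *instances;	/* models t->kretprobe_instances.first */
};

/* Record a new instance on top of the task's stack (function entry). */
static void push_instance(struct task *t, struct rp *rp)
{
	struct rp_inst *ri = malloc(sizeof(*ri));

	if (!ri)
		abort();
	ri->rp = rp;
	ri->next = t->instances;
	t->instances = ri;
}

/* Detach every instance on one task's stack that belongs to rp. */
static int unhook_rp_inst(struct task *t, struct rp *rp)
{
	int n = 0;

	for (struct rp_inst *ri = t->instances; ri; ri = ri->next) {
		if (ri->rp == rp) {
			ri->rp = NULL;
			n++;
		}
	}
	return n;
}

/* Models the per-task loop: skip tasks that have no instance stack. */
static int unhook_all(struct task *tasks, size_t ntasks, struct rp *rp)
{
	int n = 0;

	for (size_t i = 0; i < ntasks; i++) {
		if (!tasks[i].instances)
			continue;
		n += unhook_rp_inst(&tasks[i], rp);
	}
	return n;
}

/*
 * Models the return path: pop the top instance and report whether its
 * handler would still run (1), would be skipped (0), or the stack was
 * empty (-1). The free is immediate here; the thread suggests deferring
 * it instead of freeing from the trampoline.
 */
static int pop_instance(struct task *t)
{
	struct rp_inst *ri = t->instances;
	int live;

	if (!ri)
		return -1;
	t->instances = ri->next;
	live = ri->rp != NULL;
	free(ri);
	return live;
}
```

A caller would push instances for a couple of probes, call unhook_all()
for the probe being unregistered, and then see pop_instance() return 0
for the detached instances while instances of other probes still return
1. The real thing additionally has to decide whether to do the walk
remotely or on the running task, which this model leaves out entirely.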