On 09/06, Davidlohr Bueso wrote:
>
> Here tasklist_lock does not protect anything other than the list
> against concurrent fork/exit. And considering that the whole thing
> is capped by FTRACE_RETSTACK_ALLOC_SIZE (32), it should not be a
> problem to have a potentially stale, yet stable, list. The task cannot
> go away either, so we don't risk racing with ftrace_graph_exit_task()
> which clears the retstack.

I don't understand this code, but I think you are right: tasklist_lock buys
nothing.

Afaics, with or without this change alloc_retstack_tasklist() can race with
copy_process() and miss the new child. ftrace_graph_init_task() can't help:
ftrace_graph_active can be set right after the check, and
for_each_process_thread() can't see the new process yet.

This can't race with ftrace_graph_exit_task(), it is called after the full
gp (grace period) pass. But this function looks very confusing to me, I don't
understand the barrier and the "NULL must become visible to IRQs before we
free it" comment. Looks like ftrace_graph_exit_task() was called by the
exiting task in the past? Indeed, see 65afa5e603d50
("tracing/function-return-tracer: free the return stack on free_task()").
I think it makes sense to simplify this function now, it can simply do
kfree(t->ret_stack) and nothing more.

ACK, but ...

> @@ -387,8 +387,8 @@ static int alloc_retstack_tasklist(struct
> ftrace_ret_stack **ret_stack_list)
>  		}
>  	}
>
> -	read_lock(&tasklist_lock);

then you should probably rename alloc_retstack_tasklist() ?

Oleg.
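
To illustrate the simplification of ftrace_graph_exit_task() suggested above, here is a rough userspace mock-up. It is only a sketch and not a tested kernel patch: the struct layouts are cut down to the one field that matters, kfree() is a stand-in built on free(), and the nr_freed counter is a hypothetical helper added purely so the behaviour can be checked. The assumption it rests on is the one stated above, that the function runs only after the task is no longer visible to anyone else.

```c
#include <stdlib.h>

/* Mock types: NOT the kernel definitions, just enough to compile. */
struct ftrace_ret_stack {
	unsigned long ret;
};

struct task_struct {
	struct ftrace_ret_stack *ret_stack;
};

/* Hypothetical counter, for illustration only. */
static int nr_freed;

/* Userspace stand-in for the kernel's kfree(); NULL is a no-op. */
static void kfree(void *p)
{
	if (p)
		nr_freed++;
	free(p);
}

/*
 * Simplified version: no barrier and no NULL store before the free.
 * Assumes the caller runs after the task can no longer be seen by
 * anyone else, so nothing can still be using t->ret_stack.
 */
static void ftrace_graph_exit_task(struct task_struct *t)
{
	kfree(t->ret_stack);
}
```

Whether the barrier can really go depends on nothing else dereferencing t->ret_stack at that point, which is the claim above and not something this mock-up demonstrates.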