On Tue, Oct 10, 2017 at 4:46 PM, Rik van Riel <r...@surriel.com> wrote:
> On Tue, 2017-10-10 at 13:35 +0100, Gargi Sharma wrote:
>> On Tue, Oct 10, 2017 at 12:50 PM, Oleg Nesterov <o...@redhat.com> wrote:
>> > On 10/09, Andrew Morton wrote:
>> > >
>> > > > @@ -240,17 +230,11 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
>> > > >        *
>> > > >        */
>> > > >       read_lock(&tasklist_lock);
>> > > > -     nr = next_pidmap(pid_ns, 1);
>> > > > -     while (nr > 0) {
>> > > > -             rcu_read_lock();
>> > > > -
>> > > > -             task = pid_task(find_vpid(nr), PIDTYPE_PID);
>> > > > +     nr = 2;
>> > > > +     idr_for_each_entry_continue(&pid_ns->idr, pid, nr) {
>> > > > +             task = pid_task(pid, PIDTYPE_PID);
>> > > >               if (task && !__fatal_signal_pending(task))
>> > > >                       send_sig_info(SIGKILL, SEND_SIG_FORCED, task);
>> > > > -
>> > > > -             rcu_read_unlock();
>> > > > -
>> > > > -             nr = next_pidmap(pid_ns, nr);
>> > > >       }
>> > > >       read_unlock(&tasklist_lock);
>> > >
>> > > Especially here. I don't think pidmap_lock is held. Is that IDR
>> > > iteration safe?
>> >
>> > Yes, this doesn't look right, we need rcu_read_lock() or
>> > pidmap_lock.
>> >
>> > And, we also need rcu_read_lock() for another reason, to protect
>> > "struct pid".
>>
>> Ah, I missed this. From what I understood idr_for_each_entry_continue
>> should be safe because calls idr_get_next which in turn calls
>> radix_tree_iter_find to find the next populated entry in the idr. If
>> the pid that you are looking up the task for is deleted, task will get
>> a NULL from pid_task and no signal to kill will be sent.
>> >
>> > Gargi, I suggested to use idr_for_each_entry_continue(), but now I
>> > am wondering if we should use idr_for_each() instead. IIUC this
>> > would be a bit faster? Not that I think this is really important...
>>
>> I can run benchmarks with idr_for_each to see how much speed up is
>> achieved and then we can go with whatever we think is better. How does
>> that sounds?
>
> I suspect this code will not be a hot path in any
> conceivable "kill off hundreds of containers"
> benchmark, since the overhead of having all of the
> tasks in those containers exit will dwarf any
> changes in this code.
>
> Simply making it safe for fully preemptible
> kernels by adding rcu_read_lock() around the
> section is what matters the most.
>
> The choice between idr_for_each_entry_continue()
> and idr_for_each() is dictated more by which
> of the two results in easier to read code.

I have written out the code for both idr_for_each() and
idr_for_each_entry_continue() below. IMHO idr_for_each_entry_continue()
is easier to read, but YMMV. :)

    /* idr_for_each() callback; returning 0 continues the walk */
    static int kill_task(int id, void *ptr, void *data)
    {
            struct pid *pid = ptr;
            struct task_struct *task = pid_task(pid, PIDTYPE_PID);

            if (task && !__fatal_signal_pending(task))
                    send_sig_info(SIGKILL, SEND_SIG_FORCED, task);
            return 0;
    }

    rcu_read_lock();
    idr_for_each(&pid_ns->idr, kill_task, NULL);
    rcu_read_unlock();

VS

    rcu_read_lock();
    idr_for_each_entry_continue(&pid_ns->idr, pid, nr) {
            task = pid_task(pid, PIDTYPE_PID);
            if (task && !__fatal_signal_pending(task))
                    send_sig_info(SIGKILL, SEND_SIG_FORCED, task);
    }
    rcu_read_unlock();

Thanks!
Gargi

>
> --
> All rights reversed
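
P.S. For anyone who wants to compare the two shapes without a kernel
tree handy, here is a minimal userspace sketch. Everything named toy_*
or count_* is made up for illustration: it only mimics the calling
conventions of idr_for_each() (a callback whose nonzero return stops
the walk) and idr_for_each_entry_continue() (a loop that resumes from
a caller-supplied id). It does no locking and is not the kernel IDR
implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for struct idr: a fixed table with NULL holes. */
struct toy_idr {
    void *slot[TOY_SLOTS];
};

/* Return the first populated slot at or after *id and update *id;
 * return NULL when no populated slot remains. */
static void *toy_get_next(const struct toy_idr *idr, int *id)
{
    assert(*id >= 0);
    for (int i = *id; i < TOY_SLOTS; i++) {
        if (idr->slot[i]) {
            *id = i;
            return idr->slot[i];
        }
    }
    return NULL;
}

/* Callback style, shaped like idr_for_each(): a nonzero return from
 * fn stops the walk and is passed back to the caller. */
static int toy_for_each(const struct toy_idr *idr,
                        int (*fn)(int id, void *p, void *data), void *data)
{
    void *entry;
    int ret = 0;

    for (int id = 0; (entry = toy_get_next(idr, &id)) != NULL; id++) {
        ret = fn(id, entry, data);
        if (ret)
            break;
    }
    return ret;
}

/* Macro style, shaped like idr_for_each_entry_continue(): the caller
 * sets id first and the loop resumes from there. */
#define toy_for_each_entry_continue(idr, entry, id) \
    for (; ((entry) = toy_get_next((idr), &(id))) != NULL; ++(id))

/* Callback body: bump the counter passed through the opaque pointer. */
static int count_cb(int id, void *p, void *data)
{
    (void)id;
    (void)p;
    ++*(int *)data;
    return 0; /* 0 means "keep iterating" */
}

/* Count every populated entry using the callback iterator. */
static int count_with_callback(const struct toy_idr *idr)
{
    int n = 0;

    toy_for_each(idr, count_cb, &n);
    return n;
}

/* Count populated entries with id >= start using the macro iterator. */
static int count_with_macro(const struct toy_idr *idr, int start)
{
    void *entry;
    int n = 0;
    int id = start;

    toy_for_each_entry_continue(idr, entry, id)
        n++;
    return n;
}
```

With entries at ids 1, 2 and 5, count_with_callback() visits all
three, while count_with_macro() with start = 2 skips id 1, the same
way the patch starts the walk at nr = 2. The callback form needs a
separate function plus a data pointer for any state; the macro form
keeps the loop body and its locals in one place.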