On Mon, 2017-10-02 at 09:35 -0400, Rik van Riel wrote: > On Wed, 2017-09-27 at 17:45 +0200, Oleg Nesterov wrote: > > On 09/27, Gargi Sharma wrote: > > > > > > -#define find_next_offset(map, off) > > > > > > \ > > > - find_next_zero_bit((map)->page, BITS_PER_PAGE, > > > off) > > > - > > > > this should go into the previous patch, but this is minor... > > > > > @@ -208,12 +200,10 @@ struct pid *alloc_pid(struct pid_namespace > > > *ns) > > > > > > upid = pid->numbers + ns->level; > > > spin_lock_irq(&pidmap_lock); > > > - if (!(ns->nr_hashed & PIDNS_HASH_ADDING)) > > > + if (!(ns->pid_allocated & PIDNS_ADDING)) > > > goto out_unlock; > > > for ( ; upid >= pid->numbers; --upid) { > > > - hlist_add_head_rcu(&upid->pid_chain, > > > - &pid_hash[pid_hashfn(upid->nr, > > > upid->ns)]); > > > - upid->ns->nr_hashed++; > > > + upid->ns->pid_allocated++; > > > > No, this is wrong. > > > > It is too late to check PIDNS_HASH_ADDING/PIDNS_ADDING and > > increment > > pid_allocated, > > once we call idr_alloc_cyclic() this pid is already "hashed" in > > that > > it can be found > > by find_pid_ns() with this patch applied. > > > > And of course, it is too late to do atomic_set(&pid->count, 1) and > > initialize > > pid->tasks[type] lists by the same reason. > > Hi Oleg, > > Gargi and I are looking at that code, and trying to figure out > exactly what needs to be done to make all of this correct. > > We are thinking something along these lines:
.... and we looked at the code some more. Some of the way the code is laid out currently looks like an artifact of history. Lets try this again: 1) pid = kmem_cache_alloc(...) 2) get_pid_ns(ns); atomic_set(&pid->count, 1); for (type = 0; ...) INIT_HLIST(...) > 3) First, check if this is a new namespace (PIDNS_ADDING), and > do the call to pid_ns_prepare_proc, before we even call idr_alloc. > Maybe something like: > > if (unlikely(ns->nr_allocated == PIDNS_ADDING)) { > if (pid_ns_prepare_proc(ns)) { > disable_pid_allocations(ns); > goto out_free_ns; > } > > 4) With pid_ns_prepare_proc out of the way, we can put all the code > from below where the call to pid_ns_prepare_proc is now (except > error handing) into the main loop of pid allocation, so we can > do all that stuff under the pidmap_lock: > > for (i = ns->level; i >= 0; i--) { > ... > idr_alloc_cyclic(...) > ns->nr_allocated++; > ... > } > > Would that resolve your objection, or are we barking up the wrong > tree? Now the per-pid struct stuff is done once, before the loop, and the loop deals only with allocated numbers inside each parent namespace. The error path would unwind stuff done earlier. -- All Rights Reversed.
signature.asc
Description: This is a digitally signed message part