On Fri, Jan 25, 2013 at 08:11:39PM +0100, Oleg Nesterov wrote: > On 01/25, Oleg Nesterov wrote: > > > > > +int percpu_ref_kill(struct percpu_ref *ref) > > > +{ > > > ... > > > + if (status == PCPU_REF_PTR) { > > > + unsigned count = 0, cpu; > > > + > > > + synchronize_rcu(); > > > + > > > + for_each_possible_cpu(cpu) > > > + count += *per_cpu_ptr((unsigned __percpu *) pcpu_count, > > > cpu); > > > + > > > + pr_debug("global %lli pcpu %i", > > > + atomic64_read(&ref->count) & PCPU_COUNT_MASK, > > > + (int) count); > > > + > > > + atomic64_add((int) count, &ref->count); > > > + smp_wmb(); > > > + /* Between setting global count and setting PCPU_REF_DEAD */ > > > + ref->pcpu_count = PCPU_REF_DEAD; > > > > The coment explains what the code does, but not why ;) > > > > I guess this is for percpu_ref_put(), and this wmb() pairs with implicit > > mb() implied by atomic64_dec_return(). > > Hmm. Most probably I missed something, but it seems we need another > synchronize_rcu() _after_ we set PCPU_REF_DEAD.
Yeah, correct - documentation bug. I originally had the synchronize_rcu() there, but this is called by exit_aio() -> kill_ioctx() when we're killing a process, and Ben LaHaise pointed out that was less than ideal if a process had a bunch of ioctxs - so I left the second one out there so the caller would have the option of using call_rcu() instead. > To simplify, suppose that percpu_ref_put() is never called directly but > we have > > void put_and_dsetroy(...) > { > if (percpu_ref_put(...)) > destroy(...); > } > > Suppose that ref->count == 2 after atomic64_add() above. IOW, we have > a "master" reference for _kill() and someone else did _get. > > So the caller does > > percpu_ref_kill(); > put_and_dsetroy(); > > And this can race with another holder which drops the last reference, > its put_and_dsetroy() can see PCPU_REF_DYING and return false. > > Or I misunderstood the code/interface? Nope, nailed it :) That should _definitely_ be in the documentation. Actually - I think it'd be better to have the default percpu_ref_kill() do the second synchronize_rcu(), and have an unsafe version that skips it. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/