On Thu, Jul 14, 2016 at 06:45:47PM +0200, Peter Zijlstra wrote: > On Thu, Jul 14, 2016 at 09:23:55AM -0700, Paul E. McKenney wrote: > > Hmmm... How does this handle the following sequence of events for > > the case where we are not biased towards the reader? > > > > o The per-CPU rwsem is set up with RCU_NONE and readers_slow > > (as opposed to readers_block). The rcu_sync ->gp_state is > > GP_PENDING, which means that rcu_sync_is_idle() will always > > return true. > > /false/, rcu_sync_is_idle() will always be false, to force us into the > slowpath.
Ah, got it... > > o Task A on CPU 0 runs percpu_down_read() to completion, and remains > > within the critical section. CPU 0's ->refcount is therefore 1. > > > > o Task B on CPU 1 does percpu_down_write(), which write-acquires > > ->rw_sem, does rcu_sync_enter() (which is a no-op due to > > RCU_NONE), sets ->state to readers_block, and is just now going > > to wait for readers, which first invokes readers_active_check(), > > which starts summing the ->refcount for CPUs 0, 1, and 2, > > obtaining the value 1 so far. > > > > o Task C CPU 2 enters percpu_down_read(), disables preemption, > > increments CPU 2's ->refcount, sees rcu_sync_is_idle() return > > true (so skips __percpu_down_read()), enables preemption, and > > enters its critical section. > > false, so does __percpu_down_read() > > > > > o Task C migrates to CPU 3 and invokes percpu_up_read(), which > > disables preemption, sees rcu_sync_is_idle() return true, calls > > __this_cpu_dec() on CPU 3's ->refcount, and enables preemption. > > The value of CPU 3's ->refcount is thus (unsigned int)-1. > > __percpu_up_read() > > > > > o Task B on CPU 1 continues execution in readers_active_check(), with > > the full sum being zero. > > > > So it looks to me like we have Task A as a writer at the same time that > > Task A is a reader, which would not be so good. > > > > So what am I missing here? > > for RCU_NONE we init rsp->gp_state to !0, which makes: > > static inline rcu_sync_is_idle()'s > > return !rsp->gp_state (aka. rsp->gp_state == 0) > > return false. > > > And a couple of checkpatch nits below. Yes, I had to apply the patch to > > figure out what it was doing. ;-) > > Yah, too much churn to read :-) > > In any case, rest assured you've already gone over this part of the > patch several times. I repurposed an old percpu-rwsem optimization, Oleg > recognised it. OK, in that case I will hold off pending John Stultz's performance checks. Thanx, Paul