On Mon, May 09, 2016 at 08:56:07AM -0700, Davidlohr Bueso wrote: > On Mon, 09 May 2016, Peter Zijlstra wrote: > > >On Sun, May 08, 2016 at 09:56:08PM -0700, Davidlohr Bueso wrote: > >>Read waiters are currently reference counted from the time it enters > >>the slowpath until the lock is released and the waiter is awoken. This > >>is fragile and superfluous considering everything occurs within down_read() > >>without returning to the caller, and the very nature of the primitive does > >>not suggest that the task can disappear from underneath us. In addition, > >>spurious wakeups can make the whole refcount useless as get_task_struct() > >>is only called when setting up the waiter. > > > >So I think you're wrong here; imagine this: > > > > > > rwsem_down_read_failed() rwsem_wake() > > get_task_struct(); > > raw_spin_lock_irq(&wait_lock); > > list_add_tail(&waiter.list, &wait_list); > > raw_spin_unlock_irq(&wait_lock); > > > > raw_spin_lock_irqsave(&wait_lock) > > __rwsem_do_wake() > > while (true) { > > set_task_state(tsk, TASK_UNINTERRUPTIBLE); > > waiter->task = NULL > > if (!waiter.task) // true > > break; > > > > __set_task_state(tsk, TASK_RUNNING); > > > > do_exit(); > > > > wake_up_process(tsk); /* BOOM */ > > I may be missing something, but rwsem_down_read_failed() will not return until > after the wakeup is done by the rwsem_wake() thread.
The above never gets to schedule(), and even if it did, a spurious wakeup could've happened, no? > So racing with do_exit() isn't > going to occur because the task is still blocked at that point. This is even > more > so with delaying the wakeup. Similarly, we don't do this for writers either, > which > could also suffer from similar scenarios. The write side is different; it serializes on wait_lock. See how it takes wait_lock again, after blocking, and removes itself from the wait_list. Readers do not do this, they rely on the waker to remove them, and therefore suffer this problem.