On Sat, 4 Mar 2017, Peter Zijlstra wrote:
> @@ -1035,6 +1037,9 @@ static int attach_to_pi_state(u32 __user
>        * has dropped the hb->lock in between queue_me() and unqueue_me_pi(),
>        * which in turn means that futex_lock_pi() still has a reference on
>        * our pi_state.
> +      *
> +      * IOW, we cannot race against the unlocked put_pi_state() in
> +      * futex_unlock_pi().

That 'IOW' made my head spin for a while. I'd rather spell it out more
explicitly:

         * The waiter holding a reference on @pi_state also protects
         * against the unlocked put_pi_state() in futex_unlock_pi(),
         * futex_lock_pi() and futex_wait_requeue_pi(): the refcount
         * cannot drop to 0 and consequently free the pi state before
         * we can take a reference ourselves.

>        */
>       WARN_ON(!atomic_read(&pi_state->refcount));
>  
> @@ -1378,47 +1383,33 @@ static void mark_wake_futex(struct wake_
>       smp_store_release(&q->lock_ptr, NULL);
>  }
>  
> -static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *top_waiter,
> -                      struct futex_hash_bucket *hb)

Please add a comment that the caller must hold a reference on @pi_state.

> +static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_state)
>  {
> -     struct task_struct *new_owner;
> -     struct futex_pi_state *pi_state = top_waiter->pi_state;
>       u32 uninitialized_var(curval), newval;
> +     struct task_struct *new_owner;
> +     bool deboost = false;
>       DEFINE_WAKE_Q(wake_q);
> -     bool deboost;
>       int ret = 0;
>  
> -     if (!pi_state)
> -             return -EINVAL;
> -
> -     /*
> -      * If current does not own the pi_state then the futex is
> -      * inconsistent and user space fiddled with the futex value.
> -      */
> -     if (pi_state->owner != current)
> -             return -EINVAL;
> -
>       raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
>       new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);
> -
> -     /*
> -      * When we interleave with futex_lock_pi() where it does
> -      * rt_mutex_timed_futex_lock(), we might observe @this futex_q waiter,
> -      * but the rt_mutex's wait_list can be empty (either still, or again,
> -      * depending on which side we land).
> -      *
> -      * When this happens, give up our locks and try again, giving the
> -      * futex_lock_pi() instance time to complete and unqueue_me().
> -      */
>       if (!new_owner) {
> -             raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
> -             return -EAGAIN;
> +             /*
> +              * Since we held neither hb->lock nor wait_lock when coming
> +              * into this function, we could have raced with futex_lock_pi()
> +              * such that it will have removed the waiter that brought us
> +              * here.

Hmm. That's not entirely correct. There are two cases:

     lock_pi()
        queue_me() <- Makes it visible as waiter in the hash bucket
        unlock(hb->lock)

  [1]

        rtmutex_futex_lock()

  [2]
  
        lock(hb->lock)

Both [1] and [2] are valid reasons why the top waiter is not an rtmutex
waiter.

Thanks,

        tglx
