On Thu, Nov 29, 2018 at 02:12:32PM +0100, Peter Zijlstra wrote:
> 
> +Cc davidlohr and waiman

> Urgh; so the case where the cmpxchg() fails because it already has a
> wakeup in progress, which then 'violates' our expectation of when the
> wakeup happens.
> 
> Yes, I think this is real, and worse, I think we need to go audit all
> wake_q_add() users and document this behaviour.
> 
> In the ideal case we'd delay the actual wakeup to the last wake_up_q(),
> but I don't think we can easily fix that.

See commit: 1d0dcb3ad9d3 ("futex: Implement lockless wakeups"), I think
that introduces the exact same bug.

Something like the below perhaps, although this pattern seems to want a
wake_q_add() variant that already assumes get_task_struct().

diff --git a/kernel/futex.c b/kernel/futex.c
index f423f9b6577e..d14971f6ed3d 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1387,11 +1387,7 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
        if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
                return;
 
-       /*
-        * Queue the task for later wakeup for after we've released
-        * the hb->lock. wake_q_add() grabs reference to p.
-        */
-       wake_q_add(wake_q, p);
+       get_task_struct(p);
        __unqueue_futex(q);
        /*
         * The waiting task can free the futex_q as soon as q->lock_ptr = NULL
@@ -1401,6 +1397,13 @@ static void mark_wake_futex(struct wake_q_head *wake_q, struct futex_q *q)
         * plist_del in __unqueue_futex().
         */
        smp_store_release(&q->lock_ptr, NULL);
+
+       /*
+        * Queue the task for later wakeup for after we've released
+        * the hb->lock. wake_q_add() grabs reference to p.
+        */
+       wake_q_add(wake_q, p);
+       put_task_struct(p);
 }
 
 /*
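
For readers following along, the cmpxchg() failure case quoted at the top can be modelled in user space. This is only a sketch with C11 atomics, not the kernel implementation: every model_* name is hypothetical, the bool return value is added purely so the demo can observe the outcome, and the plain int refcount stands in for get_task_struct(). It shows that a second add of an already-queued task queues nothing and takes no reference, so the wakeup timing is owned by whoever queued the task first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Sentinel marking "end of queue"; any non-NULL value works in this model. */
#define MODEL_WAKE_Q_TAIL ((struct model_task *)0x1)

struct model_task {
	_Atomic(struct model_task *) next; /* NULL: not on any wake queue */
	int refs;                          /* stand-in for the task refcount */
};

struct model_wake_q {
	struct model_task *first;
	struct model_task *last;
};

/*
 * Returns true if the task was queued (and a reference taken), false if
 * it was already on some queue. The return value is an assumption made
 * for this demo only.
 */
static bool model_wake_q_add(struct model_wake_q *q, struct model_task *t)
{
	struct model_task *expected = NULL;

	/* Claim the task; this fails if a wakeup is already in progress. */
	if (!atomic_compare_exchange_strong(&t->next, &expected,
					    MODEL_WAKE_Q_TAIL))
		return false;

	t->refs++;
	if (q->last)
		atomic_store(&q->last->next, t);
	else
		q->first = t;
	q->last = t;
	return true;
}

/* Queue one task on two queues; true if the second add did nothing. */
static bool model_second_add_is_noop(void)
{
	struct model_task t = { .refs = 1 };
	struct model_wake_q a = { 0 }, b = { 0 };

	bool first = model_wake_q_add(&a, &t);
	int refs_after_first = t.refs;
	bool second = model_wake_q_add(&b, &t);

	assert(first);
	return !second && t.refs == refs_after_first && b.first == NULL;
}
```

In the model, queue b stays empty after the second add, so draining b would wake nobody; the task is woken whenever a is drained. As I read the patch above, that is the reason for taking an explicit reference up front and calling wake_q_add() only after the smp_store_release().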
