This patch adds a comment before waitqueue_active noting that memory
barriers are required.

In the following code, the lack of memory barriers means the wake_up
thread can fail to wake the waiting thread, leaving it asleep
indefinitely.

     wake_up thread                 waiting thread
------------------------------------------------------------------------
CONDITION = 1;                  add_wait_queue(wq, &wait);
if (waitqueue_active(wq))       for (;;) {
        wake_up(wq);                    if (CONDITION)
                                                break;
                                        wait_woken(&wait, ...);
                                }
------------------------------------------------------------------------

Two problems can occur.
First, on the wake_up side, the CPU can reorder the waitqueue_active()
check so that it happens before the store to CONDITION.
     wake_up thread                 waiting thread
       (reordered)
------------------------------------------------------------------------
if (waitqueue_active(wq))
                                add_wait_queue(wq, &wait);
                                for (;;) {
                                        if (CONDITION)
                                                break;
CONDITION = 1;
                                        wait_woken(&wait, ...);
                                }
------------------------------------------------------------------------

Second, on the waiting side, the CPU can reorder the load of CONDITION
so that it happens inside add_wait_queue(), before the entry is added
to the wait queue.  (The spin_unlock in add_wait_queue() is a release
operation, so it does not prevent a later load from being pulled up
into the critical section.)
     wake_up thread                 waiting thread
                                      (reordered)
------------------------------------------------------------------------
                                spin_lock_irqsave(...)      <add_wait_queue>
                                if (CONDITION)
CONDITION = 1;
if (waitqueue_active(wq))
                                __add_wait_queue(...)       <add_wait_queue>
                                spin_unlock_irqrestore(...) <add_wait_queue>
                                wait_woken(&wait, ...);
------------------------------------------------------------------------

Both problems can be fixed by removing the waitqueue_active() call and
calling wake_up() unconditionally, at the cost of taking and releasing
the wait queue spinlock even when the queue is empty.
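
As a minimal sketch of that simpler form (the names 'producer', 'done',
and 'wq' are illustrative only and not part of this patch):

static int done;
static DECLARE_WAIT_QUEUE_HEAD(wq);

static void producer(void)
{
        /* No waitqueue_active() shortcut here: wake_up() takes the
         * wait queue spinlock even when the queue is empty (the cost
         * mentioned above), and the lock ordering against
         * add_wait_queue() on the waiting side prevents the lost
         * wakeup.
         */
        done = 1;
        wake_up(&wq);
}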

However, if that is too expensive, the reordering could be prevented by
adding memory barriers in the following places.
     wake_up thread                 waiting thread
------------------------------------------------------------------------
CONDITION = 1;                  add_wait_queue(wq, &wait);
smp_mb();                       smp_mb();
if (waitqueue_active(wq))       for (;;) {
        wake_up(wq);                    if (CONDITION)
                                                break;
                                        wait_woken(&wait, ...);
                                }
------------------------------------------------------------------------
If the waiting thread uses prepare_to_wait() or one of the wait_event*()
macros instead of calling add_wait_queue() directly, the
set_current_state() call inside those functions provides the necessary
memory barrier on the waiting side.  The memory barrier on the wake_up
side is still needed.
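
A sketch of this variant, reusing the illustrative names from the
previous sketch, with the waiter going through wait_event():

static void producer(void)
{
        done = 1;
        /* Pairs with the barrier in set_current_state() called
         * inside wait_event() on the waiting side.
         */
        smp_mb();
        if (waitqueue_active(&wq))
                wake_up(&wq);
}

static void consumer(void)
{
        /* wait_event() calls set_current_state() before re-checking
         * 'done', which supplies the waiting-side barrier.
         */
        wait_event(wq, done);
}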

There were several places in the Linux kernel source code that lacked
this memory barrier.  Hopefully, the comment will make people using
waitqueue_active() a little more cautious.

Signed-off-by: Kosuke Tatsukawa <ta...@ab.jp.nec.com>
---
 include/linux/wait.h |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 1e1bf9f..4a4c6fc 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -102,6 +102,19 @@ init_waitqueue_func_entry(wait_queue_t *q, wait_queue_func_t func)
        q->func         = func;
 }
 
+/*
+ * Note: when adding a waitqueue_active() check before calling
+ * wake_up() as an optimization, a memory barrier is required on
+ * SMP so that the waiting thread does not miss the wake up.
+ *
+ * The memory barrier has to come before waitqueue_active() to
+ * prevent the CPU from reordering the waitqueue_active() read
+ * ahead of the writes done prior to it.
+ *
+ * The waiting side also needs a memory barrier which pairs with
+ * the wake_up side.  If prepare_to_wait() or wait_event*() is
+ * used, set_current_state() inside them provides that barrier.
+ */
 static inline int waitqueue_active(wait_queue_head_t *q)
 {
        return !list_empty(&q->task_list);
-- 
1.7.1