On 10/08/2015 06:35 PM, Kosuke Tatsukawa wrote:
> blk_mq_tag_update_depth() seems to be missing a memory barrier which
> might cause the waker to not notice the waiter and fail to send a
> wake_up as in the following figure.
>
>     blk_mq_tag_update_depth                 bt_get
> ------------------------------------------------------------------------
> if (waitqueue_active(&bs->wait))
> /* The CPU might reorder the test for
>    the waitqueue up here, before
>    prior writes complete */
>                                         prepare_to_wait(&bs->wait, &wait,
>                                           TASK_UNINTERRUPTIBLE);
>                                         tag = __bt_get(hctx, bt, last_tag,
>                                           tags);
>                                         /* Value set in bt_update_count not
>                                            visible yet */
> bt_update_count(&tags->bitmap_tags, tdepth);
> /* blk_mq_tag_wakeup_all(tags, false); */
>  bt = &tags->bitmap_tags;
>  wake_index = atomic_read(&bt->wake_index);
>  ...
>                                         io_schedule();
> ------------------------------------------------------------------------
>
> This patch adds the missing memory barrier.
>
> I found this issue when I was looking through the linux source code
> for places calling waitqueue_active() before wake_up*(), but without
> preceding memory barriers, after sending a patch to fix a similar
> issue in drivers/tty/n_tty.c (Details about the original issue can be
> found here: https://lkml.org/lkml/2015/9/28/849).
>
> Signed-off-by: Kosuke Tatsukawa <ta...@ab.jp.nec.com>
> ---
>  block/blk-mq-tag.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index ed96474..7a6b6e2 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -75,6 +75,10 @@ void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool include_reserve)
>  	struct blk_mq_bitmap_tags *bt;
>  	int i, wake_index;
>  
> +	/*
> +	 * Make sure all changes prior to this are visible from other CPUs.
> +	 */
> +	smp_mb();
>  	bt = &tags->bitmap_tags;
>  	wake_index = atomic_read(&bt->wake_index);
>  	for (i = 0; i < BT_WAIT_QUEUES; i++) {
>

Thanks, after looking at this, I think this patch is fine. It's not a
super hot path, so it's not worth optimizing this further or looking
into ways to avoid the barrier.

I do wonder if there are archs where atomic_read() is a memory barrier;
in that case we would not need it at all. And perhaps we have some weird
smp_before_bla variant that could be used here instead to improve upon
that case.

--
Jens Axboe
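
For illustration only, and not part of the patch: the race in the figure is the classic store-buffering pattern. The waker writes the new state and then reads the wait queue; the waiter queues itself and then reads the state. A lost wakeup means both reads saw the old value, and a full barrier between the write and the read on each side rules that out. The userspace sketch below models this with C11 atomics and pthreads. The names `cond_set`, `waiter_queued`, and `count_lost_wakeups()` are hypothetical stand-ins, not kernel APIs: `atomic_thread_fence(memory_order_seq_cst)` plays the role of smp_mb() on the waker side, and on the waiter side it stands in for the barrier that prepare_to_wait() gets from set_current_state().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins (not kernel APIs):
 *   cond_set      ~ the value written by bt_update_count()
 *   waiter_queued ~ what waitqueue_active() would observe
 */
static atomic_int cond_set;
static atomic_int waiter_queued;
static int waker_saw_waiter;	/* read by the main thread after join */
static int waiter_saw_cond;	/* read by the main thread after join */

static void *waker(void *arg)
{
	(void)arg;
	/* Publish the new state, as bt_update_count() does. */
	atomic_store_explicit(&cond_set, 1, memory_order_relaxed);
	/* Full barrier: models the smp_mb() added by the patch. */
	atomic_thread_fence(memory_order_seq_cst);
	/* Models the waitqueue_active() test. */
	waker_saw_waiter = atomic_load_explicit(&waiter_queued,
						memory_order_relaxed);
	return NULL;
}

static void *waiter(void *arg)
{
	(void)arg;
	/* Models prepare_to_wait() adding the task to the queue. */
	atomic_store_explicit(&waiter_queued, 1, memory_order_relaxed);
	/* Full barrier: models the one inside set_current_state(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Models the re-check of the condition before sleeping. */
	waiter_saw_cond = atomic_load_explicit(&cond_set,
					       memory_order_relaxed);
	return NULL;
}

/* Runs the two sides concurrently `rounds` times and counts rounds in
 * which neither side saw the other's write (a lost wakeup).
 * Returns -1 if a thread could not be created. */
int count_lost_wakeups(int rounds)
{
	int lost = 0;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		pthread_t a, b;

		atomic_store(&cond_set, 0);
		atomic_store(&waiter_queued, 0);
		if (pthread_create(&a, NULL, waker, NULL) != 0)
			return -1;
		if (pthread_create(&b, NULL, waiter, NULL) != 0) {
			pthread_join(a, NULL);
			return -1;
		}
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		if (!waker_saw_waiter && !waiter_saw_cond)
			lost++;
	}
	return lost;
}
```

With both fences in place, the C11 memory model does not allow a round in which both loads return 0, so the count stays at zero. Dropping the fence in waker() corresponds to the unpatched code: the load of `waiter_queued` may then be satisfied before the store to `cond_set` becomes visible, which is the reordering shown in the figure.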