On 12/08/14 17:49, Jens Axboe wrote:
> On 12/08/2014 07:55 AM, Bart Van Assche wrote:
>> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
>> index 67ab88b..e88af88 100644
>> --- a/block/blk-mq-tag.c
>> +++ b/block/blk-mq-tag.c
>> @@ -256,6 +256,8 @@ static int bt_get(struct blk_mq_alloc_data *data,
>>  			break;
>>  		}
>>
>> +		blk_mq_run_hw_queue(hctx, false);
>> +
>>  		blk_mq_put_ctx(data->ctx);
>>
>>  		io_schedule();
>
> Ah yes, that could be an issue for some cases, we do need to run the
> queue there. For a tag map shared across hardware queues, we might need
> to run more than just the current queue, however. For now we can safely
> assume that we allocate fairly, so it should not be an issue.
>
> It might be worth experimenting with doing a __bt_get() after the queue
> run before going to sleep, in case the queue run found completions as well.

Unless anyone objects, I will start testing the following patch:

[PATCH] blk-mq: Fix bt_get() hang

Avoid that bt_get() can hang if there are fewer hardware queues than
CPU threads. The symptoms of the hang were as follows:
* All tags allocated for a particular hardware queue.
* (nr_tags) pending commands for that hardware queue.
* No pending commands for the software queues associated with
  that hardware queue.

The call stack that corresponds to the hang is as follows:

  io_schedule+0x9c/0x130
  bt_get+0xef/0x180
  blk_mq_get_tag+0x9f/0xd0
  __blk_mq_alloc_request+0x16/0x1f0
  blk_mq_map_request+0x123/0x130
  blk_mq_make_request+0x69/0x280
  generic_make_request+0xc0/0x110
  submit_bio+0x64/0x130
  do_blockdev_direct_IO+0x1dc8/0x2da0
  __blockdev_direct_IO+0x47/0x50
  blkdev_direct_IO+0x49/0x50
  generic_file_read_iter+0x546/0x610
  blkdev_read_iter+0x32/0x40
  aio_run_iocb+0x1f8/0x400
  do_io_submit+0x121/0x490
  SyS_io_submit+0xb/0x10
  system_call_fastpath+0x12/0x17

---
 block/blk-mq-tag.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index c22491e..14ab120 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -256,6 +256,12 @@ static int bt_get(struct blk_mq_alloc_data *data,
 		if (tag != -1)
 			break;
 
+		blk_mq_run_hw_queue(hctx, false);
+
+		tag = __bt_get(hctx, bt, last_tag);
+		if (tag != -1)
+			break;
+
 		blk_mq_put_ctx(data->ctx);
 
 		io_schedule();
--
2.1.2
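
For anyone who wants to look at the failure mode in isolation, below is a rough user-space model of the idea, not kernel code. Every name in it (toy_hwq, toy_try_get_tag(), toy_run_hw_queue(), toy_get_tag()) is hypothetical and only stands in for the corresponding blk-mq concept. It also assumes that running the queue completes every queued request instantly, which is a simplification of what real hardware does.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_TAGS 4

/* Hypothetical toy model; none of these names exist in the kernel. */
enum toy_tag_state {
	TOY_FREE,	/* tag is available */
	TOY_QUEUED,	/* a request owns the tag but was never dispatched */
};

struct toy_hwq {
	enum toy_tag_state state[TOY_NR_TAGS];
};

/* Non-blocking allocation attempt, playing the role of __bt_get(). */
static int toy_try_get_tag(struct toy_hwq *q)
{
	for (int i = 0; i < TOY_NR_TAGS; i++) {
		if (q->state[i] == TOY_FREE) {
			q->state[i] = TOY_QUEUED;
			return i;
		}
	}
	return -1;
}

/*
 * Stand-in for running the hardware queue. Assumption of this model:
 * dispatch completes at once, so every queued request frees its tag.
 */
static void toy_run_hw_queue(struct toy_hwq *q)
{
	for (int i = 0; i < TOY_NR_TAGS; i++) {
		if (q->state[i] == TOY_QUEUED)
			q->state[i] = TOY_FREE;
	}
}

/*
 * Returns a tag, or -1 at the point where the real code would sleep.
 * With kick_before_sleep set, the queue is run and the allocation is
 * retried once before giving up, mirroring the order used in the patch.
 */
static int toy_get_tag(struct toy_hwq *q, bool kick_before_sleep)
{
	int tag = toy_try_get_tag(q);

	if (tag != -1 || !kick_before_sleep)
		return tag;

	toy_run_hw_queue(q);
	return toy_try_get_tag(q);
}
```

With all TOY_NR_TAGS tags in the TOY_QUEUED state, toy_get_tag(&q, false) returns -1, which is where a caller would sleep with nobody left to wake it up. toy_get_tag(&q, true) runs the queue first and then gets a tag on the retry.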