On 9/27/19 10:19 AM, Yufen Yu wrote:
> We got a NULL pointer dereference BUG in blk_mq_rq_timed_out(),
> as follows:
>
> [ 108.825472] BUG: kernel NULL pointer dereference, address: 0000000000000040
> [ 108.827059] PGD 0 P4D 0
> [ 108.827313] Oops: 0000 [#1] SMP PTI
> [ 108.827657] CPU: 6 PID: 198 Comm: kworker/6:1H Not tainted 5.3.0-rc8+ #431
> [ 108.829503] Workqueue: kblockd blk_mq_timeout_work
> [ 108.829913] RIP: 0010:blk_mq_check_expired+0x258/0x330
> [ 108.838191] Call Trace:
> [ 108.838406] bt_iter+0x74/0x80
> [ 108.838665] blk_mq_queue_tag_busy_iter+0x204/0x450
> [ 108.839074] ? __switch_to_asm+0x34/0x70
> [ 108.839405] ? blk_mq_stop_hw_queue+0x40/0x40
> [ 108.839823] ? blk_mq_stop_hw_queue+0x40/0x40
> [ 108.840273] ? syscall_return_via_sysret+0xf/0x7f
> [ 108.840732] blk_mq_timeout_work+0x74/0x200
> [ 108.841151] process_one_work+0x297/0x680
> [ 108.841550] worker_thread+0x29c/0x6f0
> [ 108.841926] ? rescuer_thread+0x580/0x580
> [ 108.842344] kthread+0x16a/0x1a0
> [ 108.842666] ? kthread_flush_work+0x170/0x170
> [ 108.843100] ret_from_fork+0x35/0x40
>
> The bug is caused by a race between the timeout handler and completion
> of the flush request.
>
> When the timeout handler blk_mq_rq_timed_out() tries to read
> 'req->q->mq_ops', 'req' may already have completed and been
> reinitialized by the next flush request, which calls blk_rq_init()
> to zero out 'req'.
>
> After commit 12f5b93145 ("blk-mq: Remove generation seqeunce"),
> the lifetime of normal requests is protected by a refcount. A request
> can only be freed once 'rq->ref' drops to zero. Thus, these requests
> cannot be reused before the timeout handler finishes.
>
> However, the flush request defines .end_io, and rq->end_io() is still
> called even if 'rq->ref' has not dropped to zero. After that, 'flush_rq'
> can be reused by the next flush request, resulting in the NULL
> pointer dereference above.
>
> We fix this problem by covering the flush request with 'rq->ref'.
> If the refcount is not zero, flush_end_io() returns and waits for the
> last holder to call it again. To record the request status, we add a
> new entry 'rq_status', which is used in flush_end_io().
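
For readers following along, the refcount-guarded completion described
above can be sketched in user space roughly as follows. This is only an
illustrative model, not the patch itself: struct model_rq,
model_put_ref(), model_flush_end_io() and the final_status field are
hypothetical names, C11 atomics stand in for the kernel refcount, and
the locking a real implementation would need around rq_status is left
out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the flush request (not the kernel struct). */
struct model_rq {
    atomic_int ref;     /* models 'rq->ref' */
    int rq_status;      /* models the new 'rq_status' entry; 0 means OK */
    int final_status;   /* status reported when completion really runs */
    bool completed;     /* true once the completion work has run */
};

/* Drop one reference; returns true only for the last holder. */
static bool model_put_ref(struct model_rq *rq)
{
    int old = atomic_fetch_sub(&rq->ref, 1);

    assert(old > 0);    /* dropping a reference that was never taken is a bug */
    return old == 1;
}

/*
 * Models the guarded flush_end_io(): if another holder (for example the
 * timeout handler) still has a reference, save the status and return
 * without completing. The last holder calls this again and completes,
 * preferring any error status saved earlier.
 */
static bool model_flush_end_io(struct model_rq *rq, int error)
{
    if (!model_put_ref(rq)) {
        rq->rq_status = error;  /* remembered for the last holder */
        return false;
    }

    if (rq->rq_status != 0)
        error = rq->rq_status;

    rq->final_status = error;
    rq->completed = true;       /* only now may the request be reused */
    return true;
}
```

With two references held, the first call to model_flush_end_io() only
records the status, and the second call performs the completion, so the
request is never reused while the other holder is still looking at it.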
Thanks, applied.
--
Jens Axboe