Hi all

Our customer hit a panic triggered by the BUG_ON in blk_finish_request.
From the dmesg log, the BUG_ON was triggered after command abort had
occurred many times.

There is a race condition in the following scenario.

    cpu A                                    cpu B
    kworker                                  interrupt

    scmd_eh_abort_handler()
      -> scsi_try_to_abort_cmd()
        -> qla2xxx_eh_abort()
          -> qla2x00_eh_wait_on_command()    qla2x00_status_entry()
                                               -> qla2x00_sp_compl()
                                                 -> qla2x00_sp_free_dma()
      -> scsi_queue_insert()
        -> __scsi_queue_insert()
          -> blk_requeue_request()
            -> blk_clear_rq_complete()
                                                 -> scsi_done
                                                   -> blk_complete_request
                                                     -> blk_mark_rq_complete
            -> elv_requeue_request()                 -> __blk_complete_request()
              -> __elv_add_request()
              // req will be queued here
                                             BLK_SOFTIRQ
                                             scsi_softirq_done()
                                               -> scsi_finish_command()
                                                 -> scsi_io_completion()
                                                   -> scsi_end_request()
                                                     -> blk_finish_request()
                                                        // BUG_ON(blk_queued_rq(req)) !!!

The issue is not triggered most of the time, because the request has
already been marked as complete by the timeout path, so the scsi_done
from qla2x00_sp_compl does nothing. But in the scenario above, if the
complete state has been cleared by blk_requeue_request, the request ends
up both requeued and completed, and then BUG_ON(blk_queued_rq(req)) in
blk_finish_request fires.

Is there any solution for this on the qla2xxx driver side?

Thanks
Jianchao
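
P.S. To make the interleaving easier to reason about, here is a small
user-space sketch of it. It is only a model under my own assumptions, not
kernel code: struct model_req and every model_* function are hypothetical
names, the "complete" state is modelled as one atomic boolean, and the two
CPUs are replayed in a fixed order on one thread rather than actually raced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request; NOT the kernel's struct request. */
struct model_req {
    atomic_bool complete;  /* models the "marked as complete" state */
    bool queued;           /* models what blk_queued_rq(req) would report */
};

/* Models blk_mark_rq_complete(): true only if this caller set the state. */
static bool model_try_claim(struct model_req *rq)
{
    return !atomic_exchange(&rq->complete, true);
}

/* Models blk_clear_rq_complete() on the requeue path. */
static void model_clear_complete(struct model_req *rq)
{
    atomic_store(&rq->complete, false);
}

/*
 * Replays one ordering of the two paths. Returns true if the request ends
 * up both queued and on the completion path, which is the state where
 * blk_finish_request() would hit BUG_ON(blk_queued_rq(req)).
 */
static bool model_replay(bool clear_before_done)
{
    struct model_req rq;
    bool completion_runs;

    atomic_init(&rq.complete, false);
    rq.queued = false;

    /* Timeout path marks the request complete before the abort starts. */
    bool claimed_by_timeout = model_try_claim(&rq);
    assert(claimed_by_timeout);

    if (clear_before_done) {
        model_clear_complete(&rq);               /* cpu A: requeue clears state */
        completion_runs = model_try_claim(&rq);  /* cpu B: scsi_done wins */
    } else {
        completion_runs = model_try_claim(&rq);  /* cpu B: scsi_done does nothing */
        model_clear_complete(&rq);               /* cpu A: requeue clears state */
    }

    rq.queued = true;  /* cpu A: __elv_add_request() queues the request */

    return completion_runs && rq.queued;
}
```

Calling model_replay(true) replays the ordering from the diagram above and
reports the bad state, while model_replay(false) replays the common ordering
where scsi_done arrives before the complete state is cleared and therefore
does nothing.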