------- Comment From dnban...@us.ibm.com 2018-04-23 23:49 EDT-------
I decided to take a closer look at the qla2xxx driver's free and delete paths, since my gut feeling was that others must have run into these kinds of issues too. Looking a little deeper, I found these:
(Note: this is from a quick perusal.)

########################################
commit d8630bb95f46ea118dede63bd75533faa64f9612
Author: Quinn Tran <quinn.t...@cavium.com>
Date:   Thu Dec 28 12:33:43 2017 -0800

    scsi: qla2xxx: Serialize session deletion by using work_lock

    for session deletion, replace sess_lock with work_lock.  Under certain
    case sess_lock is not feasiable to acquire.  The lock is needed
    temporarily to make sure a single call to schedule of the work element.

########################################
commit 9cd883f07a54e5301d51e259acd250bb035996be
<does as part of its work>

+	/* use cancel to push work element through before re-queue */
+	cancel_work_sync(&sess->del_work);
 	INIT_WORK(&sess->del_work, qla24xx_delete_sess_fn);
 	queue_work(sess->vha->hw->wq, &sess->del_work);

########################################
commit 1ae634eb28533b82f9777a47c1ade44cb8c0182b
Author: Quinn Tran <quinn.t...@cavium.com>
Date:   Thu Dec 28 12:33:44 2017 -0800

    scsi: qla2xxx: Serialize session free in qlt_free_session_done

    Add free_pending flag to serialize queueing of free_work element onto
    the work queue

    Signed-off-by: Quinn Tran <quinn.t...@cavium.com>
    Signed-off-by: Himanshu Madhani <himanshu.madh...@cavium.com>
    Signed-off-by: Martin K. Petersen <martin.peter...@oracle.com>

diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index 72b452d..0d3c3f6 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -1105,6 +1105,7 @@ static void qlt_free_session_done(struct work_struct *work)
 			sess->plogi_link[QLT_PLOGI_LINK_SAME_WWN] = NULL;
 		}
 	}
+
 	spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);

 	ql_dbg(ql_dbg_tgt_mgt, vha, 0xf001,
@@ -1118,6 +1119,9 @@ static void qlt_free_session_done(struct work_struct *work)
 	wake_up_all(&vha->fcport_waitQ);

 	base_vha = pci_get_drvdata(ha->pdev);
+
+	sess->free_pending = 0;
+
 	if (test_bit(PFLG_DRIVER_REMOVING, &base_vha->pci_flags))
 		return;

@@ -1140,11 +1144,20 @@ static void qlt_free_session_done(struct work_struct *work)
 void qlt_unreg_sess(struct fc_port *sess)
 {
 	struct scsi_qla_host *vha = sess->vha;
+	unsigned long flags;

 	ql_dbg(ql_dbg_disc, sess->vha, 0x210a,
 	    "%s sess %p for deletion %8phC\n",
 	    __func__, sess, sess->port_name);

+	spin_lock_irqsave(&sess->vha->work_lock, flags);
+	if (sess->free_pending) {
+		spin_unlock_irqrestore(&sess->vha->work_lock, flags);
+		return;
+	}
+	sess->free_pending = 1;
+	spin_unlock_irqrestore(&sess->vha->work_lock, flags);
+
 	if (sess->se_sess)
 		vha->hw->tgt.tgt_ops->clear_nacl_from_fcport_map(sess);
########################################

The last one is clearly a more refined and better-thought-out version of the free_work changes. The code given in #132 was only an attempt to move the testing/debug forward and validate the cause analysis for the crashes.

Going forward, the relevant set of these changes needs to be selected with extreme diligence and pulled into the kernel (by Canonical?). I only did a very quick walk-through, so the changes need to be picked carefully.
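For anyone following along, here is a minimal userspace sketch of the guard that the last commit adds to qlt_unreg_sess(): test and set a pending flag under a lock, so that only one caller gets to queue the free work. This is not driver code. struct session, session_init(), session_unreg() and session_free_done() are hypothetical names, a pthread mutex stands in for vha->work_lock, and a counter stands in for queueing free_work. Unlike the patch, the sketch also takes the lock when clearing the flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the driver's session (struct fc_port). */
struct session {
	pthread_mutex_t work_lock;	/* plays the role of vha->work_lock */
	bool free_pending;		/* true while a free is queued or running */
	int frees_queued;		/* stand-in for "free_work was queued" */
};

void session_init(struct session *sess)
{
	pthread_mutex_init(&sess->work_lock, NULL);
	sess->free_pending = false;
	sess->frees_queued = 0;
}

/*
 * Same shape as the patched qlt_unreg_sess(): if a free is already
 * pending, return early; otherwise mark it pending and queue the work.
 */
void session_unreg(struct session *sess)
{
	pthread_mutex_lock(&sess->work_lock);
	if (sess->free_pending) {
		pthread_mutex_unlock(&sess->work_lock);
		return;
	}
	sess->free_pending = true;
	sess->frees_queued++;	/* assumption: counted under the lock for simplicity */
	pthread_mutex_unlock(&sess->work_lock);
}

/*
 * Same role as the hunk in qlt_free_session_done(): once the free work
 * has run, clear the flag so that a later unregister can queue again.
 */
void session_free_done(struct session *sess)
{
	pthread_mutex_lock(&sess->work_lock);
	assert(sess->free_pending);	/* the handler only runs after a claim */
	sess->free_pending = false;
	pthread_mutex_unlock(&sess->work_lock);
}
```

With this shape, two back-to-back calls to session_unreg() queue the work once, and a further call only queues it again after session_free_done() has run.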
https://bugs.launchpad.net/bugs/1762844

Title:
  ISST-LTE:KVM:Ubuntu1804:BostonLC:boslcp3: Host crashed & enters into xmon after moving to 4.15.0-15.16 kernel