On Thu, Jan 12, 2023 at 06:45:55PM +0100, Klaus Jensen wrote: > On Jan 12 09:34, Keith Busch wrote: > > On Thu, Jan 12, 2023 at 02:10:51PM +0100, Klaus Jensen wrote: > > > > > > The pin-based interrupt logic in hw/nvme seems sound enough to me, so I > > > am wondering if there is something going on with the kernel driver (but > > > I certainly do not rule out that hw/nvme is at fault here, since > > > pin-based interrupts has also been a source of several issues in the > > > past). > > > > Does it work if you change the pci_irq_assert() back to pci_irq_pulse()? > > While probably not the "correct" thing to do, it has better results in > > my testing. > > > > A simple s/pci_irq_assert/pci_irq_pulse broke the device. However, > > diff --git i/hw/nvme/ctrl.c w/hw/nvme/ctrl.c > index 03760ddeae8c..0fc46dcb9ec4 100644 > --- i/hw/nvme/ctrl.c > +++ w/hw/nvme/ctrl.c > @@ -477,6 +477,7 @@ static void nvme_irq_check(NvmeCtrl *n) > return; > } > if (~intms & n->irq_status) { > + pci_irq_deassert(&n->parent_obj); > pci_irq_assert(&n->parent_obj); > } else { > pci_irq_deassert(&n->parent_obj); > > > seems to do the trick (pulse is the other way around, assert, then > deassert). > > Probably not the "correct" thing to do, but I'll take it since it seems > to fix it. On a simple boot loop I got the timeout about 1 out of 5. I'm > on ~20 runs now and have not encountered it. > > I'll see if I can set up a mips rootfs and test that. Guenter, what MIPS > machine/board(s) are you testing?
Could you try the below? --- diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index 2c85de4700..521c3c80c1 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -558,6 +558,20 @@ static void nvme_irq_deassert(NvmeCtrl *n, NvmeCQueue *cq) } } +static void nvme_irq_pulse(NvmeCtrl *n, NvmeCQueue *cq) +{ + if (!cq->irq_enabled) { + return; + } + + if (msix_enabled(&(n->parent_obj))) { + msix_notify(&(n->parent_obj), cq->vector); + return; + } + + pci_irq_pulse(&n->parent_obj); +} + static void nvme_req_clear(NvmeRequest *req) { req->ns = NULL; @@ -6917,6 +6931,12 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val) } nvme_irq_deassert(n, cq); + } else { + /* + * Retrigger the irq just to make sure the host has no excuse for + * not knowing there's more work to complete on this CQ. + */ + nvme_irq_pulse(n, cq); } } else { /* Submission queue doorbell write */ --