On Tue, Jul 22, 2025 at 10:31:45PM +0200, Nam Cao wrote:
> > On Tue, Jul 22, 2025 at 02:05:55PM +0530, Gautam Menghani wrote:
> > > I am seeing a boot failure after applying this series on top of the pci
> > > tree [1]. Note that this error was seen on a system where I have a
> > > dedicated NVME. Systems without dedicated disk boot fine
> >
> > Thanks for the report.
> >
> > Using QEMU, I cannot reproduce the exact same problem, but I do observe a
> > different one. They are likely from the same root cause.
> >
> > Let me investigate..
>
> So the problem is due to the pair msi_prepare() and msi_post_free(). Before
> this series, msi_prepare() is called whenever interrupt is allocated.
> However, after this series, msi_prepare() is called only at domain
> creation.
>
> For most device drivers, this difference does not have any impact. However,
> the NVME driver is slightly "special", it does this:
>
>     1. Allocate interrupts
>     2. Free interrupts
>     3. Allocate interrupts again
>
> Before this series:
>
>     (1) calls msi_prepare()
>     (2) calls msi_post_free()
>     (3) calls msi_prepare() again
>
> and it happens to work. However, after this series:
>
>     (1) calls msi_prepare()
>     (2) calls msi_post_free()
>     (3) does not call either
>
> and we are in trouble.
>
> A simple solution is using msi_teardown() instead, which is called at
> domain destruction. It makes more sense this way as well, because
> msi_teardown() is supposed to reverse what msi_prepare() does.
>
> This would also remove the only user of msi_post_free(), allowing us to
> delete that callback.
>
> The below patch fixes the problem that I saw with QEMU. Does it fix the
> problem on your side as well?
>
> Best regards,
> Nam
>
>
> diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
> index 70be6e24427d..7da142dd5baa 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -441,12 +441,12 @@ static int pseries_msi_ops_prepare(struct irq_domain *domain, struct device *dev
>   * RTAS can not disable one MSI at a time. It's all or nothing. Do it
>   * at the end after all IRQs have been freed.
>   */
> -static void pseries_msi_post_free(struct irq_domain *domain, struct device *dev)
> +static void pseries_msi_ops_teardown(struct irq_domain *domain, msi_alloc_info_t *arg)
>  {
> -	if (WARN_ON_ONCE(!dev_is_pci(dev)))
> -		return;
> +	struct msi_desc *desc = arg->desc;
> +	struct pci_dev *pdev = msi_desc_to_pci_dev(desc);
>
> -	rtas_disable_msi(to_pci_dev(dev));
> +	rtas_disable_msi(pdev);
>  }
>
>  static void pseries_msi_shutdown(struct irq_data *d)
> @@ -482,7 +482,7 @@ static bool pseries_init_dev_msi_info(struct device *dev, struct irq_domain *dom
>  	chip->irq_write_msi_msg = pseries_msi_write_msg;
>
>  	info->ops->msi_prepare = pseries_msi_ops_prepare;
> -	info->ops->msi_post_free = pseries_msi_post_free;
> +	info->ops->msi_teardown = pseries_msi_ops_teardown;
>
>  	return true;
>  }

Hi Nam,

The boot issue is fixed with this diff. I'll do some more testing on this
series and will post more updates.

Thanks,
Gautam
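
For reference, the ordering problem described in the quoted mail can be
modelled in plain userspace C. This is only an illustrative sketch, not
kernel code: every model_* name below is hypothetical, and the single
"enabled" flag is an assumed stand-in for whatever state msi_prepare()
sets up and rtas_disable_msi() undoes. It assumes, as the mail states,
that the prepare step now runs only at domain creation.

```c
#include <stdbool.h>

/* Hypothetical model of one MSI domain; not a kernel structure. */
struct model_domain {
	bool enabled;          /* set by the prepare step, cleared on disable */
	bool disable_on_free;  /* true: post_free-style hook, false: teardown-style */
};

/* After the series, the prepare step runs once, at domain creation. */
static void model_domain_create(struct model_domain *d, bool disable_on_free)
{
	d->enabled = true;
	d->disable_on_free = disable_on_free;
}

/* Allocation no longer re-runs prepare; it only works while enabled. */
static bool model_alloc(const struct model_domain *d)
{
	return d->enabled;
}

/* A post_free-style hook disables everything on each free. */
static void model_free(struct model_domain *d)
{
	if (d->disable_on_free)
		d->enabled = false;
}

/* A teardown-style hook disables only at domain destruction. */
static void model_domain_destroy(struct model_domain *d)
{
	d->enabled = false;
}

/* NVMe-like sequence: allocate, free, allocate again. */
static bool model_realloc_works(bool disable_on_free)
{
	struct model_domain d;
	bool ok;

	model_domain_create(&d, disable_on_free);
	ok = model_alloc(&d);        /* (1) first allocation */
	model_free(&d);              /* (2) free */
	ok = ok && model_alloc(&d);  /* (3) second allocation */
	model_domain_destroy(&d);
	return ok;
}
```

In this model, model_realloc_works(true) reports failure for the second
allocation, while model_realloc_works(false) succeeds, which mirrors why
moving the disable step from the free path to domain teardown pairs it
correctly with the prepare step.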