On Tue, Feb 24, 2026 at 07:02:56PM +0000, Pranjal Shrivastava wrote:
> On Tue, Feb 24, 2026 at 09:33:28AM -0800, David Matlack wrote:
> > On Tue, Feb 24, 2026 at 1:18 AM Pranjal Shrivastava <[email protected]> wrote:
> > > On Thu, Jan 29, 2026 at 09:24:49PM +0000, David Matlack wrote:
> > > > + * Copyright (c) 2025, Google LLC.
> > >
> > > Nit: Should these be 2026 now?
> > 
> > Yes! Thanks for catching that.
> > 
> > > > +int pci_liveupdate_outgoing_preserve(struct pci_dev *dev)
> > > > +{
> > > > +     struct pci_dev_ser new = INIT_PCI_DEV_SER(dev);
> > > > +     struct pci_ser *ser;
> > > > +     int i, ret;
> > > > +
> > > > +     /* Preserving VFs is not supported yet. */
> > > > +     if (dev->is_virtfn)
> > > > +             return -EINVAL;
> > > > +
> > > > +     guard(mutex)(&pci_flb_outgoing_lock);
> > > > +
> > > > +     if (dev->liveupdate_outgoing)
> > > > +             return -EBUSY;
> > > > +
> > > > +     ret = liveupdate_flb_get_outgoing(&pci_liveupdate_flb, (void **)&ser);
> > > > +     if (ret)
> > > > +             return ret;
> > > > +
> > > > +     if (ser->nr_devices == ser->max_nr_devices)
> > > > +             return -E2BIG;
> > >
> > > I'm wondering how (or if) this handles hot-plugged devices?
> > > max_nr_devices is calculated based on for_each_pci_dev at the time of
> > > the first preservation. What happens if a device is hot-plugged after
> > > the first device is preserved but before the second one is? Does
> > > max_nr_devices become stale? Since ser->max_nr_devices will not reflect
> > > the actual possible device count, this could lead to an unnecessary
> > > -E2BIG failure.
> > 
> > Yes, it's possible to run out of space to preserve devices if devices
> > are hot-plugged and then preserved. But I think it's better to defer
> > handling such a use-case until it actually exists (unless you see an
> > obvious, simple solution). So far I am not seeing preserving
> > hot-plugged devices across Live Update as a high-priority use-case to
> > support.
> > 
> 
> Ack. If we aren't supporting preservation for hot-plug at this point,
> let's mention that somewhere. Maybe just a little comment or the kdoc?
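Something like the following would do, folded next to the existing VF check (the comment wording is just a suggestion, not from the patch):

```c
	/*
	 * Preserving VFs is not supported yet. Hot-plugged devices may
	 * also fail to fit, since max_nr_devices is sized from
	 * for_each_pci_dev() at the time of the first preservation;
	 * devices added after that can hit -E2BIG.
	 */
	if (dev->is_virtfn)
		return -EINVAL;
```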
> 
> > > > +u32 pci_liveupdate_incoming_nr_devices(void)
> > > > +{
> > > > +     struct pci_ser *ser;
> > > > +     int ret;
> > > > +
> > > > +     ret = liveupdate_flb_get_incoming(&pci_liveupdate_flb, (void **)&ser);
> > > > +     if (ret)
> > > > +             return 0;
> > >
> > > Masking this error looks troubling: in the following patch, I see that
> > > a return value of 0 is treated as a fresh boot, but the IOMMU mappings
> > > for that BDF might still be preserved. That could lead to DMA aliasing
> > > issues, without a hint of what happened since we don't even log
> > > anything.
> > 
> > All of the non-0 errors indicate there are 0 incoming devices at the
> > time of the call, so I think returning 0 is appropriate.
> > 
> >  - EOPNOTSUPP: Live Update is not enabled.
> >  - ENODATA: Live Update is finished (all incoming devices have been restored).
> >  - ENOENT: No PCI data was preserved across the Live Update.
> > 

The flb_retrieve_one path seems to call:

 err = flb->ops->retrieve(&args);

which could return anything, honestly. Since luo_core doesn't scream
about it, maybe the caller should?

Thanks,
Praan

> > None of these cover the case where an IOMMU mapping for BDF X is
> > preserved, but device X is not preserved. This is a case we should
> > handle in some way... but here is not that place.
> > 
> > >
> > > Maybe we could have something like the following:
> > >
> > > int pci_liveupdate_incoming_nr_devices(void)
> > > {
> > >         struct pci_ser *ser;
> > >         int ret;
> > >
> > >         ret = liveupdate_flb_get_incoming(&pci_liveupdate_flb, (void **)&ser);
> > >         if (ret) {
> > >                 if (ret != -ENOENT)
> > >                         pr_warn("PCI: Failed to retrieve preservation list: %d\n", ret);
> > 
> > This would cause this warning to get printed if Live Update was
> > disabled, or if no PCI devices were preserved. But both of those are
> > not error scenarios.
> > 
> 
> I agree, the snippet was just an example. What I'm trying to say here
> is: what if the return value is -ENOMEM or -ENODATA? The existing code
> will treat it as a fresh boot because it believes there are no incoming
> devices. However, since this was an incoming device which failed to be
> retrieved, there's a chance that its IOMMU mapping was preserved too.
> By returning 0, the PCI core will feel free to rebalance bus numbers or
> reassign BARs. For instance, if the IOMMU already inherited mappings for
> BDF 02:00.0, but the PCI core (due to this masked error) reassigns a
> different device to that BDF, we face DMA aliasing or IOMMU faults.
> Am I missing some context here?
> 
> > > > +void pci_liveupdate_setup_device(struct pci_dev *dev)
> > > > +{
> > > > +     struct pci_ser *ser;
> > > > +     int ret;
> > > > +
> > > > +     ret = liveupdate_flb_get_incoming(&pci_liveupdate_flb, (void **)&ser);
> > > > +     if (ret)
> > > > +             return;
> > >
> > > We should log something here either at info / debug level since the
> > > error isn't bubbled up and the luo_core doesn't scream about it either.
> > 
> > Any error from liveupdate_flb_get_incoming() simply means there are no
> > incoming devices. So I don't think there's any error to report in
> > dmesg.
> > 
> > > > +     dev->liveupdate_incoming = !!pci_ser_find(ser, dev);
> > >
> > > This feels a little hacky, shall we go for something like:
> > >
> > > dev->liveupdate_incoming = (pci_ser_find(ser, dev) != NULL); ?
> > 
> > In my experience in the kernel (mostly from KVM), explicit comparison
> > to NULL is less preferred than treating a pointer as a boolean. But
> > I'm ok with following whatever the locally preferred style is for this
> > kind of check.
> > 
> 
> No strong feelings there, I see both being used in drivers/pci.
> 
> > > > @@ -582,6 +583,10 @@ struct pci_dev {
> > > >       u8              tph_mode;       /* TPH mode */
> > > >       u8              tph_req_type;   /* TPH requester type */
> > > >  #endif
> > > > +#ifdef CONFIG_LIVEUPDATE
> > > > +     unsigned int    liveupdate_incoming:1;  /* Preserved by previous 
> > > > kernel */
> > > > +     unsigned int    liveupdate_outgoing:1;  /* Preserved for next 
> > > > kernel */
> > > > +#endif
> > > >  };
> > >
> > > This would start another anon bitfield container, should we move this
> > > above within the existing bitfield? If we've run pahole and found this
> > > to be better, then this should be fine.
> > 
> > Yeah I simply appended these new fields to the very end of the struct.
> > If we care about optimizing the packing of struct pci_dev I can find a
> > better place to put it.
> 
> If you have pahole handy, it would be great to see if these can slide
> into an existing hole. If not, no big deal for v3; we can keep it as is.
> 
> Thanks,
> Praan
