On Fri, 27 Feb 2026 09:19:28 -0800
David Matlack <[email protected]> wrote:

> On Fri, Feb 27, 2026 at 8:32 AM Alex Williamson <[email protected]> wrote:
> >
> > On Thu, 26 Feb 2026 00:28:28 +0000
> > David Matlack <[email protected]> wrote:  
> > > > > +static int pci_flb_preserve(struct liveupdate_flb_op_args *args)
> > > > > +{
> > > > > + struct pci_dev *dev = NULL;
> > > > > + int max_nr_devices = 0;
> > > > > + struct pci_ser *ser;
> > > > > + unsigned long size;
> > > > > +
> > > > > + for_each_pci_dev(dev)
> > > > > +         max_nr_devices++;  
> > > >
> > > > How is this protected against hotplug?  
> > >
> > > Pranjal raised this as well. Here was my reply:
> > >
> > > .  Yes, it's possible to run out of space to preserve devices if devices
> > > .  are hot-plugged and then preserved. But I think it's better to defer
> > > .  handling that use-case until it actually arises (unless you see an
> > > .  obvious, simple solution). So far I am not seeing preserving
> > > .  hot-plugged devices across Live Update as a high-priority use-case
> > > .  to support.
> > >
> > > I am going to add a comment here in the next revision to clarify that.
> > > I will also add a comment explaining why this code doesn't bother to
> > > account for VFs created after this call (preserving VFs is explicitly
> > > disallowed in this patch since they require additional support).  
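
For the hotplug question above, one option (if it ever needs handling) is to
hold the rescan/remove lock across both the counting pass and the fill pass,
so the device list cannot change in between; in the kernel that would be
pci_lock_rescan_remove()/pci_unlock_rescan_remove() bracketing both loops.
A compilable userspace sketch of that count-and-fill-under-one-lock pattern,
with a plain mutex standing in for the PCI rescan lock and all names
(fake_dev, devlist, preserve_all) purely illustrative, not kernel APIs:

```c
#include <pthread.h>
#include <stdlib.h>

struct fake_dev {
	int id;
	struct fake_dev *next;
};

/* Stand-in for pci_lock_rescan_remove()/pci_unlock_rescan_remove(). */
static pthread_mutex_t rescan_lock = PTHREAD_MUTEX_INITIALIZER;
static struct fake_dev *devlist;

/*
 * Count devices and record them without dropping the lock in between,
 * so the count sized for the buffer still matches when it is filled.
 * Returns the number of devices, or -1 on allocation failure.
 */
static int preserve_all(int **out_ids)
{
	struct fake_dev *d;
	int n = 0, i = 0;

	pthread_mutex_lock(&rescan_lock);	/* no hotplug past this point */
	for (d = devlist; d; d = d->next)
		n++;
	*out_ids = n ? malloc(sizeof(int) * n) : NULL;
	if (n && !*out_ids) {
		pthread_mutex_unlock(&rescan_lock);
		return -1;
	}
	for (d = devlist; d; d = d->next)
		(*out_ids)[i++] = d->id;	/* count still valid: same lock */
	pthread_mutex_unlock(&rescan_lock);
	return n;
}
```

Whether the extra lock hold time is worth it for a case nobody hits yet is
exactly the trade-off being discussed; the comment David proposes documents
the simpler choice.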
> >
> > TBH, without SR-IOV support and some examples of in-kernel PF
> > preservation in support of vfio-pci VFs, it seems like this only
> > supports a very niche use case.  
> 
> The intent is to start by supporting a simple use-case and expand to
> more complex scenarios over time, including preserving VFs. Full GPU
> passthrough is common at cloud providers, so even non-VF preservation
> support is valuable.
> 
> > I expect the majority of vfio-pci
> > devices are VFs and I don't think we want to present a solution where
> > the requirement is to move the PF driver to userspace.  
> 
> JasonG recommended the upstream support for VF preservation be limited
> to cases where the PF is also bound to VFIO:
> 
>   https://lore.kernel.org/lkml/[email protected]/
> 
> Within Google we have a way to support in-kernel PF drivers but we are
> trying to focus on simpler use-cases first upstream.
> 
> > It's not clear,
> > for example, how we can have vfio-pci variant drivers relying on
> > in-kernel channels to PF drivers to support migration in this model.  
> 
> Agree this still needs to be fleshed out and designed. I think the
> roadmap will be something like:
> 
>  1. Get non-VF preservation working end-to-end (device fully preserved
> and doing DMA continuously during Live Update).
>  2. Extend to support VF preservation where the PF is also bound to vfio-pci.
>  3. (Maybe) Extend to support in-kernel PF drivers.
> 
> This series is the first step of #1. I have line of sight to how #2
> could work since it's all VFIO.

Without #3, does this become a mainstream feature?

There's an obvious knee-jerk reaction, evident at LPC, that moving PF
drivers into userspace is a means to circumvent the GPL, even if the real
reason is "in-kernel is hard".

Related to that, there's also not much difference between a userspace
driver and an out-of-tree driver when it comes to adding in-kernel code
for their specific support requirements.  Therefore, unless migration is
entirely accomplished via a shared dmabuf between the PF and VF,
orchestrated through userspace, I'm not sure how we get to migration,
which makes KHO vs migration a binary choice.  I have trouble seeing how
that's a viable intermediate step.  Thanks,

Alex
