On Tue, Feb 24, 2026 at 10:38 AM Pranjal Shrivastava <[email protected]> wrote:
> On Thu, Jan 29, 2026 at 09:24:52PM +0000, David Matlack wrote:
> > + /*
> > + * Reset the device and restore it back to its original state before
> > + * handing it to the next kernel.
> > + *
> > + * Eventually both of these should be dropped and the device should be
> > + * kept running with its current state across the Live Update.
> > + */
> > + if (vdev->reset_works)
> > + ret = __pci_reset_function_locked(pdev);
>
> I see the 'Eventually both of these should be dropped' comment,
> which acknowledges that a reset is a v1 crutch. However, I wanted to
> clarify the fallback strategy here.
>
> If vdev->reset_works is false, we skip the reset but still jump into the
> new kernel. For devices that don't support FLR, are we comfortable jumping
> with the device potentially still hot?
That situation is already possible today. Simply use a VFIO device
that does not support reset, close it, and then kexec into a new
kernel. vfio_pci_core_disable() (which runs when the last reference to
the VFIO device file is dropped) also skips the reset, and kexec does
not reset devices.
Note that we still restore the state, which at least will reset the
device's configuration back to a sane default, including disabling bus
mastering.
> > diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> > index e90859956514..9aa1587fea19 100644
> > --- a/include/linux/vfio.h
> > +++ b/include/linux/vfio.h
> > @@ -81,6 +81,34 @@ struct vfio_device {
> > #endif
> > };
> >
> > +struct vfio_device_file {
> > + struct vfio_device *device;
> > + struct vfio_group *group;
> > +
> > + u8 access_granted;
> > + u32 devid; /* only valid when iommufd is valid */
> > + spinlock_t kvm_ref_lock; /* protect kvm field */
> > + struct kvm *kvm;
> > + struct iommufd_ctx *iommufd; /* protected by struct
> > vfio_device_set::lock */
> > +};
> > +
> > +extern const struct file_operations vfio_device_fops;
> > +
>
> There seem to be two extern declarations for vfio_device_fops in both
> vfio_pci_priv.h and include/linux/vfio.h. Could we consolidate these?
Sure I can clean those up.
> > +static inline struct vfio_device_file *to_vfio_device_file(struct file
> > *file)
> > +{
> > + if (file->f_op != &vfio_device_fops)
> > + return NULL;
> > +
> > + return file->private_data;
> > +}
> > +
> > +static inline struct vfio_device *vfio_device_from_file(struct file *file)
> > +{
> > + struct vfio_device_file *df = to_vfio_device_file(file);
> > +
> > + return df ? df->device : NULL;
> > +}
> > +
>
> I'm a little uncomfortable with this part. Why is it necessary to expose
> the internal vfio_device_file structure to drivers? If this is only to
> support vfio_device_from_file(), could we not keep the structure private
> and just export the helper function instead?
Yeah, no problem.
> Exposing internal state into the public API introduces some maintenance
> constraints for e.g. if vfio_main.c ever changes how it tracks
> file-to-device mappings or its internal security state (like
> access_granted), it now has to worry about breaking external drivers.
>
> I believe we expose the struct just to power these static inline helper
> (mainly vfio_device_from_file) ? Instead, could we treat
> `vfio_device_file` as an opaque type in the public header (like struct
> iommu_group) and move the implementation of vfio_device_from_file() into
> vfio_main.c as an exported symbol? This gives drivers the vfio_device
> pointer they need without leaking the core's private internals.
Good points.