On Mon, Nov 23, 2020 at 11:14:38AM +0800, Shenming Lu wrote:
> On 2020/11/21 6:01, Alex Williamson wrote:
> > On Fri, 20 Nov 2020 22:05:49 +0800
> > Shenming Lu <lushenm...@huawei.com> wrote:
> >
> >> On 2020/11/20 1:41, Alex Williamson wrote:
> >>> On Thu, 19 Nov 2020 14:13:24 +0530
> >>> Kirti Wankhede <kwankh...@nvidia.com> wrote:
> >>>
> >>>> On 11/14/2020 2:47 PM, Shenming Lu wrote:
> >>>>> When running VFIO migration, I found that the restoring of VFIO PCI device’s
> >>>>> config space is before VGIC on ARM64 target. But generally, interrupt controllers
> >>>>> need to be restored before PCI devices.
> >>>>
> >>>> Is there any other way by which VGIC can be restored before PCI device?
> >>
> >> As far as I know, it seems to have to depend on priorities in the
> >> non-iterable process.
> >>
> >>>>
> >>>>> Besides, if a VFIO PCI device is
> >>>>> configured to have directly-injected MSIs (VLPIs), the restoring of its config
> >>>>> space will trigger the configuring of these VLPIs (in kernel), where it would
> >>>>> return an error as I saw due to the dependency on kvm’s vgic.
> >>>>>
> >>>>
> >>>> Can this be fixed in kernel to re-initialize the kernel state?
> >>
> >> Did you mean to reconfigure these VLPIs when restoring kvm's vgic?
> >> But the fact is that this error is not caused by kernel, it is due to the incorrect
> >> calling order of qemu...
> >>
> >>>>
> >>>>> To avoid this, we can move the saving of the config space from the iterable
> >>>>> process to the non-iterable process, so that it will be called after VGIC
> >>>>> according to their priorities.
> >>>>>
> >>>>
> >>>> With this change, at resume side, pre-copy phase data would reach
> >>>> destination without restored config space.
> >>>> VFIO device on destination
> >>>> might need its config space setup and validated before it can accept
> >>>> further VFIO device specific migration state.
> >>>>
> >>>> This also changes bit-stream, so it would break migration with original
> >>>> migration patch-set.
> >>>
> >>> Config space can continue to change while in pre-copy, if we're only
> >>> sending config space at the initiation of pre-copy, how are any changes
> >>> that might occur before the VM is stopped conveyed to the target? For
> >>> example the guest might reboot and a device returned to INTx mode from
> >>> MSI during pre-copy. Thanks,
> >>
> >> What I see is that the config space is only saved once in save_live_complete_precopy
> >> currently...
> >> As you said, a VFIO device might need its config space setup first, and
> >> the config space can continue to change while in pre-copy, Did you mean we
> >> have to migrate the config space in save_live_iterate?
> >> However, I still have a little doubt about the restoring dependence between
> >> the qemu emulated config space and the device data...
> >>
> >> Besides, if we surely can't move the saving of the config space back, can we
> >> just move some actions which are triggered by the restoring of the config space
> >> back (such as vfio_msix_enable())?
> >
> > It seems that the significant benefit to enabling interrupts during
> > pre-copy would be to reduce the latency and failure potential during
> > the final phase of migration. Do we have any data for how much it adds
> > to the device contributed downtime to configure interrupts only at the
> > final stage? My guess is that it's a measurable delay on its own. At
> > the same time, we can't ignore the differences in machine specific
> > dependencies and if we don't even sync the config space once the VM is
> > stopped... this all seems not ready to call supported, especially if we
> > have concerns already about migration bit-stream compatibility.
> >
>
> I have another question for this, if we restore the config space while in pre-copy
> (including enabling interrupts), does it affect the _RESUMING state (paused) of the
> device on the dst host (cause it to send interrupts? which should not be allowed
> in this stage). Does the restore sequence need to be further discussed and reach
> a consensus (spec) (taking into account other devices and the corresponding actions
> of the vendor driver)?
>
> > Given our timing relative to QEMU 5.2, the only path I feel comfortable
> > with is to move forward with downgrading vfio migration support to be
> > enabled via an experimental option. Objections? Thanks,
>
> Alright, but this issue is related to our ARM GICv4.1 migration scheme, could you
> give a rough idea about this (where to enable interrupts, we hope it to be after
> the restoring of VGIC)?
I disagree. If this is only specific to the Huawei ARM GIC implementation, why do
we want to make the entire VFIO-based migration an experimental feature?

Thanks,
Neo

>
> Thanks,
> Shenming