Hi Dave, On 7/6/21 4:19 PM, Dr. David Alan Gilbert wrote: > * Eric Auger (eric.au...@redhat.com) wrote: >> Hi, >> >> On 7/6/21 10:18 AM, Kunkun Jiang wrote: >>> Hi Eric, >>> >>> On 2021/6/30 17:16, Eric Auger wrote: >>>> On 6/30/21 3:38 AM, Kunkun Jiang wrote: >>>>> On 2021/6/30 4:14, Eric Auger wrote: >>>>>> Hi Kunkun, >>>>>> >>>>>> On 6/29/21 11:33 AM, Kunkun Jiang wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Accroding to the patch cddafd8f353d2d251b1a5c6c948a577a85838582, >>>>>>> our original intention is to flush the ITS tables into guest RAM at >>>>>>> the point >>>>>>> RUN_STATE_FINISH_MIGRATE, but sometimes the VM gets stopped before >>>>>>> migration launch so let's simply flush the tables each time the VM >>>>>>> gets >>>>>>> stopped. >>>>>>> >>>>>>> But I encountered an error when I shut down the virtual machine. >>>>>>> >>>>>>>> qemu-system-aarch64: KVM_SET_DEVICE_ATTR failed: Group 4 attr >>>>>>>> 0x0000000000000001: Permission denied >>>>>>> Shall we need to flush ITS tables into guest RAM when 'shutdown' the >>>>>>> VM? >>>>>>> Or do you think this error is normal? >>>>>> yes we determined in the past this was the right moment do save the >>>>>> tables >>>>>> >>>>>> "with a live migration the guest is still running after >>>>>> the RAM has been first saved, and so the tables may still change >>>>>> during the iterative RAM save. You would actually need to do this >>>>>> at just the point we stop the VM before the final RAM save; that >>>>>> *might* >>>>>> be possible by using a notifier hook a vm run state change to >>>>>> RUN_STATE_FINISH_MIGRATE >>>>>> - if you can do the changes just as the migration flips into that mode >>>>>> it *should* work. " said David. >>>>>> >>>>>> But sometimes as the commit msg says, the VM is stopped before the >>>>>> migration launch - I do not remember the exact scenario tbh -. >>>>> Well, I initially wanted to know more about this scenario to determine >>>>> whether >>>>> a normal shutdown would fall into it.😂 >>>> I think it was for save/restore use case. In that case you need to flush >>>> the KVM cache in memory on VM shutdown. >>> Sorry for late reply. >>> >>> Can we distinguish from the 'RunState'? >>> When we stop the VM, the RunState will be set. There are many types of >>> RunState, such as RUN_STATE_FINISH_MIGRATE/RUN_STATE_SAVE_VM/ >>> RUN_STATE_SHUTDOWN/RUN_STATE_GUEST_PANICKED, etc. >>> >>> Maybe RUN_STATE_SHUTDOWN doesn't belong to save/restore use case, >>> right? >> Adding Dave, Juan and Peter in the loop for migration expertise. >> >> At the moment we save the ARM ITS MSI controller tables whenever the VM >> gets stopped. Saving the tables from KVM caches into the guest RAM is >> needed for migration and save/restore use cases. >> However with GICv4 this fails at KVM level because some MSIs are >> forwarded and saving their state is not supported with GICv4. >> >> While GICv4 migration is not supported we would like the VM to work >> properly, ie. being stoppable without taking care of table saving. >> >> So could we be more precise and identifiy the save/restore and migration >> use cases instead of saving the tables on each VM shutdown. > During the precopy migration (not sure about others), we do: > > static void migration_completion(MigrationState *s) > { > .... > ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); > ... > ret = qemu_savevm_state_complete_precopy(s->to_dst_file, > false, > inactivate); > > so I think we do have that state there to hook off.
That's consistent with what you suggested in the past ans what is logged in the commit message of cddafd8f353d2d251b1a5c6c948a577a85838582 ("hw/intc/arm_gicv3_its: Implement state save/restore"). However does the save/restore enters that state. If I remember correctly that's why I decided to do the save on each VM stop instead. > >> The tables are saved into guest RAM so when need the CPUs and devices to >> be stopped but we need the guest RAM to be saved after the ITS save >> operation. > Yeh so what should happen is that you: > a) Iterate RAM a lot > b) You stop everything > -> Flushes remaining changes into RAM > c) Transmit device state and last bits of RAM changes. > > so that flush should happen at (b). That's correct. Thanks Eric > > Dave > >> Kunkun, by the way you currently just get an error from qemu, ie. qemu >> does not exit? Couldn't we just ignore -EACCESS error? >> >> Thanks >> >> Eric >> >> >> >> >>>>> In my opinion, when the virtual machine is normally shutdown, flushing >>>>> the >>>>> ITS tables is not necessary. If we can't tell the difference between >>>>> 'normal shutdown' >>>>> and the above scenario, then this 'error' is inevitable. >>>>>> So each time the VM is stopped we flush the caches into guest RAM. >>>>>> >>>>>> >>>>>> >>>>>>> This error occurs in the following scenario: >>>>>>> Kunpeng 920 、enable GICv4、passthrough a accelerator Hisilicon SEC >>>>>>> to >>>>>>> the VM. >>>>>>> >>>>>>> The flow is as follows: >>>>>>> >>>>>>> QEMU: >>>>>>> vm_shutdown >>>>>>> do_vm_stop(RUN_STATE_SHUTDOWN) >>>>>>> vm_state_notify >>>>>>> ... >>>>>>> vm_change_state_handler (hw/intc/arm_gicv3_its_kvm.c) >>>>>>> kvm_device_access >>>>>>> >>>>>>> Kernel: >>>>>>> vgic_its_save_tables_v0 >>>>>>> vgic_its_save_device_tables >>>>>>> vgic_its_save_itt >>>>>>> >>>>>>> There is such a code in vgic_its_save_itt(): >>>>>>>> /* >>>>>>>> * If an LPI carries the HW bit, this means that this >>>>>>>> * interrupt is controlled by GICv4, and we do not >>>>>>>> * have direct access to that state without GICv4.1. >>>>>>>> * Let's simply fail the save operation... >>>>>>>> */ >>>>>>>> if (ite->irq->hw && !kvm_vgic_global_state.has_gicv4_1) >>>>>>>> return -EACCES; >>>> Maybe we miss a piece of code for 4.0 that unsets the forwarding. The >>>> only way to handle this is to make sure ite->irq->hw is not set on >>>> shutdown, no? >>> It's not going to return -EACCES here, if we unset the forwarding >>> first. But >>> this may cause problems in save/restore scenarios. The GICv4 architecture >>> doesn't give any guarantee that the pending state is written into the >>> pending table. >>> >>> Thanks, >>> Kunkun Jiang >>>> Thanks >>>> >>>> Eric >>>>>> As far as I understand you need a v4.1 to migrate, >>>>>> following Shenming's series >>>>>> [PATCH v5 0/6] KVM: arm64: Add VLPI migration support on GICv4.1 >>>>>> Maybe sync with him? >>>>> Yes, GICv4 does not support live migrate. >>>>> >>>>> Thanks, >>>>> Kunkun Jiang >>>>>> Thanks >>>>>> >>>>>> Eric >>>>>> >>>>>> >>>>>>> Looking forward to your reply. >>>>>>> >>>>>>> Thanks, >>>>>>> Kunkun Jiang >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> . >>>> . >>>