On Thu, Oct 04, 2018 at 01:57:21PM +0200, Igor Mammedov wrote:
> On Wed, 3 Oct 2018 10:44:20 -0700
> open sorcerer <0p3n.s0rc3...@gmail.com> wrote:
>
> > Hi,
> >
> > I am digging into an issue where qmp_device_del does not actually delete
> > devices when a guest OS is in prelaunch. This seems to be due to the guest
> > OS not handling ACPI events because it is not currently running. If I
> > assume correctly, qmp should allow you to add/remove devices while the host
> > is down, or if not possible, publish an error message.
> may I ask why one would delete a device at the -S pause point? Isn't it
> easier to start QEMU without it to begin with?
>
> > I think fixing this issue is as simple as making sure that the VM is in a
> > safe state to ignore the hotplug ACPI dance but eject the disk, something
> > like:
> in the prelaunch runstate, where the '-S' option pauses the VM, it is
> practically paused at the first instruction to be executed. So device_add
> at that point is considered a hotplug, with all actions already executed
> at the hardware level (interrupts sent, devices responsible for hotplug
> handling have changed state).
> So if one wished to delete a device at that point, one would have to roll
> back the related state changes.
> If one additionally uses the -incoming CLI option, it becomes more
> complicated, as we might end up in the prelaunch runstate with the VM in
> running state
> (see possible transitions in runstate_transitions_def[])
> I'd say the prelaunch runstate can't be used for removing devices that do
> not support surprise removal (in our case PCI does not).

I'd say the point is this. In prelaunch the guest has not observed any
device state yet, so we could make device_add look just like adding a
non-hotplugged device. And we could make device_del pretend there was a
reset immediately afterwards.

Not sure why it matters to anyone, but it's doable I think.

> > prelaunch, preconfig, shutdown: ignore acpi and deal with cleaning devices
> > other non-running: bubble up error
> > running: default behavior
> >
> > I was trying to validate that this change would be safe (keep in mind I am
> > learning ACPI in little pieces while digging) using GDB and code
> > inspection. While stepping through with GDB I noticed that the PCI slots
> > are controlled by a memory region and the opaque acpi pci hp state object.
> > I was unable thus far to find any code executed that modifies the ACPI
> > tables beyond just the pci hotplug state.
> >
> > I also tried to test using "while true; do acpidump | md5; sleep 1; done"
> > in the guest OS and then add/remove a virtio-blk-pci device (which
> > exercised the ACPI callbacks via piix4 callbacks). The output of the
> > acpidump -> md5 was consistent during each phase of the data collection,
> > which I believe implies that the acpi tables were not modified by the PCI
> > hotplug.
> >
> > Can someone help me understand:
> >
> > 1. Are the ACPI tables not modified when doing PCI hotplug?
> > 2. Do the general changes proposed seem safe?
> > 3. Are there resources or documentation I can read to help me understand
> >    this problem further? I have skimmed through a lot of different
> >    documents and watched some YouTube videos, but the ACPI documentation
> >    is hard to read and sift through and the YouTube videos are generally
> >    too high level.
> Regarding ACPI based PCI hotplug you can look at
>   docs/specs/acpi_pci_hotplug.txt
>   hw/acpi/pcihp.c
>   ACPI AML part in build_append_pci_bus_devices()
>
> >
> > Thanks.
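
To make the three-way policy quoted above concrete, here is a rough sketch of the decision only. It is not QEMU code: RunState, UnplugAction and choose_unplug_action are hypothetical names made up for illustration, and the incoming_migration flag is my own conservative reading of Igor's -incoming caveat, not something the proposal spells out.

```python
from enum import Enum, auto


class RunState(Enum):
    """Hypothetical stand-ins for the runstates named in the thread."""
    PRELAUNCH = auto()
    PRECONFIG = auto()
    SHUTDOWN = auto()
    PAUSED = auto()      # example of an "other non-running" state
    RUNNING = auto()


class UnplugAction(Enum):
    """What a device_del request would do under the proposed policy."""
    REMOVE_DIRECTLY = auto()   # skip the ACPI handshake, clean up the device
    REJECT = auto()            # bubble an error up to the caller
    ACPI_REQUEST = auto()      # default behavior: ask the guest to eject


# States in which the proposal assumes the guest cannot be looking at the device.
_GUEST_NOT_OBSERVING = {RunState.PRELAUNCH, RunState.PRECONFIG, RunState.SHUTDOWN}


def choose_unplug_action(state, incoming_migration=False):
    """Pick an unplug action for the given runstate.

    Assumption: with an incoming migration, prelaunch may hold a guest that
    has already run, so the direct-removal shortcut is refused.
    """
    if state is RunState.RUNNING:
        return UnplugAction.ACPI_REQUEST
    if state in _GUEST_NOT_OBSERVING and not incoming_migration:
        return UnplugAction.REMOVE_DIRECTLY
    return UnplugAction.REJECT
```

The sketch leaves out the hard part raised earlier in the thread: REMOVE_DIRECTLY is only safe if the state changes made at device_add time (pending hotplug interrupts, hotplug controller registers) are rolled back or hidden behind a reset.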