On Tue, 2012-04-10 at 18:45 +0300, Michael S. Tsirkin wrote: > On Tue, Apr 10, 2012 at 09:35:39AM -0600, Alex Williamson wrote: > > On Tue, 2012-04-10 at 18:19 +0300, Michael S. Tsirkin wrote: > > > On Tue, Apr 10, 2012 at 09:14:00AM -0600, Alex Williamson wrote: > > > > On Sun, 2012-04-08 at 01:57 -0300, Marcelo Tosatti wrote: > > > > > On Thu, Apr 05, 2012 at 11:07:15AM -0600, Alex Williamson wrote: > > > > > > As Michael Tsirkin demonstrated, current PCI hotplug is vulnerable > > > > > > to a few races. The first is a race with other hotplug operations > > > > > > because we clear the up & down registers at each event. If a new > > > > > > event comes before the last is processed, up/down is cleared and > > > > > > the event is lost. > > > > > > > > > > > > To fix this for the down register, we create a life cycle for > > > > > > the event request that starts with the hot unplug request in > > > > > > piix4_device_hotplug() and ends when the device is ejected. > > > > > > This allows us to mask and clear individual bits, preserving them > > > > > > against races. For the up register, we have no clear end point > > > > > > for when the event is finished. We could modify the BIOS to > > > > > > acknowledge the bit and clear it, but this creates BIOS compatibiliy > > > > > > issues without offering a complete solution. Instead we note that > > > > > > gratuitous ACPI device checks are not harmful, which allows us to > > > > > > issue a device check for every slot. We know which slots are > > > > > > present > > > > > > and we know which slots are hotpluggable, so we can easily reduce > > > > > > this to a more manageable set for the guest. > > > > > > > > > > > > The other race Michael noted was that an unplug request followed > > > > > > by reset may also lose the eject notification, which may also > > > > > > result in the eject request being lost which a subsequent add > > > > > > or remove. Once we're in reset, the device is unused and we can > > > > > > flush the queue of device removals ourselves. Previously if a > > > > > > device_del was issued to a guest without ACPI PCI hotplug support, > > > > > > it was necessary to shutdown the guest to recover the device. > > > > > > With this, a guest reboot is sufficient. > > > > > > > > > > > > Signed-off-by: Alex Williamson <alex.william...@redhat.com> > > > > > > --- > > > > > > > > > > > > hw/acpi_piix4.c | 74 > > > > > > +++++++++++++++++++++++++++++++++++++++---------------- > > > > > > 1 files changed, 53 insertions(+), 21 deletions(-) > > > > > > > > > > > > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c > > > > > > index 5e8b261..0e7af51 100644 > > > > > > --- a/hw/acpi_piix4.c > > > > > > +++ b/hw/acpi_piix4.c > > > > > > @@ -48,7 +48,7 @@ > > > > > > #define PIIX4_PCI_HOTPLUG_STATUS 2 > > > > > > > > > > > > struct pci_status { > > > > > > - uint32_t up; > > > > > > + uint32_t up; /* deprecated, maintained for migration > > > > > > compatibility */ > > > > > > uint32_t down; > > > > > > }; > > > > > > > > > > > > @@ -70,6 +70,7 @@ typedef struct PIIX4PMState { > > > > > > /* for pci hotplug */ > > > > > > struct pci_status pci0_status; > > > > > > uint32_t pci0_hotplug_enable; > > > > > > + uint32_t pci0_slot_device_present; > > > > > > > > > > Why can't you use the "up" field for this? Fail to see the point > > > > > of the new variable. > > > > > > > > I suppose we could. Initially I was creating the present device bitmap > > > > dynamically, but walking the device tree on every access seemed > > > > excessive versus updating it runtime. Since there's no need to migrate > > > > this and we don't want to have it clobbered by an incoming migration, I > > > > didn't think to re-use the "up" field. We could do it, but it would > > > > mean introducing a post_load function that effectively just calls > > > > piix4_update_hotplug to recreate it, which seems like an unnecessary > > > > effort to avoid using an extra 4 bytes of memory. > > > > > > It's probably harmless if we do let it be clobbered by migration > > > though: worst case we lose an event and that might have > > > happened before migration :) > > > > Perhaps that's another way to manage it, just let it be a lazy > > accumulation of anything that has been hotadded. That makes debug hard > > though because we can't know what should be set without looking that the > > entire history of the vm, versus something like "device present", which > > we can verify at any point in time. Thanks, > > > > Alex > > Good point. I think Marcelo's comment can be addressed > if you rename up to up_ignored, old_up, up_legacy or something > like that.
Easy enough to rename .up to .old_up. Marcelo, are you looking for more than that? Thanks, Alex