On Fri, Nov 17, 2017 at 01:56:48PM +0100, Greg Kurz wrote: > A DRC with a pending unplug request releases its associated device at > machine reset time. > > In the case of LMB, when all DRCs for a DIMM device have been reset, > the DIMM gets unplugged, causing guest memory to disappear. This may > be very confusing for anything still using this memory. > > This is exactly what happens with vhost backends, and QEMU aborts > with: > > qemu-system-ppc64: used ring relocated for ring 2 > qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion > `r >= 0' failed. > > The issue is that each DRC registers a QEMU reset handler, and we > don't control the order in which these handlers are called (ie, > a LMB DRC will unplug a DIMM before the virtio device using the > memory on this DIMM could stop its vhost backend). > > To avoid such situations, let's reset DRCs after all devices > have been reset. > > Reported-by: Mallesh N. Koti <mall...@linux.vnet.ibm.com> > Signed-off-by: Greg Kurz <gr...@kaod.org>
Applied to ppc-for-2.11. I'll be looking at preparing a pull request today. > --- > hw/ppc/spapr.c | 21 +++++++++++++++++++++ > hw/ppc/spapr_drc.c | 7 ------- > 2 files changed, 21 insertions(+), 7 deletions(-) > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > index 6841bd294b3c..6285f7211f9a 100644 > --- a/hw/ppc/spapr.c > +++ b/hw/ppc/spapr.c > @@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice > *sbdev, void *opaque) > } > } > > +static int spapr_reset_drcs(Object *child, void *opaque) > +{ > + sPAPRDRConnector *drc = > + (sPAPRDRConnector *) object_dynamic_cast(child, > + TYPE_SPAPR_DR_CONNECTOR); > + > + if (drc) { > + spapr_drc_reset(drc); > + } > + > + return 0; > +} > + > static void ppc_spapr_reset(void) > { > MachineState *machine = MACHINE(qdev_get_machine()); > @@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void) > } > > qemu_devices_reset(); > + > + /* DRC reset may cause a device to be unplugged. This will cause troubles > + * if this device is used by another device (eg, a running vhost backend > + * will crash QEMU if the DIMM holding the vring goes away). To avoid > such > + * situations, we reset DRCs after all devices have been reset. > + */ > + object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, > NULL); > + > spapr_clear_pending_events(spapr); > > /* > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c > index 915e9b51c40c..e3b122968e89 100644 > --- a/hw/ppc/spapr_drc.c > +++ b/hw/ppc/spapr_drc.c > @@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc) > } > } > > -static void drc_reset(void *opaque) > -{ > - spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque)); > -} > - > bool spapr_drc_needed(void *opaque) > { > sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque; > @@ -518,7 +513,6 @@ static void realize(DeviceState *d, Error **errp) > } > vmstate_register(DEVICE(drc), spapr_drc_index(drc), &vmstate_spapr_drc, > drc); > - qemu_register_reset(drc_reset, drc); > trace_spapr_drc_realize_complete(spapr_drc_index(drc)); > } > > @@ -529,7 +523,6 @@ static void unrealize(DeviceState *d, Error **errp) > gchar *name; > > trace_spapr_drc_unrealize(spapr_drc_index(drc)); > - qemu_unregister_reset(drc_reset, drc); > vmstate_unregister(DEVICE(drc), &vmstate_spapr_drc, drc); > root_container = container_get(object_get_root(), DRC_CONTAINER_PATH); > name = g_strdup_printf("%x", spapr_drc_index(drc)); > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature