Re: [Qemu-devel] [PATCH for-2.8 1/3] migration: alternative way to set instance_id in SaveStateEntry

Michael Roth Wed, 30 Nov 2016 14:25:58 -0800

Quoting Michael Roth (2016-11-22 16:58:44)
> Quoting David Gibson (2016-11-22 00:15:10)
> > On Thu, Nov 17, 2016 at 07:40:25PM -0600, Michael Roth wrote:
> > > From: Jianjun Duan <du...@linux.vnet.ibm.com>
> > > 
> > > Currently migrated Devices are identified with an idstr which is
> > > calculated automatically using their path in the QOM composition
> > > tree. In some cases these Devices may have a more reliable
> > > identifier that may be preferable in situations where there's a
> > > chance their path in the composition tree might change in the
> > > future as a resulting of changes in how the device is modeled
> > > in QEMU.
> > > 
> > > One such Device is the "spapr-dr-connector" used to handle hotplug
> > > for various resources on pseries machine types, where the PAPR
> > > specification mandates that each DRC have a 32-bit globally unique
> > > identifier associated with it. As such, this identifier is also ideal
> > > as a reliable way to identify a particular DRC in the migration
> > > stream, so we introduce support here for using a caller-side supplied
> > > instance_id for Devices in preparation for that.
> > > 
> > > register_savevm_live() and vmstate_register_with_alias_id() already
> > > provide a means to let the caller supply an instance_id instead of
> > > relying on the migration subsystem to generate one automatically,
> > > but in cases where we're registering SaveVMHandlers or VMSDs
> > > (respectively) that are associated with a Device, the instance_id is
> > > ignored in favor of a string identifier based solely on the QOM path.
> > > 
> > > This patch generalizes this so that an instance_id can also be
> > > supplied by the caller for Devices. Since VMSD registration for
> > > Devices is generally handled automatically by qdev, we also introduce
> > > a DeviceClass->dev_get_instance_id() method that, when set, is called
> > > by qdev to obtain the corresponding instance_id that should be used
> > > for a particular Device. Otherwise we maintain the original behavior
> > > of passing instance_id == -1 and thus falling back to the previous
> > > logic of using the QOM path.
> > > 
> > > Signed-off-by: Jianjun Duan <du...@linux.vnet.ibm.com>
> > > * moved usage of DeviceClass->dev_get_instance_id() out of savevm.c
> > >   and into caller (qdev) instead
> > > * clarified usage/intent in comments and commit msg
> > > Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com>
> > 
> > I've had to remove this patch and 2/3 from ppc-for-2.8, because I
> > discovered (as I was preparing a pull request) that it causes a weird
> > breakage.
> 
> Sorry for the breakage. I will make sure to run make check on ppc64
> beforehand next time as well. Though in this case we might've gotten
> lucky that it happened to trigger at all...
> 
> > 
> > Specifically, on some, but not all, setups it causes the postcopy-test
> > to fail with the error:
> > 
> > qemu-system-ppc64: RP: Received invalid message 0x0000 length 0x0000
> > FAIL
> > 
> > For me, it fails when running on a ppc64le or ppc64 host (RHEL7.3),
> > and on 32-bit x86 (Fedora container) but not on x86_64 host (Fedora
> > 24).
> > 
> > That's a pretty baffling set of symptoms, and so far I haven't gotten
> > far in figuring out how it could happen.  But I really want to get the
> > rest of the patches in ppc-for-2.8 pulled, so I've dropped these for
> > now.
> > 
> > I'll try to debug this once I've prepared the next pull request, but
> > if you're able to work out what's going on for me, that would be
> > extremely helpful.
> 
> It turns out all these strange connections might just be red herrings.
> I think the problem is here in spapr_pci_post_load:
> 
>     unsigned int bus_no = 0;
> 
>     /* Set detach_cb for the drc unconditionally after migration */
>     if (bus) {
>         pci_for_each_device(bus, pci_bus_num(bus), spapr_pci_set_detach_cb,
>                             &bus_no);
>     }
> 
> I think that was a copy/paste artifact from one of the other callsites in
> spapr_pci.c. &bus_no is an opaque that spapr_pci_set_detach_cb ends up trying
> to use as sPAPRMachineState, which was leading to a NULL pointer dereference
> when trying to access the DRC corresponding the the PCIDevice. The search
> query is something like:
> 
>     return spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_PCI,
>                                     (phb->index << 16) |
>                                     (busnr << 8) |
>                                     devfn);
> 
> My guess is that in some cases phb->index just happened to be 0, which would
> in most cases lead to a successful query. That might explain the different
> results on different architectures.
> 
> Based on the some discussion elsehwere is turns out we don't need this
> detach cb stuff at all for the unplug issue we're trying to fix here, so
> I will strip it out for the next version.
> 
> > 
> > I do have one vague theory..
> 
> I think that needs to be dealt with as well, more comments below.
> 
> > 
> > > ---
> > >  hw/core/qdev.c         | 6 +++++-
> > >  include/hw/qdev-core.h | 9 +++++++++
> > >  migration/savevm.c     | 4 ++--
> > >  3 files changed, 16 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> > > index 5783442..c8c0c44 100644
> > > --- a/hw/core/qdev.c
> > > +++ b/hw/core/qdev.c
> > > @@ -933,7 +933,11 @@ static void device_set_realized(Object *obj, bool 
> > > value, Error **errp)
> > >          }
> > >  
> > >          if (qdev_get_vmsd(dev)) {
> > > -            vmstate_register_with_alias_id(dev, -1, qdev_get_vmsd(dev), 
> > > dev,
> > > +            int instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id
> > > +                ? DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev)
> > > +                : -1;
> > > +
> > > +            vmstate_register_with_alias_id(dev, instance_id, 
> > > qdev_get_vmsd(dev), dev,
> > >                                             dev->instance_id_alias,
> > >                                             
> > > dev->alias_required_for_version);
> > >          }
> > > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> > > index 2c97347..8ba82af 100644
> > > --- a/include/hw/qdev-core.h
> > > +++ b/include/hw/qdev-core.h
> > > @@ -139,6 +139,15 @@ typedef struct DeviceClass {
> > >      qdev_initfn init; /* TODO remove, once users are converted to 
> > > realize */
> > >      qdev_event exit; /* TODO remove, once users are converted to 
> > > unrealize */
> > >      const char *bus_type;
> > > +
> > > +    /* When this field is set, instead of using the device's QOM path,
> > > +     * SaveStateEntry's for devices will be identified using a 
> > > combination
> > > +     * of the corresponding VMSD name and an instance_id returned by this
> > > +     * function. This should only be necessary for situations where the
> > > +     * QOM path is anticipated to change and a more stable identifier is
> > > +     * desired to identify a device in the migration stream.
> > > +     */
> > > +    int (*dev_get_instance_id)(DeviceState *dev);
> > >  } DeviceClass;
> > >  
> > >  typedef struct NamedGPIOList NamedGPIOList;
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index 0363372..a95fff9 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -556,7 +556,7 @@ int register_savevm_live(DeviceState *dev,
> > >          se->is_ram = 1;
> > >      }
> > >  
> > > -    if (dev) {
> > > +    if (dev && instance_id == -1) {
> > 
> > .. so, I'm not sure how this could lead to the observed symptoms, but
> > I did note that adding this condition makes another if within the
> > block redundant, because it also checks for instance_id == -1.  That
> > suggests that simply skipping the whole block in this case is probably
> > not the right thing to do.
> 
> I don't think this is the cause (verified the loading orders didn't
> change before/after these patches for the scenario in question), but
> that's a good catch:
> 
> there was previously se->compat->idstr/instance_id handling for
> cases where a we register a VMSD for a Device and provide an explicit
> instance_id. This registers an additional check so that we can map
> the se->compat values from an older QEMU to the current ID. This
> unfortunately doesn't work in the opposite direction so didn't suite
> our purposes for controlling identifiers in the outgoing stream, but
> by accidentally bypassing it completely there's now a chance old->new
> configurations might break for certain devices, so this needs to be
> re-thought a bit...


It turns out spapr_iommu already had a fairly straightforward solution.
Rather than relying on qdev to register it's vmstate, it does it
manually via:

  vmstate_register(DEVICE(tcet), tcet->liobn, &vmstate_spapr_tce_table,
                   tcet);

This avoids the qom-path-based code due to the fact that
qdev_get_dev_path() is NULL for bus-less Devices, so as a fallback the
VMSD gets identified via "spapr_iommu"/liobn. So using this approach,
we don't actually need patch 1 at all.

But it turns out, for this particular memory unplug issue, we don't
actually need DRC migration. The real issue is that the default DRC
state for coldplugged LMBs *should* correspond to the post-hotplug
state. Otherwise, unplug fails for cold-plugged LMBs *even if you
don't migrate beforehand*. The fix for that is much more
straight-forward, and the same approach we currently use for
coldplugged CPUs. Will send a patch shortly.

> 
> > 
> > >          char *id = qdev_get_dev_path(dev);
> > >          if (id) {
> > >              pstrcpy(se->idstr, sizeof(se->idstr), id);
> > > @@ -640,7 +640,7 @@ int vmstate_register_with_alias_id(DeviceState *dev, 
> > > int instance_id,
> > >      se->vmsd = vmsd;
> > >      se->alias_id = alias_id;
> > >  
> > > -    if (dev) {
> > > +    if (dev && instance_id == -1) {
> > >          char *id = qdev_get_dev_path(dev);
> > >          if (id) {
> > >              pstrcpy(se->idstr, sizeof(se->idstr), id);
> > 
> > -- 
> > David Gibson                    | I'll have my music baroque, and my code
> > david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
> >                                 | _way_ _around_!
> > http://www.ozlabs.org/~dgibson
> 
>

Re: [Qemu-devel] [PATCH for-2.8 1/3] migration: alternative way to set instance_id in SaveStateEntry

Reply via email to