On Tue, 1 Nov 2016 21:40:46 +0800 Xiao Guangrong <guangrong.x...@linux.intel.com> wrote:
> On 11/01/2016 06:35 PM, Igor Mammedov wrote:
> > On Tue, 1 Nov 2016 11:30:46 +0800
> > Xiao Guangrong <guangrong.x...@linux.intel.com> wrote:
> >
> >> On 10/31/2016 07:09 PM, Igor Mammedov wrote:
> >>> On Mon, 31 Oct 2016 17:52:24 +0800
> >>> Xiao Guangrong <guangrong.x...@linux.intel.com> wrote:
> >>>
> >>>> On 10/31/2016 05:45 PM, Igor Mammedov wrote:
> >>>>> On Sun, 30 Oct 2016 23:25:05 +0200
> >>>>> "Michael S. Tsirkin" <m...@redhat.com> wrote:
> >>>>>
> >>>>>> From: Xiao Guangrong <guangrong.x...@linux.intel.com>
> >>>>>>
> >>>>>> The buffer is used to save the FIT info for all the present nvdimm
> >>>>>> devices; it is updated after an nvdimm device is plugged or
> >>>>>> unplugged. In a later patch, it will be used to construct the NVDIMM
> >>>>>> ACPI _FIT method, which reflects the present nvdimm devices after
> >>>>>> nvdimm hotplug.
> >>>>>>
> >>>>>> As the FIT buffer cannot be completely mapped into guest address
> >>>>>> space, OSPM will exit to QEMU multiple times; however, there is a
> >>>>>> race condition - the FIT may be changed during these multiple exits,
> >>>>>> so some rules are introduced:
> >>>>>> 1) the user should hold the @lock to access the buffer and
> >>> Could you explain why the lock is needed?
> >>
> >> Yes. There are two things that need to be synced between QEMU and OSPM:
> >> - the fit buffer. QEMU produces it and the VM consumes it at the same
> >>   time.
> >> - the dirty indicator. QEMU sets it and it is cleared by OSPM; that
> >>   means both sides modify it.
> >
> > I understand why the 'dirty indicator' is necessary, but
> > could you describe a concrete call flow in detail
> > that would cause concurrent access and need extra
> > lock protection?
> >
> > Note that there is a global lock (dubbed BQL) in QEMU,
> > which is taken every time the guest accesses an IO port or
> > a QMP/monitor control event happens.
>
> Ah, okay. If there is a lock to protect the vm-exit handler and the
> monitor handler, I think it is okay to drop the lock.
>
> >
> >>>>>> 2) mark @dirty whenever the buffer is updated.
> >>>>>>
> >>>>>> @dirty is cleared the first time OSPM gets the fit buffer; if
> >>>>>> dirty is detected on a later access, OSPM will restart the
> >>>>>> access.
> >>>>>>
> >>>>>> As the fit should be updated after an nvdimm device is successfully
> >>>>>> realized, a new hotplug callback, post_hotplug, is introduced.
> >>>>>>
> >>>>>> Signed-off-by: Xiao Guangrong <guangrong.x...@linux.intel.com>
> >>>>>> Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
> >>>>>> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> >>>>>> ---
> >>>>>>  include/hw/hotplug.h    | 10 +++++++++
> >>>>>>  include/hw/mem/nvdimm.h | 26 +++++++++++++++++++++-
> >>>>>>  hw/acpi/nvdimm.c        | 59 +++++++++++++++++++++++++++++++++++--------------
> >>>>>>  hw/core/hotplug.c       | 11 +++++++++
> >>>>>>  hw/core/qdev.c          | 20 +++++++++++++----
> >>>>>>  hw/i386/acpi-build.c    |  2 +-
> >>>>>>  hw/i386/pc.c            | 19 ++++++++++++++++
> >>>>>>  7 files changed, 124 insertions(+), 23 deletions(-)
> >>>>>>
> >>>>>> diff --git a/include/hw/hotplug.h b/include/hw/hotplug.h
> >>>>>> index c0db869..10ca5b6 100644
> >>>>>> --- a/include/hw/hotplug.h
> >>>>>> +++ b/include/hw/hotplug.h
> >>>>>> @@ -47,6 +47,7 @@ typedef void (*hotplug_fn)(HotplugHandler *plug_handler,
> >>>>>>   * @parent: Opaque parent interface.
> >>>>>>   * @pre_plug: pre plug callback called at start of device.realize(true)
> >>>>>>   * @plug: plug callback called at end of device.realize(true).
> >>>>>> + * @post_plug: post plug callback called after device is
> >>>>>> successfully plugged.
> >>>>> this doesn't seem to be necessary,
> >>>>> why isn't hotplug_handler_plug() sufficient?
> >>>>
> >>>> At the point of hotplug_handler_plug(), the device is not realized
> >>>> (realized == 0); however, nvdimm_get_plugged_device_list() works on
> >>>> realized nvdimm devices.
> >>>
> >>> I suggest, instead of adding a redundant hook, to reuse
> >>> hotplug_handler_plug(), which is called after device::realize()
> >>> successfully completed, and amend nvdimm_plugged_device_list() as
> >>> follows:
> >>>
> >>> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> >>> index e486128..c6d8cbb 100644
> >>> --- a/hw/acpi/nvdimm.c
> >>> +++ b/hw/acpi/nvdimm.c
> >>> @@ -38,11 +38,7 @@ static int nvdimm_plugged_device_list(Object *obj, void *opaque)
> >>>      GSList **list = opaque;
> >>>
> >>>      if (object_dynamic_cast(obj, TYPE_NVDIMM)) {
> >>> -        DeviceState *dev = DEVICE(obj);
> >>> -
> >>> -        if (dev->realized) { /* only realized NVDIMMs matter */
> >>> -            *list = g_slist_append(*list, DEVICE(obj));
> >>> -        }
> >>> +        *list = g_slist_append(*list, obj);
> >>>      }
> >>>
> >>>      object_child_foreach(obj, nvdimm_plugged_device_list, opaque);
> >>>
> >>> that should count not only the already present nvdimms but also the one
> >>> that's being hotplugged.
> >>
> >> It is not good, as a device which failed to initialize
> > See device_set_realized()
> > [...]
> >         if (dc->realize) {
> >             dc->realize(dev, &local_err);
> >         }
> >
> >         if (local_err != NULL) {
> >             goto fail;
> >         }
> >
> >         DEVICE_LISTENER_CALL(realize, Forward, dev);
> >
> >         if (hotplug_ctrl) {
> >             hotplug_handler_plug(hotplug_ctrl, dev, &local_err);
> >         }
> >
> > i.e. control reaches hotplug_handler_plug() only if the
> > dc->realize(dev, &local_err) call is successful.
>
> I mean, for example: if there are two devices, the first one fails to be
> realized and the second one is initialized successfully, then can
> nvdimm_plugged_device_list() get both devices?
>
> Based on the code of pc_dimm_built_list(), I guess yes.
nope, read qdev_device_add()
end result:
 - on success the device is in the QOM tree
 - on failure the device is destroyed

>
> >> or is not being plugged
> >> (consider two nvdimm devices directly appended on the QEMU command line:
> >> when we plug the first one, both nvdimm devices will be counted in this
> >> list) is also counted in this list...
> > nope,
> > qdev_device_add() is called sequentially (one time for each
> > -device/device_add),
> > so nvdimm_plugged_device_list() sees only the devices that have been
> > added by previous -device/device_add calls, plus one extra device that
> > is being added by the in-progress qdev_device_add() call.
> >
> > so for "-device nvdimm,id=nv1 -device nvdimm,id=nv2" the call sequence
> > would look like:
> > 1:
> >  qdev_device_add("nvdimm,id=nv1") {
> >      -> set parent // device becomes visible to
> >                    // nvdimm_get_plugged_device_list()
> >                    // this is the only time where device->realized == false
> >                    // could be observed, all other call chains would either
> >                    // see device->realized == true or not see device at all
> >      -> realize()
> >           -> device_set_realized()
> >                -> nvdimm->realize()
> >                -> hotplug_handler_plug()
> >                       nvdimm_get_plugged_device_list()
> >                       // would return list with 1 item (nv1,
> >                       // where nv1->realized == false)
> >  }
> > 2:
> >  qdev_device_add("nvdimm,id=nv2") {
> >      -> set parent // device becomes visible to
> >                    // nvdimm_get_plugged_device_list()
> >      -> realize()
> >           -> device_set_realized()
> >                -> nvdimm->realize()
> >                -> hotplug_handler_plug()
> >                       nvdimm_get_plugged_device_list()
> >                       // would return list with 2 items (nv1 and nv2,
> >                       // where nv1->realized == true and
> >                       // nv2->realized == false)
> >  }
>
> Okay.
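
[Editor's note: the commit message above describes a restart-on-dirty read
protocol for the FIT buffer. The following is a minimal C sketch of that
idea, not code from the patch; FitBuffer, FIT_CHUNK, fit_read_chunk() and
fit_read_all() are illustrative names, not QEMU APIs, and the real series
tracks the read position on the QEMU side. It only shows why clearing
@dirty on the first access and restarting when @dirty is seen later gives
OSPM a consistent snapshot even though the buffer is fetched across
multiple VM exits.]

    /* Sketch of the restart-on-dirty FIT read rule (assumed names). */
    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    #define FIT_CHUNK 4096  /* assumed per-exit transfer size */

    typedef struct FitBuffer {
        const char *data;
        size_t len;
        bool dirty;     /* set by QEMU on nvdimm plug/unplug */
    } FitBuffer;

    /* One VM exit: copy the chunk at @offset; fail if FIT changed mid-read. */
    static bool fit_read_chunk(FitBuffer *fit, size_t offset,
                               char *out, size_t *out_len)
    {
        if (offset == 0) {
            fit->dirty = false;      /* first access clears the indicator */
        } else if (fit->dirty) {
            return false;            /* buffer was updated: caller restarts */
        }
        *out_len = fit->len - offset < FIT_CHUNK ? fit->len - offset
                                                 : FIT_CHUNK;
        memcpy(out, fit->data + offset, *out_len);
        return true;
    }

    /* OSPM side: restart from offset 0 until a consistent snapshot is read. */
    static size_t fit_read_all(FitBuffer *fit, char *dst)
    {
        size_t off;
    restart:
        off = 0;
        while (off < fit->len) {
            size_t n;
            if (!fit_read_chunk(fit, off, dst + off, &n)) {
                goto restart;
            }
            off += n;
        }
        return off;
    }

Note that, per the discussion above, both sides of this exchange already run
under the BQL in real QEMU (vm-exit and monitor handlers alike), which is why
the extra per-buffer @lock could be dropped.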