Re: [PATCH v8 0/4] pci hotplug tracking

Michael S. Tsirkin Thu, 02 Nov 2023 06:34:21 -0700

On Thu, Nov 02, 2023 at 04:28:43PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 02.11.23 15:12, Michael S. Tsirkin wrote:
> > On Thu, Nov 02, 2023 at 03:00:01PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > On 02.11.23 14:31, Michael S. Tsirkin wrote:
> > > > On Thu, Oct 05, 2023 at 12:29:22PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > Hi all!
> > > > > 
> > > > > Main thing this series does is DEVICE_ON event - a counter-part to
> > > > > DEVICE_DELETED. A guest-driven event that device is powered-on.
> > > > > Details are in patch 2. The new event is paired with corresponding
> > > > > command query-hotplug.
> > > > 
> > > > Several things questionable here:
> > > > 1. depending on guest activity you can get as many
> > > >      DEVICE_ON events as you like
> > > 
> > > No, I've made it so it may be sent only once per device
> > 
> > Maybe document that?
> 
> Right, my fault
> 
> > 
> > > > 2. it's just for shpc and native pcie - things are
> > > >      confusing enough for management, we should make sure
> > > >      it can work for all devices
> > > 
> > > Agree, I'm thinking about it
> > > 
> > > > 3. what about non hotpluggable devices? do we want the event for them?
> > > > 
> > > 
> > > I think, yes, especially if we make async=true|false flag for device_add, 
> > > so that successful device_add must be always followed by DEVICE_ON - like 
> > > device_del is followed by DEVICE_DELETED.
> > > 
> > > Maybe, to generalize, it should be called not DEVICE_ON (which mostly 
> > > relate to hotplug controller statuses) but DEVICE_ADDED - a full 
> > > counterpart for DEVICE_DELETED.
> > > 
> > > > 
> > > > I feel this needs actual motivation so we can judge what's the
> > > > right way to do it.
> > > 
> > > My first motivation for this series was the fact that successful 
> > > device_add doesn't guarantee that hard disk successfully hotplugged to 
> > > the guest. It relates to some problems with shpc/pcie hotplug we had in 
> > > the past, and they are mostly fixed. But still, for management tool it's 
> > > good to understand that all actions related to hotplug controller are 
> > > done and we have "green light".
> > 
> > what does "successfully" mean though? E.g. a bunch of guests will not
> > properly show you the device if the disk is not formatted properly.
> 
> Yes, I understand, that we may say only about "some degree of success".
> 
> But here is some physical sense still: DEVICE_ON indicates, that it's now 
> safe to call device_del. And calling device_del before DEVICE_ON is a kind of 
> unexpected behavior.
>


Is that really true? I really don't think we should introduce new types
of undefined behavior.


> > 
> > > 
> > > Recently a new motivation came up, as I described in my "ping" letter 
> > > <6bd19a07-5224-464d-b54d-1d738f5ba...@yandex-team.ru>: we have a 
> > > performance degradation because of 7bed89958bfbf40df, which introduces 
> > > drain_call_rcu() in device_add, to make it more synchronous. So, my 
> > > suggestion is to instead make it more asynchronous (probably with a 
> > > special flag) and rely on the DEVICE_ON event.
> > 
> > This one?
> > 
> > commit 7bed89958bfbf40df9ca681cefbdca63abdde39d
> > Author: Maxim Levitsky <mlevi...@redhat.com>
> > Date:   Tue Oct 6 14:38:58 2020 +0200
> > 
> >      device_core: use drain_call_rcu in in qmp_device_add
> >      Soon, a device removal might only happen on RCU callback execution.
> >      This is okay for device-del which provides a DEVICE_DELETED event,
> >      but not for the failure case of device-add.  To avoid changing
> >      monitor semantics, just drain all pending RCU callbacks on error.
> >      Signed-off-by: Maxim Levitsky <mlevi...@redhat.com>
> >      Suggested-by: Stefan Hajnoczi <stefa...@gmail.com>
> >      Reviewed-by: Stefan Hajnoczi <stefa...@redhat.com>
> >      Message-Id: <20200913160259.32145-4-mlevi...@redhat.com>
> >      [Don't use it in qmp_device_del. - Paolo]
> >      Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> > 
> > diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
> > index e9b7228480..bcfb90a08f 100644
> > --- a/softmmu/qdev-monitor.c
> > +++ b/softmmu/qdev-monitor.c
> > @@ -803,6 +803,18 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
> >           return;
> >       }
> >       dev = qdev_device_add(opts, errp);
> > +
> > +    /*
> > +     * Drain all pending RCU callbacks. This is done because
> > +     * some bus related operations can delay a device removal
> > +     * (in this case this can happen if device is added and then
> > +     * removed due to a configuration error)
> > +     * to a RCU callback, but user might expect that this interface
> > +     * will finish its job completely once qmp command returns result
> > +     * to the user
> > +     */
> > +    drain_call_rcu();
> > +
> >       if (!dev) {
> >           qemu_opts_del(opts);
> >           return;
> > 
> > 
> > 
> > So maybe just move drain_call_rcu under if (!dev) then and be done with
> > it?
> > 
> 
> Hmm, I read the commit message thinking that it mentions device removal 
> by mistake and actually means to cover both device_add and device_del. 
> But I was wrong?
> 
> Hmm, it directly says "just drain all pending RCU callbacks on error", but 
> the code does that on the success path as well.
> 
> Yes, moving drain_call_rcu makes sense for me, and will close the second 
> "motivation". I can make a patch.
> 
> -- 
> Best regards,
> Vladimir
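
For reference, a minimal sketch of what moving the drain under the
if (!dev) check could look like. This is not the actual patch: the _stub
and _sketch names below are hypothetical stand-ins for qdev_device_add(),
drain_call_rcu() and qmp_device_add(), used only so the control flow can be
compiled and checked on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QEMU pieces involved; not the real API. */
typedef struct DeviceState { int unused; } DeviceState;

static int drain_calls;        /* counts how often the drain ran */
static DeviceState fake_dev;   /* dummy device returned on success */

/* Stub: pretend to drain all pending RCU callbacks. */
static void drain_call_rcu_stub(void)
{
    drain_calls++;
}

/* Stub: returns NULL to simulate a failed device_add. */
static DeviceState *qdev_device_add_stub(bool fail)
{
    return fail ? NULL : &fake_dev;
}

/* Sketch of the add path with the drain moved to the error branch only. */
static bool device_add_sketch(bool fail)
{
    DeviceState *dev = qdev_device_add_stub(fail);

    if (!dev) {
        /*
         * The device may have been added and then removed again through
         * an RCU callback; wait for that before reporting the error.
         */
        drain_call_rcu_stub();
        return false;
    }

    return true;    /* success path: no drain, so no extra latency */
}
```

With this shape a successful add never pays for the drain, which matches
the "on error" wording of the commit message quoted above.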
