Akihiko Odaki <akihiko.od...@daynix.com> writes:

> On 2023/12/21 1:46, Zhao Liu wrote:
>> Hi Markus,
>> On Wed, Dec 20, 2023 at 08:53:21AM +0100, Markus Armbruster wrote:
>>> Date: Wed, 20 Dec 2023 08:53:21 +0100
>>> From: Markus Armbruster <arm...@redhat.com>
>>> Subject: Re: [PATCH v2] qdev: Report an error for machine without
>>>  HotplugHandler
>>>
>>> Akihiko Odaki <akihiko.od...@daynix.com> writes:
>>>
>>>> On 2023/12/18 23:02, Markus Armbruster wrote:
>>>>> Akihiko Odaki <akihiko.od...@daynix.com> writes:
>>>>>
>>>>>> On 2023/12/11 15:51, Markus Armbruster wrote:
>>>>>>> Akihiko Odaki <akihiko.od...@daynix.com> writes:
>>>>>>>
>>>>>>>> The HotplugHandler of the machine will be used when the parent bus does
>>>>>>>> not exist, but the machine may not have one. Report an error in such a
>>>>>>>> case instead of aborting.
>>>>>>>>
>>>>>>>> Fixes: 7716b8ca74 ("qdev: HotplugHandler: Add support for unplugging
>>>>>>>> BUS-less devices")
>>>>>>>> Signed-off-by: Akihiko Odaki <akihiko.od...@daynix.com>
>>>>>>>
>>>>>>> Do you have a reproducer for the crash?
>>>>>>>
>>>>>>>> ---
>>>>>>>> Changes in v2:
>>>>>>>> - Fixed indention.
>>>>>>>> - Link to v1:
>>>>>>>> https://lore.kernel.org/r/20231202-bus-v1-1-f7540e3a8...@daynix.com
>>>>>>>> ---
>>>>>>>>  system/qdev-monitor.c | 13 ++++++++++---
>>>>>>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
>>>>>>>> index a13db763e5..5fe5d49c20 100644
>>>>>>>> --- a/system/qdev-monitor.c
>>>>>>>> +++ b/system/qdev-monitor.c
>>>>>>>> @@ -927,9 +927,16 @@ void qdev_unplug(DeviceState *dev, Error **errp)
>>>>>>>    void qdev_unplug(DeviceState *dev, Error **errp)
>>>>>>>    {
>>>>>>>        DeviceClass *dc = DEVICE_GET_CLASS(dev);
>>>>>>>        HotplugHandler *hotplug_ctrl;
>>>>>>>        HotplugHandlerClass *hdc;
>>>>>>>        Error *local_err = NULL;
>>>>>>>        if (qdev_unplug_blocked(dev, errp)) {
>>>>>>>            return;
>>>>>>>        }
>>>>>>>        if (dev->parent_bus && !qbus_is_hotpluggable(dev->parent_bus)) {
>>>>>>>            error_setg(errp, QERR_BUS_NO_HOTPLUG,
>>>>>>>                       dev->parent_bus->name);
>>>>>>>            return;
>>>>>>>        }
>>>>>>>        if (!dc->hotpluggable) {
>>>>>>>            error_setg(errp, QERR_DEVICE_NO_HOTPLUG,
>>>>>>>                       object_get_typename(OBJECT(dev)));
>>>>>>>            return;
>>>>>>>        }
>>>>>>>        if (!migration_is_idle() &&
>>>>>>>            !dev->allow_unplug_during_migration) {
>>>>>>>            error_setg(errp, "device_del not allowed while migrating");
>>>>>>>            return;
>>>>>>>        }
>>>>>>>
>>>>>>>>      qdev_hot_removed = true;
>>>>>>>>      hotplug_ctrl = qdev_get_hotplug_handler(dev);
>>>>>>>> -    /* hotpluggable device MUST have HotplugHandler, if it doesn't
>>>>>>>> -     * then something is very wrong with it */
>>>>>>>> -    g_assert(hotplug_ctrl);
>>>>>>>> +    if (!hotplug_ctrl) {
>>>>>>>> +        /*
>>>>>>>> +         * hotpluggable bus MUST have HotplugHandler, if it doesn't
>>>>>>>> +         * then something is very wrong with it
>>>>>>>> +         */
>>>>>>>> +        assert(!dev->parent_bus);
>>>>>>>> +
>>>>>>>> +        error_setg(errp, "The machine does not support hotplugging for a device without parent bus");
>>>>>>>> +        return;
>>>>>>>> +    }
>>>>>>>
>>>>>>> Extended version of my question above: what are the devices where
>>>>>>> qdev_get_hotplug_handler(dev) returns null here?
>>>>>>
>>>>>> Start a VM: qemu-system-aarch64 -M virt -nographic
>>>>>> Run the following on its HMP: device_del /machine/unattached/device[0]
>>>>>>
>>>>>> It tries to unplug cortex-a15-arm-cpu and crashes.
>>>>>
>>>>> This device has no parent bus (dev->parent_bus is null), but is marked
>>>>> hot-pluggable (dc->hotpluggable is true). Question for somebody
>>>>> familiar with the hot-plug machinery: is this sane?
>>>>
>>>> Setting hotpluggable false for each device without bus_type gives the same
>>>> effect, but is error-prone.
>>>
>>> Having hotpluggable = true when the device cannot be hot-plugged is
>>> *wrong*. You might be able to paper over the wrongness so the code
>>> works anyway, but nothing good can come out of lying to developers
>>> trying to understand how the code works.
>>>
>>> Three ideas to avoid the lying:
>>>
>>> 1. default hotpluggable to bus_type != NULL.
>
> I don't have an idea to achieve that. Currently bus_type is set after
> hotpluggable.
>
>>>
>>> 2. assert(dc->bus_type || !dc->hotpluggable) in a suitable spot.
>
> It results in abortion and doesn't improve the situation.

Oh, it does! The abort leads us to all the places where we currently
lie (by having dc->hotpluggable = true when the device is not actually
hot-pluggable), so we can fix them.

>>> 3. Change the meaning of hotpluggable, and rename it to reflect its new
>>>    meaning. Requires a careful reading of its uses. I wouldn't go there.
>
> I don't have an idea for such a naming.
>
> So I'm stuck with the current proposal. It suppresses abortion at least. Any
> alternative idea is welcome.
>
>>>
>> What about 4 (or maybe 3.1) - dropping this hotpluggable flag and just use a
>> helper (like qbus) to check if device is hotpluggable?
>> This removes the confusion of that flag and also reduces the number of
>> configuration items for DeviceState that require developer attention.
>> A simple helper is as follows:
>
> Some devices simply don't support hotplugging even if the bus supports it.
> virtio-gpu-pci doesn't support hotplugging because the display infrastructure
> cannot handle hotplugging, for example.
>
> Regards,
> Akihiko Odaki
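
To make idea 2 concrete, here is a minimal standalone sketch of the
invariant assert(dc->bus_type || !dc->hotpluggable). It is not QEMU code:
ToyDeviceClass and the toy_* functions are hypothetical stand-ins that
model only the two fields under discussion, and where such a check would
actually live in qdev is left open. The non-aborting predicate shows how
the same condition could first be used to list the classes that lie before
turning it into a hard assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for DeviceClass; models only the two fields
 * discussed in this thread. */
typedef struct ToyDeviceClass {
    const char *name;
    const char *bus_type;   /* NULL for a bus-less device */
    bool hotpluggable;
} ToyDeviceClass;

/* The invariant from idea 2: a class without a bus type must not
 * claim to be hot-pluggable. */
bool toy_class_flags_consistent(const ToyDeviceClass *dc)
{
    return dc->bus_type != NULL || !dc->hotpluggable;
}

/* Non-aborting audit: count the classes that violate the invariant. */
size_t toy_count_inconsistent(const ToyDeviceClass *classes, size_t n)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++) {
        if (!toy_class_flags_consistent(&classes[i])) {
            bad++;
        }
    }
    return bad;
}

/* Aborting variant: fails loudly at the first class that lies. */
void toy_class_check(const ToyDeviceClass *dc)
{
    assert(toy_class_flags_consistent(dc));
}
```

A class shaped like the CPU in the reproducer (bus_type NULL, hotpluggable
true) fails toy_class_flags_consistent(), while a bus-less class with
hotpluggable false and a class with a bus type both pass.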