>> Inside object_unparent(), the call flow of unrealize steps is defined. >> By moving the "real unplug" part into "do_unplug" and therefor >> essentially calling it when unrealizing, we could generalize this for >> all unplug handlers. >> I think, order of realization and therefore the order of hotplug handler >> calls is strictly defined already. Same applies to unrealization if we >> would factor the essential parts out into e.g. "do_unplug". That order >> is strictly encoded in device_set_realized() and bus_set_realized(). > > I don't see any benefits in adding do_unplug() in this case, > it would essentially replace hotplug_handler_unplug() in event origin > point with object_unparent() and renaming unplug() to do_unplug(). > > As result object_unparent() will start do more than unparenting and > destroying the device and a point where unplug originates becomes > poorly documented/lost for a reader among other object_unparent() calls. > > hotplug handlers are controller businesses and not the device one so > hiding (generalizing) it inside of device.realize() doesn't look > the right this to do, I'd rather keep these things separate as much > as possible.
As long as we find another way to make this work I am happy. What I propose here works (and in my view does not violate any concepts, it just extends hotunplug handlers to device hierarchies getting unplugged). But I also understand the potential problems you think we could have in the future. Let's see if we can make your suggestion work. > >>> If I remember right, the suggested and partially implemented idea >>> in one of your previous series was to override default hotplug >>> handler with a machine one for plugged in device [1][2]. >>> However impl. wasn't exactly what I've suggested since it matches >>> all memory-devices. >>> >>> 1) qdev: let machine hotplug handler to override bus hotplug handler >>> 2) pc: route all memory devices through the machine hotplug handler >>> >>> So lets reiterate, we have TYPE_VIRTIO_PMEM and TYPE_VIRTIO_PMEM_PCI >>> the former implements TYPE_MEMORY_DEVICE interface and the later is >>> a wrapper PCI/whatnot device shim. >>> So when you plug that composite device you'd get 2 independent >>> plug hooks called, which makes it unrelable/broken design. >> >> Can you elaborate why this is broken? I don't consider the >> realize/unrealize order broken, and that is where we plug into. But yes, >> we logically plug a device hierarchy and therefore get a separate >> hotplug handler calls. > > 1st: > > consider we have a composite device A that contains B explicitly > manged by A (un)realizefn(). Now if we use your model of independent > callbacks we have only following fixed plug flow possible: > a>realize() > ::device_set_realized() > a:realizefn() > b->realize() > ::device_set_realized() > b:realizefn() > hotplug_handler_plug(b) => b_hotplug_callback > ... > hotplug_handler_plug(a) => a_hotplug_callback > > that limits composite device wiring to external components to > the only possible order That why we have pre_plug, plug and post_plug handlers for I guess ... > 1st: b_hotplug_callback() + 2nd: a_hotplug_callback > and other way around for unplug() > > That however might not be order we need to do wiring/teardown though, > depending on composite device and controllers it might be necessary to > call callbacks in opposite order or mix both into one to do the wiring > correctly. And to do that we would need drop (move) b_hotplug_callback() > into a_hotplug_callback(), i.e. make it the way I've originally suggested. > > hotplug callback call flow might be different in child_bus case > (well, it's different in current code) and it might change in future > (ex: I'm looking into dropping hotplug_handler_post_plug() and > moving hotplug_handler_plug() to a later point to address the same > issue as commits 25e89788/8449bcf94 but without post_plug callback). .. however that sounds like a good idea, I was wondering the same why we actually need the post_plug handler. > > It's more robust for devices do not depend heavily on specific order > and define its plug sequence explicitly outside of device core. > This way it won't break apart when device core code is amended. > > 2nd: > from user point of view (and API) when composite device is created > we are dealing with 1 device (even if it's composed of many others internally, > it's not an external user business). The same should apply to hotplug > handlers. i.e. wire that device using whatever controllers are necessary > but do not jump through layers inside of device from external code > which hotplug handlers are. It somehow looks strange to have plug handler scattered all over device_set_realize(), while unplug handlers are at a completely different place and not involved in unrealize(). This makes one think that hotplugging a device hierarchy works, while unplugging does currently not work. > >>> My next question would be why TYPE_VIRTIO_PMEM_PCI can't implement >>> TYPE_MEMORY_DEVICE interface and TYPE_VIRTIO_PMEM be a simple VIRTIO >>> device without any hotplug hooks (so shim device would proxy all >>> memory-device logic to its child)? >>> /huh, then you don't need get_device_id() as well/ >> >> I had the same idea while going through different options. Then we would >> have to forward all calls directly to the child. We cannot reuse >> TYPE_MEMORY_DEVICE, so we would either need a new interface or define >> the functions we want manually for each such device. > for all I care parent device could alias out its memory-device api to a child > or just outright access/call child internal APIs/members to do the job > (that's what virtio devices often do), but that's the price to pay for > ability to change type of the end device. > >>> That way using [2] and [1 - modulo it should match only concrete type] >>> machine would be able to override hotplug handlers for TYPE_VIRTIO_PMEM_PCI >>> and explicitly call machine + pci hotplug handlers in necessary order. >>> >>> flow would look like: >>> [acpi|shcp|native pci-e eject]-> >>> hotplug_ctrl = qdev_get_hotplug_handler(dev); >>> hotplug_handler_unplug(hotplug_ctrl, dev, &local_err); -> >>> machine_unplug() >>> machine_virtio_pci_pmem_cb(): >>> // we now that's device has 2 stage hotplug handlers, >>> // so we can arrange hotplug sequence in necessary order >>> hotplug_ctrl2 = qdev_get_bus_hotplug_handler(dev); >>> >>> //then do unplug in whatever order that's correct, >>> // I'd assume tear down/stop PCI device first, flushing >>> // command virtio command queues and that unplug memory >>> itself. >>> hotplug_handler_unplug(hotplug_ctrl2, dev, &local_err); >>> memory_device_unplug() >>> >>> Similar logic applies to device_add/device_del paths, with a difference that >>> origin point would be monitor/qmp. >> >> Let's see. User calls device_del(). That triggers an unplug_request. For >> virtio-pmem, there is nothing to do. > virtio-pmem is not a concrete device though, it's just internal thingy > inside of concrete device, the actual virtio-pmem-pci device is associated > with a pci controller that notifies guest about unplug request using > whatever hotplug method was configured for the controller (shpc/native > pci-e/acpi). > >> eject hook is called by the guest. For now we do an object_unparent. >> This would now be wrong. We would have to call a proper hotplug handler >> chain (I guess that's the refactoring you mentioned above). > it works for typical pci devices but it won't work for composite > ones that need access to controllers/resources not provided by > it's direct parent. At minimum it should be object_unparent() wrapped > into unplug() callback callback set for the pci bus if we'd look for shallow > conversion. But I haven't looked in details... I will do that during the next weeks. I'll resend the unrelated memory device changes for now. > >>> Point is to have a single explicit callback chain that applies to a concrete >>> device type. That way if ever change an ordering of calling plug callbacks >>> in qdev core, the expected for a device callback sequence would still >>> remain in place ensuring that device (un)plugged as expected. >> >> I haven't tested yet if this will work, but I can give it a try. I >> learned that in QEMU things often seem easier than they actually are :) >> >>> >>> Also it should be easier to trace for a reader, than 2 disjoint callbacks of >>> composite device (which only know to author of that device and then only >>> till he/she remembers how that thing works). >> >> In my view it makes things slightly more complicated, because you have >> to follow a hotplug handler chain that plugs devices via proxy devices. >> (e.g. passing through TYPE_MEMORY_DEVICE calls to a child, and therefore >> essentially hotplugging the child), instead of only watching out for >> which device get's hotplugged and finding exactly one hotplug handler. >> Of course, for a device hierarchy, multiple devices get logically >> hotplugged. > it's child only from internal point of view (in virtio design it's > virtio interface to share logic between different transports), > users don't really care about it. > It's job of proxy to forward data between external/internal world, > unfortunately adds more boilerplate but that's how virtio has been > designed. > As for following explicitly defined hotplug handlers chain, > it's explict and relevant parts are close to each other so it's easier > to understand what's going on, than trying to figure out implicit > callbacks sequence and how they are related. > So to summarize, you clearly favor a hotplug chain on a single device over multiple hotplug handlers for separate devices in a device hierarchy. -- Thanks, David / dhildenb