Cc'ing Damien who is working on "multi-phase reset mechanism". On 5/27/19 9:52 AM, Markus Armbruster wrote: > Peter Maydell <peter.mayd...@linaro.org> writes: > >> On Fri, 24 May 2019 at 20:47, Christian Borntraeger >> <borntrae...@de.ibm.com> wrote: >>> While this patch is certainly ok, I find it disturbing that qdev devices >>> are being resetted, >>> but qom devices not. >> >> It's not a qdev-vs-QOM thing. Anything which is a DeviceState >> has a reset method, but only devices which are somewhere >> rooted in the bus-tree that starts with the "main system >> bus" (aka sysbus) get reset by the vl.c-registered "reset >> everything on the system bus". Devices which are SysBusDevice >> get auto-parented onto the sysbus, and so get reset. Devices >> like PCI devices or SCSI devices get put onto the PCI >> bus or the SCSI bus, and those buses are in turn children >> of some host-controller device which is on the sysbus, so >> they all get reset. The things that don't get reset are >> "orphan" devices which are neither (a) of a type that gets >> automatically parented onto a bus like SysBusDevice nor >> (b) put specifically onto some other bus. >> >> CPU objects are the other common thing that doesn't get >> reset 'automatically'. >> >> Suggestions for how to restructure reset so this doesn't >> happen are welcome... "reset follows the bus hierarchy" >> works well in some places but is a bit weird in others >> (for SoC containers and the like "follow the QOM >> hierarchy" would make more sense, but I have no idea >> how to usefully transition to a model where you could >> say "for these devices, follow QOM tree for reset" or >> what an API for that would look like). > > Here's a QOM composition tree for the ARM virt machine (-nodefaults > -device e1000) as visible in qom-fuse under /machine, with irq and > qemu:memory-region ommitted for brevity: > > machine virt-4.1-machine > +-- fw_cfg fw_cfg_mem > +-- peripheral container > +-- peripheral-anon container > | +-- device[0] e1000 > +-- unattached container > | +-- device[0] cortex-a15-arm-cpu > | +-- device[1] arm_gic > | +-- device[2] arm-gicv2m > | +-- device[3] pl011 > | +-- device[4] pl031 > | +-- device[5] gpex-pcihost > | | +-- pcie.0 PCIE > | | +-- gpex_root gpex-root > | +-- device[6] pl061 > | +-- device[7] gpio-key > | +-- device[8] virtio-mmio > | | +-- virtio-mmio-bus.0 virtio-mmio-bus > | . > | . more virtio-mmio > | . > | +-- device[39] virtio-mmio > | | +-- virtio-mmio-bus.31 virtio-mmio-bus > | +-- device[40] platform-bus-device > | +-- sysbus System > +-- virt.flash0 cfi.pflash01 > +-- virt.flash1 cfi.pflash01 > > Observations: > > * Some components of the machine are direct children of machine: fw_cfg, > virt.flash0, virt.flash1 > > * machine additionally has a few containers: peripheral, > peripheral-anon, unattached. > > * machine/peripheral and machine/peripheral-anon contain the -device > with and without ID, respectively. > > * machine/unattached contains everything else created by code without an > explicit parent device. Some (all?) of them should perhaps be direct > children of machine instead. > > Compare to the qdev tree shown by info qtree: > > bus: main-system-bus > type System > dev: platform-bus-device, id "platform-bus-device" > dev: fw_cfg_mem, id "" > dev: virtio-mmio, id "" > bus: virtio-mmio-bus.31 > type virtio-mmio-bus > ... more virtio-mmio > dev: virtio-mmio, id "" > bus: virtio-mmio-bus.0 > type virtio-mmio-bus > dev: gpio-key, id "" > dev: pl061, id "" > dev: gpex-pcihost, id "" > bus: pcie.0 > type PCIE > dev: e1000, id "" > dev: gpex-root, id "" > dev: pl031, id "" > dev: pl011, id "" > dev: arm-gicv2m, id "" > dev: arm_gic, id "" > dev: cfi.pflash01, id "" > dev: cfi.pflash01, id "" > > Observations: > > * Composition tree root machine's containers are not in the qtree. > > * Composition tree node cortex-a15-arm-cpu is not in the qtree. That's > because it's not a qdev (in QOM parlance: not a TYPE_DEVICE). > > * In the qtree, every other inner node is a qbus. These are *leaves* in > the composition tree. The qtree's vertex from qbus to qdev is a > *link* in the composition tree. > > Example: main-system-bus -> pl011 is > machine/unattached/sysbus/child[4] -> > ../../../machine/unattached/device[3]. > > Example: main-system-bus/gpex-pcihost/pcie.0 -> e1000 is > machine/unattached/device[5]/pcie.0//child[1] -> > ../../../../machine/peripheral-anon/device[0]. > > Now let me ramble a bit on reset. > > We could model the reset wiring explicitly: every QOM object that wants > to participate in reset has a reset input pin. We represent the wiring > as links. The reset links form a reset tree. > > Example: object virt-4.1-machine at machine gets a link reset[4] > pointing to its child object cfi.pflash01 at machine/virt.flash0. > > Example: object PCIE at machine/unattached/device[5]/pcie.0 gets a link > reset[1] pointing to object e1000 at > machine/peripheral-anon/device[0]. > > Note that reset[1] points to the same object as child[1]. > > Adding all these links in code would be tiresome. Enter defaults. > > If an object doesn't get its reset input connected, and > > * the parent is machine/peripheral or machine/peripheral-anon, default > to the object that links to it in the composition tree. Error when > there's more than one. > > Example: object e1000 at machine/peripheral-anon/device[0] defaults to > machine/unattached/device[5]/pcie.0. > > * the parent is machine/unattached, default to machine. > > Example: object gpex-pcihost at machine/unattached/device[5] defaults > to machine. > > * the parent is not one of these containers, default to its parent in > the composition tree. > > Example: object cfi.pflash01 at machine/virt.flash0 defaults to > machine. > > This leaves the order in which an object asserts its reset wires > unspecified. In physical hardware, these guys get asserted > simultaneously, and the various pieces of silicon at the other ends > reset in parallel. In software, we serialize. > > We could pick some order by numbering the reset links, say reset[0], > reset[1], ... > > Reset could then work by walking the reset tree, running device model > callbacks before and after walking the children. > > Cases may well exist where reset is more complicated than that. Perhaps > an object resets its children in the reset tree at different times > during its own reset. Such cases need to modelled in device model code. > That's another callback. > > This is quite possibly too naive to be practical. Hey, you asked :)
Thanks a lot for this extensive analysis :) I guess you indirectly answered to some of the issues Peter pointed in https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg03667.html Markus, if your bandwidth allows it, you might want to have a look at it :)