[RFC] SPAPR-PCI Hotplug Support in Qemu

Background:

ppc64 has a unique bus structure for PCI slots: each slot is connected
to its PHB by a PCI switch. This is true of some IBM hardware as well
as of paravirtual hardware (PAPR).
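
As a rough illustration of the topology just described (a PHB root
bus, a PCI switch per slot, and the device behind that switch), here
is a minimal C model. The types and the function name are hypothetical
and exist only for this sketch; they are not the QEMU or Linux kernel
structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not QEMU or kernel structures. */
struct model_bus;

struct model_dev {
    const char *name;
    struct model_bus *bus;          /* bus this device sits on */
    struct model_bus *subordinate;  /* bus behind the device; set only for a switch */
};

struct model_bus {
    const char *name;
    struct model_dev *self;         /* switch that creates this bus; NULL for the PHB root bus */
};

/* Count the switches between a device and its PHB root bus. */
int switches_to_phb(const struct model_dev *dev)
{
    int n = 0;
    const struct model_bus *bus;

    assert(dev != NULL);
    bus = dev->bus;
    while (bus != NULL && bus->self != NULL) {
        n++;                        /* one switch crossed */
        bus = bus->self->bus;       /* step up to the switch's own bus */
    }
    return n;
}
```

In this model, a device sitting on the bus behind a slot's switch is
one switch away from the PHB root bus, while the switch itself sits
directly on the root bus.
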

SLOF firmware normally scans the hardware bus and creates the correct
slot/PCI switch complex in the open firmware device tree. It also
configures the slot and PCI switch (BARs, etc.).

For devices set up by platform firmware, each PCI device is attached
to its PHB and correctly configured. For devices hot-plugged under
Linux running on PowerVM today, each device is created with a PCI
switch hanging off the dev->subordinate pointer. (PowerVM gets this
info from the open firmware device tree in RTAS.)

Problem:

The Qemu hot-plug path doesn't anticipate a PCI switch being attached
to every PHB slot. When hot-plugging a device, Qemu qdev creates the
device, which allows the device to initialize itself. Qemu then passes
this initialized device to the ppc PHB via the hot-plug path. [1]

The current ppc hot-plug code then creates a device tree node for the
device [2] and allocates resources (BARs, etc.) for the new device. [3]

The ppc64 kernel expects each hot-plugged PCI device structure to
point to a subordinate bus via dev->subordinate. This assumption is
held throughout the ppc PCI code, and there are numerous opportunities
for panics when the device gets passed to a kernel routine without a
valid subordinate pointer. [4]

Proposed Solutions:

(1) Create and hook an inert PCI switch to every hot-plugged PCI
    device in Qemu.

    (a) After the device has initialized itself, at hot-plug time,
        create a new PCI switch, configure the switch, allocate BARs,
        and attach the switch to the hot-plugged device
        (dev->subordinate).

    (b) Create a new device tree node that begins with the PCI switch
        as the parent of the hot-plugged device. Add the PCI
        switch/device complex to the device tree under the PHB.

(2) Add each hot-plugged PCI device to its own complex of PHB (PCI
    Host Bridge) and PCI switch. This simplifies (1) by creating a new
    PHB for each hot-plugged device.

    (a) At PHB creation time, create a PCI switch device node for each
        PHB slot.
    (b) At hot-plug time, create and configure a new PHB and add the
        hot-plugged device to one of the slots. Configure and allocate
        resources as normal.

Comments:

The current code has only one PHB. We know we ultimately need to
support more than one PHB. Solution #2 is consistent with this
approach.

[1] https://github.com/mdroth/qemu/blob/spapr-pci-hotplug/hw/ppc/spapr_pci.c

[2] ibm,rtas_configure_connector:
    https://github.com/mdroth/qemu/blob/spapr-pci-hotplug/hw/ppc/spapr_pci.c#L575

[3] spapr_phb_add_pci_dt:
    https://github.com/mdroth/qemu/blob/spapr-pci-hotplug/hw/ppc/spapr_pci.c#L900

[4] dlpar_pci_add_bus:
    http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/tree/drivers/pci/hotplug/rpadlpar_core.c?id=8bf3379a74bc9132751bfa685bad2da318fd59d7#n165

--
Mike Day | + 1 919 371-8786 | ncm...@ncultra.org
"Endurance is a Virtue"