On 10/04/16 18:10, Laine Stump wrote:
> On 10/04/2016 11:40 AM, Laszlo Ersek wrote:
>> On 10/04/16 16:59, Daniel P. Berrange wrote:
>>> On Mon, Sep 05, 2016 at 06:24:48PM +0200, Laszlo Ersek wrote:
>>>> On 09/01/16 15:22, Marcel Apfelbaum wrote:
>>>>> +2.3 PCI only hierarchy
>>>>> +======================
>>>>> +Legacy PCI devices can be plugged into pcie.0 as Integrated Devices
>>>>> +or into a DMI-PCI bridge. PCI-PCI bridges can be plugged into
>>>>> +DMI-PCI bridges and can be nested to a depth of 6-7. DMI-PCI
>>>>> +bridges should be plugged only into the pcie.0 bus.
>>>>> +
>>>>> +   pcie.0 bus
>>>>> +   ----------------------------------------------
>>>>> +        |                            |
>>>>> +   -----------               ------------------
>>>>> +   | PCI Dev |               | DMI-PCI BRIDGE |
>>>>> +   -----------               ------------------
>>>>> +                               |            |
>>>>> +                      -----------    ------------------
>>>>> +                      | PCI Dev |    | PCI-PCI Bridge |
>>>>> +                      -----------    ------------------
>>>>> +                                       |           |
>>>>> +                                 -----------   -----------
>>>>> +                                 | PCI Dev |   | PCI Dev |
>>>>> +                                 -----------   -----------
>>>>
>>>> Works for me, but I would again elaborate a little bit on keeping
>>>> the hierarchy flat.
>>>>
>>>> First, in order to preserve compatibility with libvirt's current
>>>> behavior, let's not plug a PCI device directly into the DMI-PCI
>>>> bridge, even if that's possible otherwise. Let's just say
>>>>
>>>> - there should be at most one DMI-PCI bridge (if a legacy PCI
>>>>   hierarchy is required),
>>>
>>> Why do you suggest this? If the guest has multiple NUMA nodes and
>>> you're creating a PXB for each NUMA node, then it looks valid to
>>> want to have a DMI-PCI bridge attached to each PXB, so you can have
>>> legacy PCI devices on each NUMA node, instead of putting them all on
>>> the PCI bridge without NUMA affinity.
>>
>> You are right. I meant the above within one PCI Express root bus.
>>
>> Small correction to your wording though: you don't want to attach the
>> DMI-PCI bridge to the PXB device, but to the extra root bus provided
>> by the PXB.
>
> This made me realize something - the root bus on a pxb-pcie controller
> has a single slot, and that slot can accept either a pcie-root-port
> (ioh3420) or a dmi-to-pci-bridge. If you want to have both express and
> legacy PCI devices on the same NUMA node, then you would either need
> to create one pxb-pcie for the pcie-root-port and another for the
> dmi-to-pci-bridge, or you would need to put the pcie-root-port and the
> dmi-to-pci-bridge onto different functions of the single slot. Should
> the latter work properly?
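For concreteness, the latter arrangement -- pcie-root-port and
dmi-to-pci-bridge as two functions of slot 0 on the pxb-pcie root bus
-- might look something like this on the QEMU command line (a sketch
only; the IDs and the bus_nr, numa_node, chassis and slot values are
made up for illustration):

  -device pxb-pcie,id=pxb1,bus=pcie.0,bus_nr=0x40,numa_node=1 \
  -device ioh3420,id=rp1,bus=pxb1,addr=0x0.0x0,multifunction=on,chassis=1,slot=1 \
  -device i82801b11-bridge,id=dmi1,bus=pxb1,addr=0x0.0x1 \
  -device pci-bridge,id=br1,bus=dmi1,chassis_nr=1,addr=0x1 \
  -device e1000e,bus=rp1 \
  -device e1000,bus=br1,addr=0x2

Here the root port sits at function 0 of the pxb's single slot (with
multifunction=on) and the DMI-PCI bridge at function 1 of the same
slot; the express NIC hangs off the root port, and the legacy NIC off
a hotplug-capable PCI-PCI bridge behind the DMI-PCI bridge.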
Yes, I expect so. (Famous last words? :))

>>>> - only PCI-PCI bridges should be plugged into the DMI-PCI bridge,
>>>
>>> What's the rationale for that, as opposed to plugging devices
>>> directly into the DMI-PCI bridge, which seems to work?
>>
>> The rationale is that libvirt used to do it like this.
>
> Nah, that's just the *result* of the rationale that we wanted the
> devices to be hotpluggable. At some later date we learned that hotplug
> on a pci-bridge device doesn't work on a Q35 machine anyway, so it was
> kind of pointless (but we still do it because we hold out hope that
> hotplug of legacy PCI devices into a pci-bridge on Q35 machines will
> work one day).
>
>> And the rationale for *that* is that DMI-PCI bridges cannot accept
>> hotplugged devices, while PCI-PCI bridges can.
>>
>> Technically nothing forbids (AFAICT) cold-plugging PCI devices into
>> DMI-PCI bridges, but this document is expressly not just about
>> technical constraints -- it's a policy document. We want to simplify
>> / trim the supported PCI and PCI Express hierarchies as much as
>> possible.
>>
>> All valid *high-level* topology goals should be permitted / covered
>> one way or another by this document, but in as few ways as possible
>> -- hopefully only one way. For example, if you read the rest of the
>> thread, flat hierarchies are preferred to deeply nested hierarchies,
>> because flat ones save on bus numbers
>
> Do they?

Yes. Nesting implies bridges, and bridges take up bus numbers. For
example, in a PCI Express switch, the upstream port of the switch
consumes a bus number with no practical usefulness.

IIRC we collectively devised a flat pattern elsewhere in the thread
where you could exhaust the 0..255 bus number space such that almost
every bridge (= taking up a bus number) would also be capable of
accepting a hot-plugged or cold-plugged PCI Express device. That is,
practically no wasted bus numbers.

Hm... search this message for "population algorithm":

https://www.mail-archive.com/qemu-devel@nongnu.org/msg394730.html

and then Gerd's big improvement / simplification on it, with
multifunction:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg395437.html

In Gerd's scheme, you'd need only one or two (I'm too lazy to count
exactly :)) PCI Express switches to exhaust all bus numbers. Minimal
waste due to upstream ports. (A command-line sketch of the
multifunction idea follows at the end of this message.)

Thanks
Laszlo

>> , are easier to set up and understand, probably perform better, and
>> don't lose any generality for cold- or hotplug.
>>
>> Thanks
>> Laszlo
>>
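To make the bus number arithmetic concrete, here is a sketch of the
multifunction idea (an illustration only, not the exact scheme from
Gerd's message; the IDs and the chassis, slot and addr values are made
up):

  # A PCI Express switch burns one bus number on its upstream port
  # before the first downstream port can accept anything: four bus
  # numbers for two hotplug-capable slots.
  -device ioh3420,id=rp1,bus=pcie.0,chassis=1,slot=1 \
  -device x3130-upstream,id=up1,bus=rp1 \
  -device xio3130-downstream,id=dp1,bus=up1,chassis=2,slot=0 \
  -device xio3130-downstream,id=dp2,bus=up1,chassis=2,slot=1

  # Multifunction root ports: up to eight functions of a single slot,
  # one bus number each, every one of them hotplug-capable.
  -device ioh3420,id=mrp0,bus=pcie.0,addr=0x2.0x0,multifunction=on,chassis=3,slot=0 \
  -device ioh3420,id=mrp1,bus=pcie.0,addr=0x2.0x1,chassis=3,slot=1 \
  -device ioh3420,id=mrp2,bus=pcie.0,addr=0x2.0x2,chassis=3,slot=2
  # ... and so on, up to addr=0x2.0x7.

In the first fragment the upstream port's bus number can never have an
endpoint on it; in the second, every bus number consumed belongs to a
port that can take a hot-plugged express device.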