On Mon, Oct 26, 2015 at 07:57:16PM +0800, Cao jin wrote: > Hi, > warning, there is a long story below O:-) > > On 10/26/2015 04:29 PM, Michael S. Tsirkin wrote: > >On Mon, Oct 26, 2015 at 11:29:18AM +0800, Cao jin wrote: > >>Enable pcie device multifunction hot, just ensure the function 0 > >>added last, then driver will got the notification to scan all the > >>function in the slot. > >> > >>Signed-off-by: Cao jin <caoj.f...@cn.fujitsu.com>
In my opinion, what's confusing you is that you keep thinking about ARI. ARI is just a way where PCI_SLOT(devfn) can be != 0 when you are actually just part of the same device. But with or without ARI, PCIE devices with upstream ports can only occupy slot 0. So let's check for that. > >>--- > >> hw/pci/pci.c | 31 ++++++++++++++++++++++++++++++- > >> hw/pci/pci_host.c | 13 +++++++++++-- > >> hw/pci/pcie.c | 18 +++++++++--------- > >> include/hw/pci/pci.h | 1 + > >> 4 files changed, 51 insertions(+), 12 deletions(-) > >> > >>diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >>index b0bf540..6f43b12 100644 > >>--- a/hw/pci/pci.c > >>+++ b/hw/pci/pci.c > >>@@ -847,6 +847,9 @@ static PCIDevice *do_pci_register_device(PCIDevice > >>*pci_dev, PCIBus *bus, > >> PCIConfigWriteFunc *config_write = pc->config_write; > >> Error *local_err = NULL; > >> AddressSpace *dma_as; > >>+ DeviceState *dev = DEVICE(pci_dev); > >>+ > >>+ pci_dev->bus = bus; > >> > >> if (devfn < 0) { > >> for(devfn = bus->devfn_min ; devfn < ARRAY_SIZE(bus->devices); > >>@@ -864,9 +867,17 @@ static PCIDevice *do_pci_register_device(PCIDevice > >>*pci_dev, PCIBus *bus, > >> PCI_SLOT(devfn), PCI_FUNC(devfn), name, > >> bus->devices[devfn]->name); > >> return NULL; > >>+ } else if (dev->hotplugged && > >>+ pci_get_function_0(pci_dev)) { > >>+ error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s," > >>+ " new func %s cannot be exposed to guest.", > >>+ PCI_SLOT(devfn), > >>+ bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name, > >>+ name); > >>+ > >>+ return NULL; > >> } > >> > >>- pci_dev->bus = bus; > >> pci_dev->devfn = devfn; > >> dma_as = pci_device_iommu_address_space(pci_dev); > >> > >>@@ -2454,6 +2465,24 @@ void pci_bus_get_w64_range(PCIBus *bus, Range *range) > >> pci_for_each_device_under_bus(bus, pci_dev_get_w64, range); > >> } > >> > >>+/* ARI device function number range is 0-255, means has only 1 function0; > >>+ * while non-ARI device has 1 function0 in each slot. non-ARI device could > >>+ * be PCI or PCIe, and there is up to 32 slots for PCI */ > >>+PCIDevice *pci_get_function_0(PCIDevice *pci_dev) > >>+{ > >>+ PCIDevice *parent_dev; > >>+ > >>+ parent_dev = pci_bridge_get_device(pci_dev->bus); > >>+ if (pcie_cap_is_arifwd_enabled(parent_dev) && > >>+ pci_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI)) { > >>+ /* ARI enabled */ > >>+ return pci_dev->bus->devices[0]; > > > >That's wrong I think since software might enable ARI after hotplug. > > According to the spec, ARI feature is enabled only based on the following 2 > conditions: > 1. For an ARI Downstream Port, the capability is communicated through the > Device Capabilities 2 register. > 2. For an ARI Device, the capability is communicated through the ARI > Capability structure. > > also according to the driver code pci_configure_ari(), I think my > implementation does follows spec? > > And as you know, only ARI feature is enabled, we return > pci_dev->bus->devices[0] Yes but that's not the point. The point is whether a device can occupy slot != 0. > >And I'm not sure all functions must have the ARI capability. > > > > do you means there maybe the following condition: ARI forwarding bit is > enabled in downstream port, but the functions below, some have ARI > Capability structure while the others don`t. Shouldn`t we forbid the > condition? Because If this condition happens, I am not sure whether the > device could work normally. > > In the IMPLEMENTATION NOTE: ARI Forwarding Enable Being Set Inappropriately, > It seems the spec don`t want that complicated condition? While actually, > qemu calls pcie_cap_arifwd_init() in root port/downstream port > initialization first. > > >I don't see why don't you just go by spec: > > > > I do read the spec, and also referred the pcie driver code(see > pci_configure_ari()). I think it is my inaccurate understanding about > "upstream port" results in my implementation(I also consult PCISIG support > team for this question, see attachment). The concept of "upstream port" in > ARI device definition confuse me a lot. > > Talking about the definition of ARI device, I always thought the "upstream > port" should on the ARI device itself(like the figure I drew before), or > else why the definition add the words "with an upstream port"? It seems to > me that the emphasis is on "with an upstream port", and implies to me that > the non-ARI device doesn`t have an upstream port. Seeing their replies in > attachment, says that I am wrong at this point, both of them have an > upstream port.(their saying actually make me more confused at that time, but > now I think I am clear about this concept after reading your implementation > below) > > After reading your implementation, and the PCISIG support team replies > again, finally I figure out that, "upstream port" in ARI device definition > is just a port whose position is closer to root complex, and the point is, > the "upstream port" doesn`t need to exist on the ARI device itself, which > also means, take Figure 6-13 in PCIe spec 3.1, the Root Port A is the > "upstream port" for ARI Device X, and the Downstream Port D in Switch is the > "upstream port" for ARI Device Y. Now, do I understand it right? > > Hope I can get you understood from the long description above. If I still > don`t understand it right, please point it out, after all, it confused me > for a long time. > > >static > >bool pcie_has_upstream_port(PCIDevice *dev) > >{ > > PCIDevice *parent_dev = pci_bridge_get_device(pci_dev->bus); > > > > /* > > * Device associated with an upstream port. > > * As there are several types of these, it's easier to check the > > * parent device: upstream ports are always connected to > > * root or downstream ports. > > */ > > return parent_dev && > > pci_is_express(parent_dev) && > > parent_dev->exp.exp_cap && > > (pcie_cap_get_type(parent_dev) == PCI_EXP_TYPE_ROOT_PORT || > > pcie_cap_get_type(parent_dev) == PCI_EXP_TYPE_DOWNSTREAM); > >} > > > > Assume my understanding is right, which means both ARI and non-ARI device > have the upstream port(root port or downstream port), could the existence of > upstream port be the judgment condition? This tells us whether we are behind a port that can address devices in slot != 0. > > > >PCIDevice *pci_get_function_0(PCIDevice *pci_dev) > >{ > > if (pcie_has_upstream_port(dev)) { > > /* With an upstream PCIE port, we only support 1 device at slot 0 */ > > return pci_dev->bus->devices[0]; > > } else { > > /* Other bus types might support multiple devices at slots 0-31 */ > > return > > pci_dev->bus->devices[PCI_DEVFN(PCI_SLOT(pci_dev->devfn), 0)]; > > } > >} > > > > >>+ } else { > >>+ /* no ARI */ > >>+ return pci_dev->bus->devices[PCI_DEVFN(PCI_SLOT(pci_dev->devfn), > >>0)]; > >>+ } > >>+} > >>+ > >> static const TypeInfo pci_device_type_info = { > >> .name = TYPE_PCI_DEVICE, > >> .parent = TYPE_DEVICE, > >>diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c > >>index 3e26f92..63d7d2f 100644 > >>--- a/hw/pci/pci_host.c > >>+++ b/hw/pci/pci_host.c > >>@@ -20,6 +20,7 @@ > >> > >> #include "hw/pci/pci.h" > >> #include "hw/pci/pci_host.h" > >>+#include "hw/pci/pci_bus.h" > >> #include "trace.h" > >> > >> /* debug PCI */ > >>@@ -75,7 +76,11 @@ void pci_data_write(PCIBus *s, uint32_t addr, uint32_t > >>val, int len) > >> PCIDevice *pci_dev = pci_dev_find_by_addr(s, addr); > >> uint32_t config_addr = addr & (PCI_CONFIG_SPACE_SIZE - 1); > >> > >>- if (!pci_dev) { > >>+ /* non-zero functions are only exposed when function 0 is present, > >>+ * allowing direct removal of unexposed functions. > >>+ */ > >>+ if (!pci_dev || > >>+ (pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev))) { > >> return; > >> } > >> > >>@@ -91,7 +96,11 @@ uint32_t pci_data_read(PCIBus *s, uint32_t addr, int len) > >> uint32_t config_addr = addr & (PCI_CONFIG_SPACE_SIZE - 1); > >> uint32_t val; > >> > >>- if (!pci_dev) { > >>+ /* non-zero functions are only exposed when function 0 is present, > >>+ * allowing direct removal of unexposed functions. > >>+ */ > >>+ if (!pci_dev || > >>+ (pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev))) { > >> return ~0x0; > >> } > >> > >>diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c > >>index b1adeaf..4ba9501 100644 > >>--- a/hw/pci/pcie.c > >>+++ b/hw/pci/pcie.c > >>@@ -249,16 +249,16 @@ void pcie_cap_slot_hotplug_cb(HotplugHandler > >>*hotplug_dev, DeviceState *dev, > >> return; > >> } > >> > >>- /* TODO: multifunction hot-plug. > >>- * Right now, only a device of function = 0 is allowed to be > >>- * hot plugged/unplugged. > >>+ /* To enable multifunction hot-plug, we just ensure the function > >>+ * 0 added last. When function 0 is added, we set the sltsta and > >>+ * inform OS via event notification. > >> */ > >>- assert(PCI_FUNC(pci_dev->devfn) == 0); > >>- > >>- pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA, > >>- PCI_EXP_SLTSTA_PDS); > >>- pcie_cap_slot_event(PCI_DEVICE(hotplug_dev), > >>- PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP); > >>+ if (pci_get_function_0(pci_dev)) { > >>+ pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA, > >>+ PCI_EXP_SLTSTA_PDS); > >>+ pcie_cap_slot_event(PCI_DEVICE(hotplug_dev), > >>+ PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP); > >>+ } > >> } > >> > >> static void pcie_unplug_device(PCIBus *bus, PCIDevice *dev, void *opaque) > >>diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h > >>index f5e7fd8..379b6e1 100644 > >>--- a/include/hw/pci/pci.h > >>+++ b/include/hw/pci/pci.h > >>@@ -397,6 +397,7 @@ void pci_for_each_bus_depth_first(PCIBus *bus, > >> void *(*begin)(PCIBus *bus, void > >> *parent_state), > >> void (*end)(PCIBus *bus, void *state), > >> void *parent_state); > >>+PCIDevice *pci_get_function_0(PCIDevice *pci_dev); > >> > >> /* Use this wrapper when specific scan order is not required. */ > >> static inline > >>-- > >>2.1.0 > >. > > > > -- > Yours Sincerely, > > Cao Jin