On Fri, Jul 26, 2019 at 06:55:53PM -0600, Alex Williamson wrote:
> When we account for DMA aliases in the PCI address space, we can no
> longer use a single IVHD entry in the IVRS covering all devices. We
> instead need to walk the PCI bus and create alias ranges when we find
> a conventional bus. These alias ranges cannot overlap with a "Select
> All" range (as currently implemented), so we also need to enumerate
> each device with IVHD entries.
>
> Importantly, the IVHD entries used here include a Device ID, which is
> simply the PCI BDF (Bus/Device/Function). The guest firmware is
> responsible for programming bus numbers, so the final revision of this
> table depends on the update mechanism (acpi_build_update) to be called
> after guest PCI enumeration.

Ouch... so the ACPI build procedure runs after the guest PCI enumeration
code! Could I ask how you found this? :) It's certainly much easier this
way...

This already looks very nice to me, though I still have a few questions;
please see below.

[...]

> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
> +        PCIBus *sec_bus = pci_bridge_get_sec_bus(PCI_BRIDGE(dev));
> +        uint8_t sec = pci_bus_num(sec_bus);
> +        uint8_t sub = dev->config[PCI_SUBORDINATE_BUS];
> +
> +        if (pci_bus_is_express(sec_bus)) {
> +            /*
> +             * Walk the bus if there are subordinates, otherwise use a range
> +             * to cover an entire leaf bus. We could potentially also use a
> +             * range for traversed buses, but we'd need to take care not to
> +             * create both Select and Range entries covering the same device.
> +             * This is easier and potentially more compact.
> +             *
> +             * An example bare metal system seems to use Select entries for
> +             * root ports without a slot (ie. built-ins) and Range entries
> +             * when there is a slot. The same system also only hard-codes
> +             * the alias range for an onboard PCIe-to-PCI bridge, apparently
> +             * making no effort to support nested bridges. We attempt to
> +             * be more thorough here.
> +             */
> +            if (sec == sub) { /* leaf bus */
> +                /* "Start of Range" IVHD entry, type 0x3 */
> +                entry = PCI_BUILD_BDF(sec, PCI_DEVFN(0, 0)) << 8 | 0x3;
> +                build_append_int_noprefix(table_data, entry, 4);
> +                /* "End of Range" IVHD entry, type 0x4 */
> +                entry = PCI_BUILD_BDF(sub, PCI_DEVFN(31, 7)) << 8 | 0x4;
> +                build_append_int_noprefix(table_data, entry, 4);
> +            } else {
> +                pci_for_each_device(sec_bus, sec, insert_ivhd, table_data);
> +            }
> +        } else {
> +            /*
> +             * If the secondary bus is conventional, then we need to create an
> +             * Alias range for everything downstream. The range covers the
> +             * first devfn on the secondary bus to the last devfn on the
> +             * subordinate bus.
> +             * The alias target depends on legacy versus
> +             * express bridges, just as in pci_device_iommu_address_space().
> +             * DeviceIDa vs DeviceIDb as per the AMD IOMMU spec.
> +             */
> +            uint16_t dev_id_a, dev_id_b;
> +
> +            dev_id_a = PCI_BUILD_BDF(sec, PCI_DEVFN(0, 0));
> +
> +            if (pci_is_express(dev) &&
> +                pcie_cap_get_type(dev) == PCI_EXP_TYPE_PCI_BRIDGE) {
> +                dev_id_b = dev_id_a;
> +            } else {
> +                dev_id_b = PCI_BUILD_BDF(pci_bus_num(bus), dev->devfn);
> +            }
> +
> +            /* "Alias Start of Range" IVHD entry, type 0x43, 8 bytes */
> +            build_append_int_noprefix(table_data, dev_id_a << 8 | 0x43, 4);
> +            build_append_int_noprefix(table_data, dev_id_b << 8 | 0x0, 4);
> +
> +            /* "End of Range" IVHD entry, type 0x4 */
> +            entry = PCI_BUILD_BDF(sub, PCI_DEVFN(31, 7)) << 8 | 0x4;
> +            build_append_int_noprefix(table_data, entry, 4);
> +        }

We've implemented similar logic multiple times:

  - When we want to do DMA (pci_requester_id)
  - When we want to fetch the DMA address space (the previous patch)
  - When we fill in the AMD ACPI table (this patch)

Do you think we can generalize them somehow? How about we directly fetch
the RID in the 2nd/3rd use cases using pci_requester_id() (which already
exists) and simply use it?

[...]

> +    /*
> +     * A PCI bus walk, for each PCI host bridge, is necessary to create a
> +     * complete set of IVHD entries. Do this into a separate blob so that we
> +     * can calculate the total IVRS table length here and then append the new
> +     * blob further below. Fall back to an entry covering all devices, which
> +     * is sufficient when no aliases are present.
> +     */
> +    object_child_foreach_recursive(object_get_root(),
> +                                   ivrs_host_bridges, ivhd_blob);
> +
> +    if (!ivhd_blob->len) {
> +        /*
> +         * Type 1 device entry reporting all devices
> +         * These are 4-byte device entries currently reporting the range of
> +         * Refer to Spec - Table 95:IVHD Device Entry Type Codes(4-byte)
> +         */
> +        build_append_int_noprefix(ivhd_blob, 0x0000001, 4);
> +    }

Is there a real use case for ivhd_blob->len==0?

Thanks,

-- 
Peter Xu