Hi Donald, > -----Original Message----- > From: Donald Dutile <ddut...@redhat.com> > Sent: Friday, April 18, 2025 9:34 PM > To: Shameerali Kolothum Thodi > <shameerali.kolothum.th...@huawei.com>; Nicolin Chen > <nicol...@nvidia.com> > Cc: qemu-...@nongnu.org; qemu-devel@nongnu.org; > eric.au...@redhat.com; peter.mayd...@linaro.org; j...@nvidia.com; > berra...@redhat.com; nath...@nvidia.com; mo...@nvidia.com; > smost...@google.com; Linuxarm <linux...@huawei.com>; Wangzhou (B) > <wangzh...@hisilicon.com>; jiangkunkun <jiangkun...@huawei.com>; > Jonathan Cameron <jonathan.came...@huawei.com>; > zhangfei....@linaro.org > Subject: Re: [PATCH 2/5] hw/arm/virt-acpi-build: Update IORT for multiple > smmuv3 devices > > > > On 4/16/25 1:38 AM, Shameerali Kolothum Thodi wrote: > > > > > >> -----Original Message----- > >> From: Nicolin Chen <nicol...@nvidia.com> > >> Sent: Wednesday, April 16, 2025 5:19 AM > >> To: Shameerali Kolothum Thodi > <shameerali.kolothum.th...@huawei.com> > >> Cc: qemu-...@nongnu.org; qemu-devel@nongnu.org; > >> eric.au...@redhat.com; peter.mayd...@linaro.org; j...@nvidia.com; > >> ddut...@redhat.com; berra...@redhat.com; nath...@nvidia.com; > >> mo...@nvidia.com; smost...@google.com; Linuxarm > >> <linux...@huawei.com>; Wangzhou (B) <wangzh...@hisilicon.com>; > >> jiangkunkun <jiangkun...@huawei.com>; Jonathan Cameron > >> <jonathan.came...@huawei.com>; zhangfei....@linaro.org > >> Subject: Re: [PATCH 2/5] hw/arm/virt-acpi-build: Update IORT for > multiple > >> smmuv3 devices > >> > >> On Tue, Apr 15, 2025 at 09:11:01AM +0100, Shameer Kolothum wrote: > >>> +static int get_smmuv3_device(Object *obj, void *opaque) > >>> +{ > >>> + GArray *sdev_blob = opaque; > >>> + int min_bus, max_bus; > >>> + VirtMachineState *vms; > >>> + PlatformBusDevice *pbus; > >>> + SysBusDevice *sbdev; > >>> + SMMUv3Device sdev; > >>> + hwaddr base; > >>> + int irq; > >>> + PCIBus *bus; > >> > >> In a reverse christmas tree order? Or we could at least group > >> those similar types together. > > > > Yeah. Reverse will look better I guess. > >> > >>> + vms = VIRT_MACHINE(qdev_get_machine()); > >>> + pbus = PLATFORM_BUS_DEVICE(vms->platform_bus_dev); > >>> + sbdev = SYS_BUS_DEVICE(obj); > >>> + base = platform_bus_get_mmio_addr(pbus, sbdev, 0); > >>> + irq = platform_bus_get_irqn(pbus, sbdev, 0); > >>> + > >>> + base += vms->memmap[VIRT_PLATFORM_BUS].base; > >>> + irq += vms->irqmap[VIRT_PLATFORM_BUS]; > >>> + > >>> + pci_bus_range(bus, &min_bus, &max_bus); > >>> + sdev.smmu_idmap.input_base = min_bus << 8; > >>> + sdev.smmu_idmap.id_count = (max_bus - min_bus + 1) << 8; > >>> + sdev.base = base; > >>> + sdev.irq = irq + ARM_SPI_BASE; > >> > >> Hmm, these base/irq local variables don't look necessary. > >> > >>> + g_array_append_val(sdev_blob, sdev); > >>> + return 0; > >>> +} > >>> + > >>> /* > >>> * Input Output Remapping Table (IORT) > >>> * Conforms to "IO Remapping Table System Software on ARM > >> Platforms", > >>> @@ -275,25 +330,42 @@ static void > >>> build_iort(GArray *table_data, BIOSLinker *linker, VirtMachineState > >> *vms) > >>> { > >>> int i, nb_nodes, rc_mapping_count; > >>> - size_t node_size, smmu_offset = 0; > >>> + size_t node_size, *smmu_offset = NULL; > >>> AcpiIortIdMapping *idmap; > >>> + int num_smmus = 0; > >>> uint32_t id = 0; > >>> GArray *smmu_idmaps = g_array_new(false, true, > >> sizeof(AcpiIortIdMapping)); > >>> GArray *its_idmaps = g_array_new(false, true, > >> sizeof(AcpiIortIdMapping)); > >>> + GArray *smmuv3_devices = g_array_new(false, true, > >> sizeof(SMMUv3Device)); > >>> > >>> AcpiTable table = { .sig = "IORT", .rev = 3, .oem_id = vms->oem_id, > >>> .oem_table_id = vms->oem_table_id }; > >>> /* Table 2 The IORT */ > >>> acpi_table_begin(&table, table_data); > >>> > >>> - if (vms->iommu == VIRT_IOMMU_SMMUV3) { > >>> - AcpiIortIdMapping next_range = {0}; > >>> - > >>> + nb_nodes = 2; /* RC, ITS */ > >>> + if (vms->iommu == VIRT_IOMMU_SMMUV3_DEV) { > >>> + object_child_foreach_recursive(object_get_root(), > >>> + get_smmuv3_device, > >>> smmuv3_devices); > >>> + /* Sort the smmuv3-devices by smmu idmap input_base */ > >>> + g_array_sort(smmuv3_devices, smmuv3_dev_idmap_compare); > >>> + /* Fill smmu idmap from sorted smmuv3 devices array */ > >>> + for (i = 0; i < smmuv3_devices->len; i++) { > >>> + SMMUv3Device *s = &g_array_index(smmuv3_devices, > >> SMMUv3Device, i); > >>> + g_array_append_val(smmu_idmaps, s->smmu_idmap); > >>> + } > >>> + num_smmus = smmuv3_devices->len; > >>> + } else if (vms->iommu == VIRT_IOMMU_SMMUV3) { > >>> object_child_foreach_recursive(object_get_root(), > >>> iort_host_bridges, smmu_idmaps); > >>> > >>> /* Sort the smmu idmap by input_base */ > >>> g_array_sort(smmu_idmaps, iort_idmap_compare); > >> > >> VIRT_IOMMU_SMMUV3 seems to fit the struct SMMUv3Device very well, > >> given that it has base, irq, and smmu_idmaps too? > > > > One difference though is the legacy VIRT_IOMMU_SMMUV3 one is a > global > > Machine wide one nad can be associated with multiple PCIe root > complexes > > which will result in smmu_idmaps array. See the iort_host_bridges() fn. > > > Didn't quite follow this logic; shouldn't the iort be built for devices > tagged for a virt-iommu or dependent on a pcie rp (bus num/bus-num > range of an RP)? > Thus, it's just an IORT with one or more IORT nodes? > I could see how the current iort_host_bridge() may need a revision to work > with -device arm-smmu configs. > > Which would bring me back to my earlier question, and it's likely tied to > this IORT construction: > -- can you have both types defined in a machine model?
Let me try to explain. Suppose we have a virt machine with two PCIe RCs, the default pcie.0 and a pxb-pcie RC pcie.1. For legacy "iommu=smmuv3" device case, we can only have one SMMUv3 device per machine and all the PCIe RCs in the machine are by default gets attached to that SMMUv3. There is no provision to associate a specific RC to the legacy SMMUv3 (of course you can use bypass_iommu option to bypass the SMMU if you want). As per IORT specification each SMMUv3 device must have a corresponding SMMUv3 node, which contains an array of ID mappings. From the IO remapping table spec, https://developer.arm.com/documentation/den0049/latest/ "ID mappings represent the formula by which an ID from a source is converted to an ID in a destination. For example, for a root complex behind an SMMU, the RID originating from that root complex must be converted to a StreamID in the destination SMMU." So the IORT representation for above case is something like below, [SMMUv3 Node] -->id_maps for pcie.0 -->id_maps for pcie.1 But for the new arm-smmuv3-dev, we have to associate a specific RC with each SMMUV3 device, -device arm-smmuv3-dev,bus=pcie.0,id=smmuv3.1 \ .. -device arm-smmuv3-dev,bus=pcie.1,id=smmuv3.2 \ In this case, the IORT would look like: [smmuv3.1 Node)] -->id_maps for pcie.0 [smmuv3.2 Node)] -->id_maps for pcie.1 So basically it is not possible/is difficult(and also doesn't make much sense) to have both legacy SMMUv3 and new SMMUv3 device co-exist in a system as the legacy one covers all the RCs in the machine. > > >> > >> Maybe we could generalize the struct SMMUv3Device to store both > >> cases. Perhaps just SMMUv3AcpiInfo? And then .. > > > I didn't follow 'SMMUv3AcpiInfo' since I don't see that in current qemu tree; > or is the statement trying to say its a 'mere matter of the proper acpi info > generation'? I think Nicolin is suggesting to use a common struct 'SMMUv3AcpiInfo' for both legacy and new cases. > > > Could do. But then you have to have a smmu_idmaps array and then > check > > the length of it to cover legacy SMMUv3 case I guess. > > > Aren't they going to be different idmaps for different IORT nodes? Please see above explanation. Thanks, Shameer