Hi Alex, On 10/18/23 23:42, Alex Williamson wrote: > On Wed, 11 Oct 2023 19:52:29 +0200 > Eric Auger <eric.au...@redhat.com> wrote: > >> Now we retrieve the usable IOVA ranges from the host, >> we now the physical IOMMU aperture and we can remove > s/now/use/? > >> the assumption of 64b IOVA space when calling >> vfio_host_win_add(). >> >> This works fine in general but in case of an IOMMU memory >> region this becomes more tricky. For instance the virtio-iommu >> MR has a 64b aperture by default. If the physical IOMMU has a >> smaller aperture (typically the case for VTD), this means we >> would need to resize the IOMMU MR when this latter is linked >> to a container. However this happens on vfio_listener_region_add() >> when calling the IOMMU MR set_iova_ranges() callback and this >> would mean we would have a recursive call the >> vfio_listener_region_add(). This looks like a wrong usage of >> he memory API causing duplicate IOMMU MR notifier registration > s/he/the/ > >> for instance. >> >> Until we find a better solution, make sure the vfio_find_hostwin() >> is not called anymore for IOMMU region. >> >> Signed-off-by: Eric Auger <eric.au...@redhat.com> >> >> --- >> >> v2 -> v3: >> - take into account the avail IOVA range may not be existing >> - Create as many VFIOHostDMAWindow as valid IOVA ranges >> - rebase on top of vfio-next >> >> I have not found any working solution to the IOMMU MR resizing. >> So I can remove this patch or remove the check for IOMMU MR. Maybe >> this is an issue which can be handled separately? > Am I correct that the only thing we're solving here is the FIXME? > > Without this change we assume a 64-bit IOVA address space for the > "hostwin" and the previous patch 03/ already sets the usable IOVA range > for the IOMMU aperture. Kernel code will error on mappings outside of > the usable IOVA ranges regardless, so at best we're making the failure > occur earlier in QEMU rather than at the time of the DMA mapping. yes that's what the patch aims at > > I think the FIXME comment had assumed the hostwin support would be more > universal, but perhaps we're just doubling down on a relic of SPAPR > support that we can safely ignore, and at some point remove. It really > seems to serve the same purpose as the new iova_ranges and if it were > worthwhile to fail the range in QEMU before trying to map it in the > kernel, we could test it against iova_ranges directly. > > Unless this serves some purpose that I'm not spotting, maybe we should > drop this patch, consider hostwin to be SPAPR specific, and avoid > further entanglements with it here so that it can be more easily > removed. Thanks, I am fine dropping the patch esp because I failed at resizing the IOMMU MR on-the-fly and was forced to move vfio_find_hostwin().
Thank you for the review. Eric > > Alex > >> --- >> hw/vfio/common.c | 14 +++++++------- >> hw/vfio/container.c | 23 +++++++++++++++++------ >> 2 files changed, 24 insertions(+), 13 deletions(-) >> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c >> index 9d804152ba..1447b6fdc4 100644 >> --- a/hw/vfio/common.c >> +++ b/hw/vfio/common.c >> @@ -654,13 +654,6 @@ static void vfio_listener_region_add(MemoryListener >> *listener, >> goto fail; >> } >> >> - hostwin = vfio_find_hostwin(container, iova, end); >> - if (!hostwin) { >> - error_setg(&err, "Container %p can't map guest IOVA region" >> - " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, container, iova, >> end); >> - goto fail; >> - } >> - >> memory_region_ref(section->mr); >> >> if (memory_region_is_iommu(section->mr)) { >> @@ -720,6 +713,13 @@ static void vfio_listener_region_add(MemoryListener >> *listener, >> return; >> } >> >> + hostwin = vfio_find_hostwin(container, iova, end); >> + if (!hostwin) { >> + error_setg(&err, "Container %p can't map guest IOVA region" >> + " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, container, iova, >> end); >> + goto fail; >> + } >> + >> /* Here we assume that memory_region_is_ram(section->mr)==true */ >> >> /* >> diff --git a/hw/vfio/container.c b/hw/vfio/container.c >> index 5122ff6d92..eb9d962881 100644 >> --- a/hw/vfio/container.c >> +++ b/hw/vfio/container.c >> @@ -693,12 +693,23 @@ static int vfio_connect_container(VFIOGroup *group, >> AddressSpace *as, >> vfio_get_iommu_info_migration(container, info); >> g_free(info); >> >> - /* >> - * FIXME: We should parse VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE >> - * information to get the actual window extent rather than assume >> - * a 64-bit IOVA address space. >> - */ >> - vfio_host_win_add(container, 0, (hwaddr)-1, container->pgsizes); >> + if (container->nr_iovas == -1) { >> + /* >> + * no available info on usable IOVA ranges, >> + * assume 64b IOVA space >> + */ >> + vfio_host_win_add(container, 0, (hwaddr)-1, container->pgsizes); >> + } else { >> + GList *l; >> + >> + g_assert(container->nr_iovas); >> + for (l = container->iova_ranges; l; l = l->next) { >> + Range *r = l->data; >> + >> + vfio_host_win_add(container, range_lob(r), range_upb(r), >> + container->pgsizes); >> + } >> + } >> >> break; >> }