> -----Original Message----- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: Wednesday, December 06, 2017 3:00 PM > To: Shameerali Kolothum Thodi <shameerali.kolothum.th...@huawei.com>; > Alex Williamson <alex.william...@redhat.com> > Cc: peter.mayd...@linaro.org; qemu-devel@nongnu.org; Linuxarm > <linux...@huawei.com>; qemu-...@nongnu.org; Zhaoshenglong > <zhaoshengl...@huawei.com>; Zhuyijun <zhuyi...@huawei.com> > Subject: Re: [Qemu-devel] [RFC 1/5] hw/vfio: Add function for getting > reserved_region of device iommu group > > Hi Shameer, > > On 06/12/17 15:38, Shameerali Kolothum Thodi wrote: > > Hi Eric, > > > >> -----Original Message----- > >> From: Auger Eric [mailto:eric.au...@redhat.com] > >> Sent: Wednesday, December 06, 2017 2:01 PM > >> To: Shameerali Kolothum Thodi <shameerali.kolothum.th...@huawei.com>; > >> Alex Williamson <alex.william...@redhat.com> > >> Cc: peter.mayd...@linaro.org; qemu-devel@nongnu.org; Linuxarm > >> <linux...@huawei.com>; qemu-...@nongnu.org; Zhaoshenglong > >> <zhaoshengl...@huawei.com>; Zhuyijun <zhuyi...@huawei.com> > >> Subject: Re: [Qemu-devel] [RFC 1/5] hw/vfio: Add function for getting > >> reserved_region of device iommu group > >> > >> Hi Shameer, > >> > >> On 06/12/17 11:30, Shameerali Kolothum Thodi wrote: > >>> Hi Alex, > >>> > >>>> -----Original Message----- > >>>> From: Shameerali Kolothum Thodi > >>>> Sent: Monday, November 20, 2017 4:31 PM > >>>> To: 'Alex Williamson' <alex.william...@redhat.com> > >>>> Cc: eric.au...@redhat.com; Zhuyijun <zhuyi...@huawei.com>; qemu- > >>>> a...@nongnu.org; qemu-devel@nongnu.org; peter.mayd...@linaro.org; > >>>> Zhaoshenglong <zhaoshengl...@huawei.com>; Linuxarm > >>>> <linux...@huawei.com> > >>>> Subject: RE: [Qemu-devel] [RFC 1/5] hw/vfio: Add function for getting > >>>> reserved_region of device iommu group > >>> [...] > >>>>>>> And sysfs is a good interface if the user wants to use it to > >>>>>>> configure the VM in a way that's compatible with a device. For > >>>>>>> instance, in your case, a user could evaluate these reserved > >>>>>>> regions across all devices in a system, or even across an entire > >>>>>>> cluster, and instantiate the VM with a memory map compatible with > >>>>>>> hotplugging any of those evaluated devices (QEMU implementation of > >>>> allowing the user to do this TBD). > >>>>>>> Having the vfio device evaluate these reserved regions only helps > >>>>>>> in the cold-plug case. So the proposed solution is limited in > >>>>>>> scope and doesn't address similar needs on other platforms. There > >>>>>>> is value to verifying that a device's IOVA space is compatible > >>>>>>> with a VM memory map and modifying the memory map on cold-plug > or > >>>>>>> rejecting the device on hot-plug, but isn't that why we have an > >>>>>>> ioctl within vfio to expose information about the IOMMU? Why take > >>>>>>> the path of allowing QEMU to rummage through sysfs files outside > >>>>>>> of vfio, implying additional security and access concerns, rather > >>>>>>> than filling the gap within the vifo API? > >>>>>> > >>>>>> Thanks Alex for the explanation. > >>>>>> > >>>>>> I came across this patch[1] from Eric where he introduced the IOCTL > >>>>> interface to > >>>>>> retrieve the reserved regions. It looks like this can be reworked to > >>>>> accommodate > >>>>>> the above requirement. > >>>>> > >>>>> I don't think we need a new ioctl for this, nor do I think that > >>>>> describing the holes is the correct approach. The existing > >>>>> VFIO_IOMMU_GET_INFO ioctl can be extended to support capability > >>>>> chains, as we've done for VFIO_DEVICE_GET_REGION_INFO. > >>>> > >>>> Right, as far as I can see the above mentioned patch is doing exactly the > >> same, > >>>> extending the VFIO_IOMMU_GET_INFO ioctl with capability chain. > >>>> > >>>>> IMO, we should try to > >>>>> describe the available fixed IOVA regions which are available for > >>>>> mapping rather than describing various holes within the address space > >>>>> which are unavailable. The latter method always fails to describe the > >>>>> end of the mappable IOVA space and gets bogged down in trying to > >>>>> classify the types of holes that might exist. Thanks, > >>>> > >>> > >>> I was going through this and noticed that it is possible to have multiple > >>> iommu domains associated with a container. If that's true, is it safe to > >>> make the assumption that all the domains will have the same iova > >>> geometry or not? > >> To me the answer is no. > >> > >> There are several iommu domains "underneath" a single container. You > >> attach vfio groups to a container. Each of them is associated to an > >> iommu group and an iommu domain. See > vfio_iommu_type1_attach_group(). > >> > >> Besides, the reserved regions are per iommu group. > >> > > > > Thanks for your reply. Yes, container can have multiple groups(hence > multiple > > iommu domains) and reserved regions are per group. Hence while deciding > > the default supported iova geometry we have to go through all the domains > > in the container and select smallest aperture as the supported default iova > > range. > > > > Please find below snippet from a patch I am working on. Please > > let me know your thoughts on this. > > > > Thanks, > > Shameer > > > > -- >8 -- > > +static int vfio_build_iommu_iova_caps(struct vfio_iommu *iommu, > > + struct vfio_info_cap *caps) { > > + struct iommu_resv_region *resv, *resv_next; > > + struct vfio_iommu_iova *iova, *iova_next; > > + struct list_head group_resv_regions, vfio_iova_regions; > > + struct vfio_domain *domain; > > + struct vfio_group *g; > > + phys_addr_t start, end; > > + int ret = 0; > > + > > + domain = list_first_entry(&iommu->domain_list, > > + struct vfio_domain, next); > > + /* Get the default iova range supported */ > > + start = domain->domain->geometry.aperture_start; > > + end = domain->domain->geometry.aperture_end; > > > > This is where I am confused. I think instead I should go over > > the domain_list and select the smallest aperture as default > > iova range. > yes that's correct. I Just want to warn you Pierre is working on the > same topic. May be worth to sync together. > > [PATCH] vfio/iommu_type1: report the IOMMU aperture info > (https://patchwork.kernel.org/patch/10084655/) > > I think he plans to rework his series with capability chain too.
Thanks Eric for the pointer. I will sent out my patch as an RFC then and take it from there (without the default iova aperture changes, as I can see there are some discussions in Pierre's patch to solve that) Thanks, Shameer