On 22/07/2024 18:15, Cédric Le Goater wrote: > On 7/22/24 19:04, Cédric Le Goater wrote: >> On 7/22/24 18:29, Joao Martins wrote: >>> On 22/07/2024 16:58, Cédric Le Goater wrote: >>>> On 7/22/24 17:42, Joao Martins wrote: >>>>> On 22/07/2024 16:13, Cédric Le Goater wrote: >>>>>> On 7/22/24 17:01, Joao Martins wrote: >>>>>>> On 22/07/2024 15:53, Cédric Le Goater wrote: >>>>>>>> On 7/19/24 19:26, Joao Martins wrote: >>>>>>>>> On 19/07/2024 15:24, Joao Martins wrote: >>>>>>>>>> On 19/07/2024 15:17, Cédric Le Goater wrote: >>>>>>>>>>> On 7/19/24 14:05, Joao Martins wrote: >>>>>>>>>>>> By default VFIO migration is set to auto, which will support live >>>>>>>>>>>> migration if the migration capability is set *and* also dirty page >>>>>>>>>>>> tracking is supported. >>>>>>>>>>>> >>>>>>>>>>>> For testing purposes one can force enable without dirty page >>>>>>>>>>>> tracking >>>>>>>>>>>> via enable-migration=on, but that option is generally left for >>>>>>>>>>>> testing >>>>>>>>>>>> purposes. >>>>>>>>>>>> >>>>>>>>>>>> So starting with IOMMU dirty tracking it can use to accomodate the >>>>>>>>>>>> lack of >>>>>>>>>>>> VF dirty page tracking allowing us to minimize the VF requirements >>>>>>>>>>>> for >>>>>>>>>>>> migration and thus enabling migration by default for those too. >>>>>>>>>>>> >>>>>>>>>>>> While at it change the error messages to mention IOMMU dirty >>>>>>>>>>>> tracking as >>>>>>>>>>>> well. >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com> >>>>>>>>>>>> --- >>>>>>>>>>>> include/hw/vfio/vfio-common.h | 1 + >>>>>>>>>>>> hw/vfio/iommufd.c | 2 +- >>>>>>>>>>>> hw/vfio/migration.c | 11 ++++++----- >>>>>>>>>>>> 3 files changed, 8 insertions(+), 6 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/include/hw/vfio/vfio-common.h >>>>>>>>>>>> b/include/hw/vfio/vfio-common.h >>>>>>>>>>>> index 7e530c7869dc..00b9e933449e 100644 >>>>>>>>>>>> --- a/include/hw/vfio/vfio-common.h >>>>>>>>>>>> +++ b/include/hw/vfio/vfio-common.h >>>>>>>>>>>> @@ -299,6 +299,7 @@ int vfio_devices_query_dirty_bitmap(const >>>>>>>>>>>> VFIOContainerBase *bcontainer, >>>>>>>>>>>> VFIOBitmap *vbmap, hwaddr iova, hwaddr size, >>>>>>>>>>>> Error >>>>>>>>>>>> **errp); >>>>>>>>>>>> int vfio_get_dirty_bitmap(const VFIOContainerBase >>>>>>>>>>>> *bcontainer, >>>>>>>>>>>> uint64_t >>>>>>>>>>>> iova, >>>>>>>>>>>> uint64_t size, ram_addr_t ram_addr, >>>>>>>>>>>> Error >>>>>>>>>>>> **errp); >>>>>>>>>>>> +bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt); >>>>>>>>>>>> /* Returns 0 on success, or a negative errno. */ >>>>>>>>>>>> bool vfio_device_get_name(VFIODevice *vbasedev, Error >>>>>>>>>>>> **errp); >>>>>>>>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c >>>>>>>>>>>> index 7dd5d43ce06a..a998e8578552 100644 >>>>>>>>>>>> --- a/hw/vfio/iommufd.c >>>>>>>>>>>> +++ b/hw/vfio/iommufd.c >>>>>>>>>>>> @@ -111,7 +111,7 @@ static void >>>>>>>>>>>> iommufd_cdev_unbind_and_disconnect(VFIODevice >>>>>>>>>>>> *vbasedev) >>>>>>>>>>>> iommufd_backend_disconnect(vbasedev->iommufd); >>>>>>>>>>>> } >>>>>>>>>>>> -static bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt) >>>>>>>>>>>> +bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt) >>>>>>>>>>>> { >>>>>>>>>>>> return hwpt && hwpt->hwpt_flags & >>>>>>>>>>>> IOMMU_HWPT_ALLOC_DIRTY_TRACKING; >>>>>>>>>>>> } >>>>>>>>>>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c >>>>>>>>>>>> index 34d4be2ce1b1..63ffa46c9652 100644 >>>>>>>>>>>> --- a/hw/vfio/migration.c >>>>>>>>>>>> +++ b/hw/vfio/migration.c >>>>>>>>>>>> @@ -1036,16 +1036,17 @@ bool vfio_migration_realize(VFIODevice >>>>>>>>>>>> *vbasedev, >>>>>>>>>>>> Error **errp) >>>>>>>>>>>> return !vfio_block_migration(vbasedev, err, errp); >>>>>>>>>>>> } >>>>>>>>>>>> - if (!vbasedev->dirty_pages_supported) { >>>>>>>>>>>> + if (!vbasedev->dirty_pages_supported && >>>>>>>>>>>> + !iommufd_hwpt_dirty_tracking(vbasedev->hwpt)) { >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Some platforms do not have IOMMUFD support and this call will need >>>>>>>>>>> some kind of abstract wrapper to reflect dirty tracking support in >>>>>>>>>>> the IOMMU backend. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This was actually on purpose because only IOMMUFD presents a view of >>>>>>>>>> hardware >>>>>>>>>> whereas type1 supporting dirty page tracking is not used as means to >>>>>>>>>> 'migration >>>>>>>>>> is supported'. >>>>>>>>>> >>>>>>>>>> The hwpt is nil in type1 and the helper checks that, so it should >>>>>>>>>> return >>>>>>>>>> false. >>>>>>>>>> >>>>>>>>> >>>>>>>>> Oh wait, maybe you're talking about CONFIG_IOMMUFD=n which I totally >>>>>>>>> didn't >>>>>>>>> consider. Maybe this would be a elegant way to address it? Looks to >>>>>>>>> pass my >>>>>>>>> build with CONFIG_IOMMUFD=n >>>>>>>>> >>>>>>>>> diff --git a/include/hw/vfio/vfio-common.h >>>>>>>>> b/include/hw/vfio/vfio-common.h >>>>>>>>> index 61dd48e79b71..422ad4a5bdd1 100644 >>>>>>>>> --- a/include/hw/vfio/vfio-common.h >>>>>>>>> +++ b/include/hw/vfio/vfio-common.h >>>>>>>>> @@ -300,7 +300,14 @@ int vfio_devices_query_dirty_bitmap(const >>>>>>>>> VFIOContainerBase >>>>>>>>> *bcontainer, >>>>>>>>> VFIOBitmap *vbmap, hwaddr iova, hwaddr size, >>>>>>>>> Error >>>>>>>>> **errp); >>>>>>>>> int vfio_get_dirty_bitmap(const VFIOContainerBase *bcontainer, >>>>>>>>> uint64_t >>>>>>>>> iova, >>>>>>>>> uint64_t size, ram_addr_t ram_addr, >>>>>>>>> Error >>>>>>>>> **errp); >>>>>>>>> +#ifdef CONFIG_IOMMUFD >>>>>>>>> bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt); >>>>>>>>> +#else >>>>>>>>> +static inline bool iommufd_hwpt_dirty_tracking(VFIOIOASHwpt *hwpt) >>>>>>>>> +{ >>>>>>>>> + return false; >>>>>>>>> +} >>>>>>>>> +#endif >>>>>>>>> >>>>>>>>> /* Returns 0 on success, or a negative errno. */ >>>>>>>>> bool vfio_device_get_name(VFIODevice *vbasedev, Error **errp); >>>>>>>>> >>>>>>>> >>>>>>>> hmm, no. You will need to introduce a new Host IOMMU device capability, >>>>>>>> something like : >>>>>>>> >>>>>>>> HOST_IOMMU_DEVICE_CAP_DIRTY_TRACKING, >>>>>>>> >>>>>>>> Then, introduce an helper routine to check the capability : >>>>>>>> >>>>>>>> return hiodc->get_cap( ... >>>>>>>> HOST_IOMMU_DEVICE_CAP_DIRTY_TRACKING...) >>>>>>>> and replace the iommufd_hwpt_dirty_tracking call with it. >>>>>>>> >>>>>>>> Yeah I know, it's cumbersome but it's cleaner ! >>>>>>>> >>>>>>> >>>>>>> Funny you mention it, because that's what I did in v3: >>>>>>> >>>>>>> https://lore.kernel.org/qemu-devel/20240708143420.16953-9-joao.m.mart...@oracle.com/ >>>>>>> >>>>>>> But it was suggested to drop (I am assuming to avoid complexity) >>>>>> >>>>>> my bad if I did :/ >>>>>> >>>>> >>>>> No worries it is all part of review -- I think Zhenzhong proposed with >>>>> good >>>>> intentions, and I probably didn't think too hard about the consequences on >>>>> layering with the HIOD. >>>>> >>>>>> we will need an helper such as : >>>>>> >>>>>> bool vfio_device_dirty_tracking(VFIODevice *vbasedev) >>>>>> { >>>>>> HostIOMMUDevice *hiod = vbasedev->hiod ; >>>>>> HostIOMMUDeviceClass *hiodc = HOST_IOMMU_DEVICE_GET_CLASS(hiod); >>>>>> >>>>>> return hiodc->get_cap && >>>>>> hiodc->get_cap(hiod, HOST_IOMMU_DEVICE_CAP_DIRTY_TRACKING, >>>>>> NULL) >>>>>> == 1; >>>>>> } >>>>>> >>>>>> and something like, >>>>>> >>>>>> static int hiod_iommufd_vfio_get_cap(HostIOMMUDevice *hiod, int cap, >>>>>> Error **errp) >>>>>> { >>>>>> switch (cap) { >>>>>> case HOST_IOMMU_DEVICE_CAP_DIRTY_TRACKING: >>>>>> return !!(hiod->caps.hw_caps & IOMMU_HW_CAP_DIRTY_TRACKING); >>>>>> default: >>>>>> error_setg(errp, "%s: unsupported capability %x", hiod->name, >>>>>> cap); >>>>>> return -EINVAL; >>>>>> } >>>>>> } >>>>>> >>>>>> Feel free to propose your own implementation, >>>>>> >>>>> >>>>> Actually it's close to what I had in v3 link, except the new helper (the >>>>> name >>>>> vfio_device_dirty_tracking is a bit misleading I would call it >>>>> vfio_device_iommu_dirty_tracking) >>>> >>>> Let's call it vfio_device_iommu_dirty_tracking. >>>> >>> >>> I thinking about this and I am not that sure it makes sense. That is the >>> .get_cap() stuff. >>> >>> Using the hw_caps is only useful when choosing hwpt_flags, then the only >>> thing >>> that matters for patch 12 is after the device is attached ... hence we gotta >>> look at hwpt_flags. That ultimately is what tells if dirty tracking can be >>> done >>> in the device pagetable. >>> >>> I can expand hiod_iommufd_vfio_get_cap() to return the hwpt flags, but it >>> feels >>> just as hacky given that I am testing its enablement of the hardware >>> pagetable >>> (HWPT), and not asking a HIOD capability. >> >> arf. yes. >> >>> e.g. hiod_iommufd_vfio_get_cap would make more sense in patch 9 for the >>> attach_device() flow[*], but not for vfio_migration_realize() flow. >>> >>> [*] though feels unneeded as we only have a local callsite, not external >>> user so >>> far. >>> >>> Which would technically make v5.1 patch a more correct right check, perhaps >>> with >>> better layering/naming. >> >> The quick fix (plan B if needed) would be : >> >> @@ -1038,8 +1038,11 @@ bool vfio_migration_realize(VFIODevice * >> } >> >> if ((!vbasedev->dirty_pages_supported || >> - vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF) && >> - !iommufd_hwpt_dirty_tracking(vbasedev->hwpt)) { >> + vbasedev->device_dirty_page_tracking == ON_OFF_AUTO_OFF) >> +#ifdef CONFIG_IOMMUFD >> + && !iommufd_hwpt_dirty_tracking(vbasedev->hwpt) >> +#endif >> + ) { >> if (vbasedev->enable_migration == ON_OFF_AUTO_AUTO) { >> error_setg(&err, >> "%s: VFIO device doesn't support device and " >> >> I would prefer to avoid the common component to reference IOMMUFD >> directly. The only exception today is the use of the vbasedev->iommufd >> pointer which is treated as opaque. >> >> I guess a simple approach would be to store the result of >> iommufd_hwpt_dirty_tracking(hwpt) under a 'dirty_tracking' attribute > > 'iommu_dirty_tracking' may be. 'dirty_tracking' is already used to > track ongoing logging. >
I can consolidate all that into a new VFIODevice attribute, and drop the hwpt_flags it that helps. I'll try to restructure and try to submit a new version before Zhenzhong wakes up. > > > >> of vbasedev and return the value in vfio_device_iommu_dirty_tracking() ? >> >> if not, let's merge v5 (with more acks) and the fix of plan B. >> >> >>>>> I can follow-up with this improvement in case this gets merged as is, >>>> >>>> I can't merge as is since it break compiles (I am excluding the v5.1 >>>> patch). >>>> Which means I would prefer a v6 please. >>>> >>> >>> Ah OK -- I thought this discussion assumed v5.1 to be in which does fix the >>> compilation issue and all that remained were acks. >> >> v5.1 proposes a CONFIG_IOMMUFD in a header file which is error prone. >> >>>>> or include >>>>> it in the next version if you prefer to adjourn this series into 9.2 >>>>> (given >>>>> the >>>>> lack of time to get everything right). >>>> >>>> There aren't many open questions left. >>>> >>>> * PATCH 5 lacks a R-b. I would feel more confortable if ZhenZhong or >>>> Eric acked the changes. >>>> * PATCH 9 is slightly hacky with the use of vfio_device_get_aw_bits(). >>>> I think it's minor. I would also feel more confortable if ZhenZhong >>>> acked the changes. >>> >>> I guess you meant patch 6 and not 9. >> >> yes. >> >> Thanks, >> >> C. >> >> >> >>> >>>> * PATCH 12 needs the fix we have been talking about. >>>> * PATCH 13 is for dev/debug. >>>> >>>> >>>> What's important is to avoid introducing regressions in the current >>>> behavior, >>>> that is when not using IOMMUFD. It looks fine on that aspect AFAICT. >>> >>> OK >>> >> >