On Fri, May 15, 2026 at 02:51:38PM +0000, Michael Kelley wrote:
> From: Yu Zhang <[email protected]> Sent: Friday, May 15, 2026 7:00 AM
> >
> > On Thu, May 14, 2026 at 06:13:24PM +0000, Michael Kelley wrote:
> > > From: Yu Zhang <[email protected]> Sent: Monday, May 11, 2026 9:24 AM
> > > >
> > > > Add a para-virtualized IOMMU driver for Linux guests running on Hyper-V.
> > > > This driver implements stage-1 IO translation within the guest OS.
> > > > It integrates with the Linux IOMMU core, utilizing Hyper-V hypercalls
> > > > for:
> > > > - Capability discovery
> > > > - Domain allocation, configuration, and deallocation
> > > > - Device attachment and detachment
> > > > - IOTLB invalidation
> > > >
> > > > The driver constructs x86-compatible stage-1 IO page tables in the
> > > > guest memory using consolidated IO page table helpers. This allows
> > > > the guest to manage stage-1 translations independently of vendor-
> > > > specific drivers (like Intel VT-d or AMD IOMMU).
> > > >
> > > > Hyper-V consumes this stage-1 IO page table when a device domain is
> > > > created and configured, and nests it with the host's stage-2 IO page
> > > > tables, thereby eliminating the VM exits for guest IOMMU mapping
> > > > operations. For unmapping operations, VM exits to perform the IOTLB
> > > > flush are still unavoidable.
> > > >
> > > > Hyper-V identifies each PCI pass-thru device by a logical device ID
> > > > in its hypercall interface. The vPCI driver (pci-hyperv) registers the
> > > > per-bus portion of this ID with the pvIOMMU driver during bus probe.
> > > > The pvIOMMU driver stores this mapping and combines it with the function
> > > > number of the endpoint PCI device to form the complete ID for
> > > > hypercalls.
> > >
> > > As you are probably aware, Mukesh's patch series to support PCI
> > > pass-thru devices also needs to get the logical device ID. Maybe the
> > > registration mechanism needs to move somewhere that can be shared
> > > with his code.
> > >
> >
> > Thank you so much for the review, Michael!
> >
> > Yes, I looked at Mukesh's series and noticed his hv_pci_vmbus_device_id()
> > in pci-hyperv.c has the same dev_instance byte manipulation. We do need
> > a common registration mechanism.
> >
> > Any suggestion on where to put it? drivers/hv/hv_common.c seems like a
> > natural place, but the register/lookup functions are currently only
> > meaningful when CONFIG_HYPERV_PVIOMMU is set. If Mukesh's pass-thru
> > code also needs them, we might need a new shared Kconfig option that
> > both can select. Open to better ideas.
>
> Unfortunately, I have not looked at Mukesh's series in detail yet, so
> I don't have enough knowledge of the full situation to offer a good
> recommendation.
>
Sorry I forgot to Cc Mukesh in the previous reply. :(
@Mukesh, any thoughts on sharing the logical device ID registration mechanism?
> >
> > [...]
> >
> > > > +static void hv_flush_device_domain(struct hv_iommu_domain *hv_domain)
> > > > +{
> > > > + u64 status;
> > > > + unsigned long flags;
> > > > + struct hv_input_flush_device_domain *input;
> > > > +
> > > > + local_irq_save(flags);
> > > > +
> > > > + input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > > > + memset(input, 0, sizeof(*input));
> > > > + input->device_domain = hv_domain->device_domain;
> > >
> > > The previous version of this patch had code to set several other fields in
> > > the input. I wanted to confirm that not setting them in this version is
> > > intentional. Were they not needed?
> > >
> >
> > Oh. The RFC v1 set partition_id, owner_vtl, domain_id.type, and domain_id.id
> > individually. In this version, I just simplified it to a struct assignment.
> > No functional change.
>
> Of course! I should have looked more closely at the details before making
> this comment. :-(
>
> [...]
>
> > >
> > > Previous versions of this function did hv_iommu_detach_dev(). With that
> > > call
> > > removed from here, hv_iommu_detach_dev() is only called when attaching a
> > > domain to a device that already has a domain attached. Is it the case that
> > > Hyper-V doesn't require the detach as a cleanup step?
> > >
> >
> > The IOMMU core attaches the device to release_domain (our blocking domain)
> > before calling release_device(), so I believe the explicit detach in the RFC
> > was redundant. I simply didn't realize that at the time.
> >
>
> Got it. But after the IOMMU core attaches the device to the blocking
> domain, there's the possibility that the vPCI device is rescinded by
> Hyper-V and it goes away entirely. Or the device might be subjected
> to an "unbind/bind" cycle in Linux. Does the detach need to be done
> on the blocking domain in such cases? In this version of the patches, the
> Hyper-V "attach" and "detach" hypercalls still end up unbalanced. That
> seems a bit untidy at best, and I wonder if there are scenarios where
> Hyper-V will complain about the lack of balance.
>
Thank you, Michael. May I ask what "the vPCI device is rescinded by
Hyper-V and it goes away entirely" means?
I realize it's a bit untidy, but I'd like to understand this issue more
clearly first. :)
B.R.
Yu