viommu: Add IOMMUFD_CMD_HW_QUEUE_ALLOC ioctl

Nicolin Chen Thu, 15 May 2025 11:17:09 -0700

On Thu, May 15, 2025 at 01:06:20PM -0300, Jason Gunthorpe wrote:
> On Thu, May 08, 2025 at 08:02:32PM -0700, Nicolin Chen wrote:
> > +/**
> > + * struct iommu_hw_queue_alloc - ioctl(IOMMU_HW_QUEUE_ALLOC)
> > + * @size: sizeof(struct iommu_hw_queue_alloc)
> > + * @flags: Must be 0
> > + * @viommu_id: Virtual IOMMU ID to associate the HW queue with
> > + * @type: One of enum iommu_hw_queue_type
> > + * @index: The logical index to the HW queue per virtual IOMMU for a 
> > multi-queue
> > + *         model
> > + * @out_hw_queue_id: The ID of the new HW queue
> > + * @base_addr: Base address of the queue memory in guest physical address 
> > space
> > + * @length: Length of the queue memory in the guest physical address space
> > + *
> > + * Allocate a HW queue object for a vIOMMU-specific HW-accelerated queue, 
> > which
> > + * allows HW to access a guest queue memory described by @base_addr and 
> > @length.
> > + * Upon success, the underlying physical pages of the guest queue memory 
> > will be
> > + * pinned to prevent VMM from unmapping them in the IOAS until the HW 
> > queue gets
> > + * destroyed.
> 
> Do we have way to make the pinning optional?
> 
> As I understand AMD's system the iommu HW itself translates the
> base_addr through the S2 page table automatically, so it doesn't need
> pinned memory and physical addresses but just the IOVA.


Right. That's why we invented a flag, and it should be probably
extended to cover the pin step as well.

> Perhaps for this reason the pinning should be done with a function
> call from the driver?

But the whole point of doing in the core was to avoid the entire
iopt related structure/function from being exposed to the driver,
which would otherwise hugely increase the size of the driver.o?

> > +struct iommu_hw_queue_alloc {
> > +   __u32 size;
> > +   __u32 flags;
> > +   __u32 viommu_id;
> > +   __u32 type;
> > +   __u32 index;
> > +   __u32 out_hw_queue_id;
> > +   __aligned_u64 base_addr;
> 
> base addr should probably be called nesting_parent_iova  to match how
> we described the viommu hwpt:
> 
>  * @hwpt_id: ID of a nesting parent HWPT to associate to

Ack.

> > +   /*
> > +    * The underlying physical pages must be pinned to prevent them from
> > +    * being unmapped (via IOMMUFD_CMD_IOAS_UNMAP) during the life cycle
> > +    * of the HW QUEUE object.
> > +    */
> > +   rc = iopt_pin_pages(&hwpt->ioas->iopt, cmd->base_addr, cmd->length,
> > +                       pages, 0);
> 
> I don't think this actually works like this without an unmap
> callback. unmap will break:
> 
>                       iommufd_access_notify_unmap(iopt, area_first, length);
>                       /* Something is not responding to unmap requests. */
>                       tries++;
>                       if (WARN_ON(tries > 100))
>                               return -EDEADLOCK;
> 
> If it can't shoot down the pinning.

Hmm, I thought we want the unmap to fail until VMM releases the HW
QUEUE first? In what case, does VMM wants to unmap while holding
the queue pages?

> Why did this need to change away from just a normal access? That ops
> and unmap callback are not optional things.
> 
> What vcmdq would do in the unmap callback is question I'm not sure
> of..
> 
> This is more reason to put the pin/access in the driver so it can
> provide an unmap callback that can fix it up.

As there are two types of "access" here.. you mean iommufd_access,
i.e. vcmdq driver should hold an iommufd_access like an emulated
vfio device driver?

> I think this should have
> been done just by using the normal access mechanism, maybe with a
> simplifying wrapper for in-driver use. ie no need for patch #9

Still, the driver.o file will be very large, unlike VFIO that just
depends on CONFIG_IOMMUFD?

Thanks
Nicolin

Re: [PATCH v4 11/23] iommufd/viommu: Add IOMMUFD_CMD_HW_QUEUE_ALLOC ioctl

Reply via email to