On Thu, May 15, 2025 at 01:06:20PM -0300, Jason Gunthorpe wrote: > On Thu, May 08, 2025 at 08:02:32PM -0700, Nicolin Chen wrote: > > +/** > > + * struct iommu_hw_queue_alloc - ioctl(IOMMU_HW_QUEUE_ALLOC) > > + * @size: sizeof(struct iommu_hw_queue_alloc) > > + * @flags: Must be 0 > > + * @viommu_id: Virtual IOMMU ID to associate the HW queue with > > + * @type: One of enum iommu_hw_queue_type > > + * @index: The logical index to the HW queue per virtual IOMMU for a > > multi-queue > > + * model > > + * @out_hw_queue_id: The ID of the new HW queue > > + * @base_addr: Base address of the queue memory in guest physical address > > space > > + * @length: Length of the queue memory in the guest physical address space > > + * > > + * Allocate a HW queue object for a vIOMMU-specific HW-accelerated queue, > > which > > + * allows HW to access a guest queue memory described by @base_addr and > > @length. > > + * Upon success, the underlying physical pages of the guest queue memory > > will be > > + * pinned to prevent VMM from unmapping them in the IOAS until the HW > > queue gets > > + * destroyed. > > Do we have way to make the pinning optional? > > As I understand AMD's system the iommu HW itself translates the > base_addr through the S2 page table automatically, so it doesn't need > pinned memory and physical addresses but just the IOVA.
Right. That's why we invented a flag, and it should be probably extended to cover the pin step as well. > Perhaps for this reason the pinning should be done with a function > call from the driver? But the whole point of doing in the core was to avoid the entire iopt related structure/function from being exposed to the driver, which would otherwise hugely increase the size of the driver.o? > > +struct iommu_hw_queue_alloc { > > + __u32 size; > > + __u32 flags; > > + __u32 viommu_id; > > + __u32 type; > > + __u32 index; > > + __u32 out_hw_queue_id; > > + __aligned_u64 base_addr; > > base addr should probably be called nesting_parent_iova to match how > we described the viommu hwpt: > > * @hwpt_id: ID of a nesting parent HWPT to associate to Ack. > > + /* > > + * The underlying physical pages must be pinned to prevent them from > > + * being unmapped (via IOMMUFD_CMD_IOAS_UNMAP) during the life cycle > > + * of the HW QUEUE object. > > + */ > > + rc = iopt_pin_pages(&hwpt->ioas->iopt, cmd->base_addr, cmd->length, > > + pages, 0); > > I don't think this actually works like this without an unmap > callback. unmap will break: > > iommufd_access_notify_unmap(iopt, area_first, length); > /* Something is not responding to unmap requests. */ > tries++; > if (WARN_ON(tries > 100)) > return -EDEADLOCK; > > If it can't shoot down the pinning. Hmm, I thought we want the unmap to fail until VMM releases the HW QUEUE first? In what case, does VMM wants to unmap while holding the queue pages? > Why did this need to change away from just a normal access? That ops > and unmap callback are not optional things. > > What vcmdq would do in the unmap callback is question I'm not sure > of.. > > This is more reason to put the pin/access in the driver so it can > provide an unmap callback that can fix it up. As there are two types of "access" here.. you mean iommufd_access, i.e. vcmdq driver should hold an iommufd_access like an emulated vfio device driver? > I think this should have > been done just by using the normal access mechanism, maybe with a > simplifying wrapper for in-driver use. ie no need for patch #9 Still, the driver.o file will be very large, unlike VFIO that just depends on CONFIG_IOMMUFD? Thanks Nicolin