On Thu, Oct 14, 2021 at 11:52:08AM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 14, 2021 at 03:53:33PM +1100, David Gibson wrote:
>
> > > My feeling is that qemu should be dealing with the host != target
> > > case, not the kernel.
> > >
> > > The kernel's job should be to expose the IOMMU HW it has, with all
> > > features accessible, to userspace.
> >
> > See... to me this is contrary to the point we agreed on above.
>
> I'm not thinking of these as exclusive ideas.
>
> The IOCTL interface in iommu can quite happily expose:
>   Create IOAS generically
>   Manipulate IOAS generically
>   Create IOAS with IOMMU driver specific attributes
>   HW specific Manipulate IOAS
>
> IOCTL commands all together.
>
> So long as everything is focused on a generic in-kernel IOAS object it
> is fine to have multiple ways in the uAPI to create and manipulate the
> objects.
>
> When I speak about a generic interface I mean "Create IOAS
> generically" - ie a set of IOCTLs that work on most IOMMU HW and can
> be relied upon by things like DPDK/etc to always work and be portable.
> This is why I like "hints" to provide some limited widely applicable
> micro-optimization.
>
> When I said "expose the IOMMU HW it has with all features accessible"
> I mean also providing "Create IOAS with IOMMU driver specific
> attributes".
>
> These other IOCTLs would allow the IOMMU driver to expose every
> configuration knob its HW has, in a natural HW centric language.
> There is no pretense of genericness here, no crazy foo=A, foo=B hidden
> device specific interface.
>
> Think of it as a high level/low level interface to the same thing.

Ok, I see what you mean.

> > Those are certainly wrong, but they came about explicitly by *not*
> > being generic rather than by being too generic. So I'm really
> > confused as to what you're arguing for / against.
>
> IMHO it is not having a PPC specific interface that was the problem,
> it was making the PPC specific interface exclusive to the type 1
> interface. If type 1 continued to work on PPC then DPDK/etc would
> never have learned PPC specific code.

Ok, but the reason this happened is that the initial version of type 1
*could not* be used on PPC. The original Type 1 implicitly promised a
"large" IOVA range beginning at IOVA 0 without any real way of
specifying or discovering how large that range was. Since ppc could
typically only give a 2GiB range at IOVA 0, that wasn't usable.

That's why I say the problem was that type1 was not generic enough. I
believe the current version of Type1 has addressed this - at least
enough to be usable in common cases. But by this time the ppc backend
is already out there, so no-one's had the capacity to go back and make
ppc work with Type1.
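
To make the difference concrete, here is a minimal sketch (not any real VFIO or iommufd API; `IovaWindow` and the numbers are hypothetical illustrations) of a client that assumes a "large" range at IOVA 0 versus one that checks against a reported window:

```python
from dataclasses import dataclass

GIB = 1 << 30


@dataclass(frozen=True)
class IovaWindow:
    """A contiguous range of usable IOVAs: [base, base + size)."""
    base: int
    size: int

    def contains(self, iova: int, length: int) -> bool:
        """True if [iova, iova + length) lies entirely inside the window."""
        return (length > 0
                and self.base <= iova
                and iova + length <= self.base + self.size)


# Illustrative only: the 2GiB window at IOVA 0 described above.
ppc_low_window = IovaWindow(base=0, size=2 * GIB)

# A client that assumes an undiscoverable "large" range may pick 3GiB...
assumed_ok = ppc_low_window.contains(3 * GIB, 64 * 1024)

# ...while a client that is told the window can place mappings inside it.
queried_ok = ppc_low_window.contains(1 * GIB, 64 * 1024)
```

With these made-up values the first mapping falls outside the window and the second fits, which is the whole problem with an implicit promise: the client has no way to know in advance.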

> For iommufd with the high/low interface each IOMMU HW should ask basic
> questions:
>
> - What should the generic high level interface do on this HW?
>   For instance what should 'Create IOAS generically' do for PPC?
>   It should not fail, it should create *something*
>   What is the best thing for DPDK?
>   I guess the 64 bit window is most broadly useful.

Right, which means the kernel must (at least in the common case) have
the capacity to choose and report a non-zero base-IOVA.

Hrm... which makes me think... if we allow this for the common
kernel-managed case, do we even need to have capacity in the high-level
interface for reporting IO holes? If the kernel can choose a non-zero
base, it could just choose on x86 to place its advertised window
above the IO hole.
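
A rough sketch of that idea, purely as an illustration: given an aperture and a list of reserved holes, pick the single largest hole-free range and advertise only that. The function names, the 48-bit aperture and the hole address are assumptions chosen for the example, not taken from any kernel interface.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # (start, end), both inclusive


def usable_ranges(aperture: Range, holes: List[Range]) -> List[Range]:
    """Split the aperture into the ranges left after removing the holes."""
    start, end = aperture
    out: List[Range] = []
    cursor = start
    for h_start, h_end in sorted(holes):
        if h_end < cursor or h_start > end:
            continue  # hole lies entirely outside what is left
        if h_start > cursor:
            out.append((cursor, h_start - 1))
        cursor = max(cursor, h_end + 1)
    if cursor <= end:
        out.append((cursor, end))
    return out


def choose_window(aperture: Range, holes: List[Range]) -> Optional[Range]:
    """Pick the single largest hole-free range to advertise, if any."""
    ranges = usable_ranges(aperture, holes)
    if not ranges:
        return None
    return max(ranges, key=lambda r: r[1] - r[0])


# Illustrative values: a 48-bit aperture with one reserved hole below 4GiB.
aperture = (0, (1 << 48) - 1)
holes = [(0xFEE00000, 0xFEEFFFFF)]
window = choose_window(aperture, holes)
```

Here the advertised window starts just above the hole, so a client of the high-level interface only ever sees one base and one size. The cost is that the range below the hole is given up.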

> - How to accurately describe the HW in terms of standard IOAS objects
>   and where to put HW specific structs to support this.
>
>   This is where PPC would decide how best to expose a control over
>   its low/high window (eg 1,2,3 IOAS). Whatever the IOMMU driver
>   wants, so long as it fits into the kernel IOAS model facing the
>   connected device driver.
>
> QEMU would have IOMMU userspace drivers. One would be the "generic
> driver" using only the high level generic interface. It should work as
> best it can on all HW devices. This is the fallback path you talked
> of.
>
> QEMU would also have HW specific IOMMU userspace drivers that know how
> to operate the exact HW. eg these drivers would know how to use
> userspace page tables, how to form IOPTEs and how to access the
> special features.
>
> This is how QEMU could use an optimized path with nested page tables,
> for instance.

The concept makes sense in general. The devil's in the details, as usual.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
