On 3/19/25 1:00 PM, Eric Auger wrote:
Hi,
On 3/19/25 1:23 AM, Jason Gunthorpe wrote:
On Tue, Mar 18, 2025 at 05:22:51PM -0400, Donald Dutile wrote:
I agree with Eric that 'accel' isn't needed -- this should be
ascertained from the pSMMU that a physical device is attached to.
I seem to remember the point was made that we don't actually know if
accel is possible, or desired, especially in the case of hotplug.
That's why I think it would be better if we could instantiate a single
type of device that can do both accel and non-accel mode.
Maybe that would be at the price of always enforcing MSI resv regions on
the guest to ensure MSI nesting is possible.
The accelerated mode has a number of limitations that the software
mode does not have. I think it does make sense that the user would
deliberately choose to use a more restrictive operating mode and then
would have to meet the requirements - eg by creating the required
number and configuration of vSMMUs.
To avoid any misunderstanding, I am not pushing for a single vSMMU
instance. I advocate for having several instances, each somehow
specialized for VFIO devices or emulated devices. Maybe we can opt in
with accel=on but the default could be auto (the property can be
AUTO_ON_OFF) where the code detects if a VFIO device is translated. In
case incompatible devices are translated by the same vSMMU instance, I
guess it could be detected and will fail.
What I am pushing for is to have a single type of QEMU device which can
do both accel and non accel.
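To make the auto/on/off idea above concrete, here is a rough sketch of how one vSMMU instance might resolve its mode. Everything in it is hypothetical: the AccelProp enum, the VsmmuDevices struct and resolve_accel() are illustration names, not existing QEMU code, and the rule that accel=on rejects emulated devices is an assumption about what "incompatible" would mean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tri-state, modelled on the AUTO_ON_OFF idea in the thread. */
typedef enum { ACCEL_AUTO, ACCEL_ON, ACCEL_OFF } AccelProp;

/* Hypothetical summary of the devices translated by one vSMMU instance. */
typedef struct {
    size_t n_vfio;     /* VFIO (passthrough) devices behind this vSMMU */
    size_t n_emulated; /* emulated devices behind this vSMMU */
} VsmmuDevices;

/*
 * Decide whether one vSMMU instance runs in accel mode.
 * Returns 0 and stores the decision in *accel, or -1 when the
 * device mix is incompatible with the requested mode.
 */
static int resolve_accel(AccelProp prop, const VsmmuDevices *devs, bool *accel)
{
    assert(devs != NULL && accel != NULL);

    /* auto: accel is turned on as soon as a VFIO device is translated. */
    bool want_accel = (prop == ACCEL_ON) ||
                      (prop == ACCEL_AUTO && devs->n_vfio > 0);

    /* Assumption: an accel instance cannot also serve emulated devices. */
    if (want_accel && devs->n_emulated > 0) {
        return -1;
    }

    /* A translated VFIO device leaves no choice but accel, so off fails. */
    if (!want_accel && devs->n_vfio > 0) {
        return -1;
    }

    *accel = want_accel;
    return 0;
}
```

With this policy, accel=auto picks accel for a VFIO-only instance and software mode for an emulated-only one, while a mixed instance is rejected up front rather than failing later.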
+1 !
In general I advocate for having several vSMMU instances, each of them
Now... how does vfio(?; why not qemu?) layer determine that? --
where are SMMUv3 'accel' features exposed either: a) in the device
struct (for the smmuv3) or (b) somewhere under sysfs? ... I couldn't
find anything under either on my g-h system, but would appreciate a
ptr if there is.
I think it is not discoverable yet other than through
try-and-fail. Discoverability would probably be some bits in an
iommufd GET_INFO ioctl or something like that.
Yeah, but at least we can easily detect if a VFIO device is being
translated by a vSMMU instance, in which case there is no other choice but
to turn accel on.
Thanks
Eric
and like Eric, although 'accel' is better than the
original 'nested', it's non-obvious what accel feature(s) are being
turned on, or not.
There is really only one accel feature - direct HW usage of the IO
Page table in the guest (no shadowing).
A secondary addon would be direct HW usage of an invalidation queue in
the guest.
kernel boot-param will be needed; if in sysfs, a write to 0 an
enable(disable) it may be an alternative as well. Bottom line: we
need a way to (a) ascertain the accel feature (b) a way to disable
it when it is broken, so qemu's smmuv3 spec will 'just work'.
You'd turn it off by not asking qemu to use it, that is sort of the
reasoning behind the command line opt in for accel or not.
Jason