On Wed, Apr 23, 2025 at 08:55:51AM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 23, 2025 at 08:05:49AM +0000, Tian, Kevin wrote:
> 
> > It's not a good idea having the kernel trust the VMM. 
> 
> It certainly shouldn't trust it, but it can validate the VMM's choice
> and generate a failure if it isn't good.
> 
> > Also I'm not
> > sure the contiguity is guaranteed all the time with huge page
> > (e.g. if just using THP).
> 
> If things are aligned then the contiguity will work out. Ie a 64K
> aligned allocation on a 2M GPA is fine. I don't think there are
> edge cases where a GPA will be fragmented. It does rely on the VMM
> always getting some kind of huge page and then pinning it in iommufd.

QEMU does ensure the alignment when using system huge pages, and
I haven't seen any edge cases so far.
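
To make the alignment argument concrete, here is a small userspace
sketch. The helper name and the power-of-two assumption are mine, for
illustration only; this is not QEMU or kernel code. It checks that a
naturally aligned queue stays inside a single huge page, which is the
case where contiguity follows from alignment:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if [gpa, gpa + size) is naturally aligned
 * and lies entirely within one huge page of hpage_size bytes.
 * Assumes both sizes are powers of two.
 */
bool queue_within_huge_page(uint64_t gpa, uint64_t size, uint64_t hpage_size)
{
	/* Reject empty queues and queues larger than the huge page. */
	if (!size || size > hpage_size)
		return false;
	/* Enforce the power-of-two assumption. */
	if ((size & (size - 1)) || (hpage_size & (hpage_size - 1)))
		return false;
	/* Require natural alignment of the queue base. */
	if (gpa & (size - 1))
		return false;
	/* First and last byte must fall in the same huge page. */
	return (gpa / hpage_size) == ((gpa + size - 1) / hpage_size);
}
```

With these assumptions, a 64K queue at a 64K-aligned GPA backed by a
2M huge page passes, while an unaligned base or a queue larger than
the huge page is rejected.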

> IMHO this is bad HW design, but it is what it is..
> 
> > btw does smmu only read the cmdq or also update some fields
> > in the queue? If the latter, then it also brings a security hole 
> > as a malicious  VMM could violate the contiguity requirement
> > to instruct the smmu to touch pages which don't belong to 
> > it...
> 
> This really must be prevented. I haven't looked closely here, but the
> GPA -> PA mapping should go through the IOAS and that should generate
> a page list and that should be validated for contiguity.
> 
> It also needs to act like a mdev and lock down the part of the IOAS
> that provides that memory so the pin can't be released and UAF things.

If I understand this correctly, the GPA->PA mapping is already done
at the IOAS level for the S2 HWPT/domain, i.e. pages are already
pinned. So we just need a pair of for-driver APIs that validate
the contiguity and refcount the pages by calling iopt_area_add_access().
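
As a rough illustration of the validation half only, a userspace
sketch could look like the following. The function name, the fixed 4K
page size, and the plain array of physical addresses are assumptions
for the example; the real API would walk the pinned pages of the IOAS
area and is not shown here:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the sketch. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical check: true if the page list (one physical address per
 * page) describes a single physically contiguous range.
 */
bool sketch_pages_contiguous(const uint64_t *pa, size_t npages)
{
	size_t i;

	/* An empty or missing list cannot back a queue. */
	if (!pa || !npages)
		return false;
	/* Each page must start exactly one page after the previous one. */
	for (i = 1; i < npages; i++) {
		if (pa[i] != pa[i - 1] + SKETCH_PAGE_SIZE)
			return false;
	}
	return true;
}
```

A failure here would be returned to the VMM as an error, so the
kernel validates the VMM's choice rather than trusting it.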

Thanks
Nicolin
