On Tue, 8 Feb 2022 16:01:48 +0000 Peter Maydell <peter.mayd...@linaro.org> wrote:
> On Tue, 8 Feb 2022 at 15:56, Eric Auger <eric.au...@redhat.com> wrote:
> >
> > Hi Peter,
> >
> > On 2/8/22 4:17 PM, Peter Maydell wrote:
> > > On Tue, 8 Feb 2022 at 15:08, Eric Auger <eric.au...@redhat.com> wrote:
> > >> Representing the CRB cmd/response buffer as a standard
> > >> RAM region causes some trouble when the device is used
> > >> with VFIO. Indeed VFIO attempts to DMA_MAP this region
> > >> as usual RAM but this latter does not have a valid page
> > >> size alignment causing such an error report:
> > >> "vfio_listener_region_add received unaligned region".
> > >> To allow VFIO to detect that failing dma mapping
> > >> this region is not an issue, let's use a ram_device
> > >> memory region type instead.
> > > This seems like VFIO's problem to me. There's nothing
> > > that guarantees alignment for memory regions at all,
> > > whether they're RAM, IO or anything else.
> >
> > VFIO dma maps all the guest RAM.
>
> Well, it can if it likes, but "this is a RAM-backed MemoryRegion"
> doesn't imply "this is really guest actual RAM RAM", so if it's
> using that as its discriminator it should probably use something else.
> What is it actually trying to do here ?

VFIO is device agnostic; we don't understand the device programming
model, so we can't know how the device is programmed to perform DMA.
The only way we can provide transparent assignment of arbitrary PCI
devices is to install DMA mappings for everything in the device
AddressSpace through the system IOMMU.

If we were to get a sub-page RAM mapping through the MemoryListener
and that mapping had the possibility of being a DMA target, then we
have a problem, because we cannot represent that through the IOMMU.
If the device were to use that address for DMA, we'd likely have data
loss/corruption in the VM.

AFAIK, and I thought we had some general agreement on this, declaring
device memory as ram_device is the only means we have to differentiate
MemoryRegion segments generated by a device from actual system RAM.
For device memory, we can lean on the fact that peer-to-peer DMA is
much rarer and likely involves some degree of validation by the
drivers, since it can be blocked on physical hardware due to various
topology and chipset related issues. Therefore we can consider
failures to map device memory a lower risk than failures to map
ranges we think are actual system RAM.

Are there better approaches? We can't rely on the device sitting
behind a vIOMMU in the guest to restrict the address space, and we
can't afford the performance hit of dynamic DMA mappings through a
vIOMMU either. Thanks,

Alex
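
For illustration only, a minimal sketch of the ram_device approach being
discussed, using QEMU's memory API. The state struct, field names, and
the size/offset constants below are assumed for the example, not taken
from the actual tpm_crb patch:

#include "qemu/osdep.h"
#include "exec/memory.h"

/*
 * Sketch: back the CRB cmd/response buffer with our own host
 * allocation and expose it as a ram_device region rather than
 * ordinary RAM.  A MemoryListener such as VFIO's can then tell it
 * apart from system RAM via memory_region_is_ram_device() and treat
 * a failure to DMA-map it as non-fatal.
 */
static void crb_init_cmd_buffer(MyCRBState *s)  /* MyCRBState is hypothetical */
{
    /* page-aligned host buffer backing the region */
    s->cmd_buf = qemu_memalign(qemu_real_host_page_size, MY_CMD_BUF_SIZE);
    memory_region_init_ram_device_ptr(&s->cmdmem, OBJECT(s),
                                      "tpm-crb-cmd", MY_CMD_BUF_SIZE,
                                      s->cmd_buf);
    /* expose it at the CRB data buffer offset within the device MMIO */
    memory_region_add_subregion(&s->mmio, MY_CMD_BUFFER_OFFSET, &s->cmdmem);
}

The listener side can then check memory_region_is_ram_device() on the
sections it receives and downgrade a mapping failure for such a region
to a warning rather than an error, which is the behaviour the patch
relies on.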