On 01/29/2016 10:54 AM, Alex Williamson wrote:
> On Fri, 2016-01-29 at 02:22 +0000, Kay, Allen M wrote:
>>
>>> -----Original Message-----
>>> From: iGVT-g [mailto:igvt-g-boun...@lists.01.org] On Behalf Of Alex
>>> Williamson
>>> Sent: Thursday, January 28, 2016 11:36 AM
>>> To: Gerd Hoffmann; qemu-devel@nongnu.org
>>> Cc: igv...@ml01.01.org; xen-de...@lists.xensource.com; Eduardo Habkost;
>>> Stefano Stabellini; Cao jin; vfio-us...@redhat.com
>>> Subject: Re: [iGVT-g] [vfio-users] [PATCH v3 00/11] igd passthrough
>>> chipset tweaks
>>>
>>>
>>> 1) The OpRegion MemoryRegion is mapped into system_memory through
>>> programming of the 0xFC config space register.
>>>    a) vfio-pci could pick an address to do this as it is realized.
>>>    b) SeaBIOS/OVMF could program this.
>>>
>>> Discussion: 1.a) Avoids any BIOS dependency, but vfio-pci would need to
>>> pick an address and mark it as e820 reserved. I'm not sure how to pick
>>> that address. We'd probably want to make the 0xFC config register
>>> read-only. 1.b) has the issue you mentioned where in most cases the
>>> OpRegion will be 8k, but the BIOS won't know how much address space it's
>>> mapping into system memory when it writes the 0xFC register. I don't
>>> know how much of a problem this is since the BIOS can easily determine
>>> the size once mapped and re-map it somewhere there's sufficient space.
>>> Practically, it seems like it's always going to be 8K. This of course
>>> requires modification to every BIOS. It also leaves the 0xFC register
>>> as a mapping control rather than a pointer to the OpRegion in RAM, which
>>> doesn't really match real hardware. The BIOS would need to pick an
>>> address in this case.
>>>
>>> 2) Read-only mappings version of 1)
>>>
>>> Discussion: Really nothing changes from the issues above, just prevents
>>> any possibility of the guest modifying anything in the host. Xen
>>> apparently allows write access to the host page already.
>>>
>>> 3) Copy OpRegion contents into buffer and do either 1) or 2) above.
>>>
>>> Discussion: No benefit that I can see over above other than maybe
>>> allowing write access that doesn't affect the host.
>>>
>>> 4) Copy contents into a guest RAM location, mark it reserved, point to
>>> it via 0xFC config as scratch register.
>>>    a) Done by QEMU (vfio-pci)
>>>    b) Done by SeaBIOS/OVMF
>>>
>>> Discussion: This is the most like real hardware. 4.a) has the usual
>>> issue of how to pick an address, but the benefit of not requiring BIOS
>>> changes (simply mark the RAM reserved via existing methods). 4.b) would
>>> require passing a buffer containing the contents of the OpRegion via
>>> fw_cfg and letting the BIOS do the setup. The latter of course requires
>>> modifying each BIOS for this support.
>>>
>>> Of course none of these support hotplug nor really can they since
>>> reserved memory regions are not dynamic in the architecture.
>>>
>>> In all cases, some piece of software needs to know where it can place
>>> the OpRegion in guest memory. It seems like there are advantages or
>>> disadvantages whether that's done by QEMU or the BIOS, but we only need
>>> to do it once if it's QEMU. Suggestions, comments, preferences?
>>>
>>
>> Hi Alex, another thing to consider is how to communicate to the guest
>> driver that the address at 0xFC contains a valid GPA that can be accessed
>> by the driver without causing an EPT fault, since the same driver will be
>> used on other hypervisors and they may not EPT map OpRegion memory. One
>> idea proposed by the display driver team is to set bit0 of the address to
>> 1 to indicate that OpRegion memory can be safely accessed by the guest
>> driver.
>
> Hi Allen,
>
> Why is that any different than a guest accessing any other memory area
> that it shouldn't? The OpRegion starts with a 16-byte ID string, so if
> the guest finds that it should feel fairly confident the OpRegion data
> is valid.
> The published spec also seems to define all bits of 0xfc as valid, not
> implying any sort of alignment requirements, and the i915 driver does a
> memremap directly on the value read from 0xfc. So I'm not sure whether
> there's really a need to or ability to define any of those bits in an
> ad hoc way to indicate mapping. If we do things right, the guest driver
> shouldn't even know it's running in a VM, at least for the KVMGT-d case,
> so we need to be compatible with physical hardware. Thanks,
>
I agree. An EPT page fault on a guest OpRegion access is acceptable, as
long as KVM finds a proper PFN for that GPA while handling the fault.
That is exactly what is expected for 'normal' memory.

> Alex
>

--
Thanks,
Jike
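
P.S. To make option 4.a) and the ID-string check concrete, here is a
rough, untested-against-QEMU sketch in plain C. Everything in it is an
assumption for illustration: struct fake_igd, place_opregion() and
guest_find_opregion() are hypothetical names, not QEMU, vfio or i915
code, and guest RAM is modeled as a flat byte array. The signature
literal "IntelGraphicsMem" is what I recall the i915 driver comparing
against; the thread itself only says "16-byte ID string".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPREGION_SIGNATURE "IntelGraphicsMem" /* assumed 16-byte ID string */
#define OPREGION_SIG_LEN   16
#define OPREGION_SIZE      (8 * 1024)         /* "always going to be 8K" */
#define IGD_ASLS           0xFC               /* config offset discussed above */

/* Hypothetical stand-in for the device's 256-byte PCI config space. */
struct fake_igd {
    uint8_t config[256];
};

/* Return 1 if buf starts with the OpRegion ID string, 0 otherwise. */
int opregion_signature_ok(const uint8_t *buf, size_t len)
{
    return len >= OPREGION_SIG_LEN &&
           memcmp(buf, OPREGION_SIGNATURE, OPREGION_SIG_LEN) == 0;
}

/* Store a 32-bit little-endian value at config offset 0xFC. */
void asls_write(struct fake_igd *dev, uint32_t gpa)
{
    for (int i = 0; i < 4; i++)
        dev->config[IGD_ASLS + i] = (uint8_t)(gpa >> (8 * i));
}

/* Load the 32-bit little-endian value at config offset 0xFC. */
uint32_t asls_read(const struct fake_igd *dev)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++)
        v |= (uint32_t)dev->config[IGD_ASLS + i] << (8 * i);
    return v;
}

/*
 * Option 4.a), host side: copy the host OpRegion into guest RAM at gpa
 * and point 0xFC at it. Marking the range e820 reserved is out of scope
 * for this sketch. Returns 0 on success, -1 on a bad source or range.
 */
int place_opregion(struct fake_igd *dev, uint8_t *guest_ram, size_t ram_size,
                   uint32_t gpa, const uint8_t *host_opregion, size_t len)
{
    assert(dev && guest_ram && host_opregion);
    if (!opregion_signature_ok(host_opregion, len))
        return -1;
    if (gpa > ram_size || len > ram_size - gpa)
        return -1;
    memcpy(guest_ram + gpa, host_opregion, len);
    asls_write(dev, gpa);
    return 0;
}

/*
 * Guest side: read 0xFC, treat every bit as address (no flag bits), and
 * trust the region only if the ID string is present.
 */
const uint8_t *guest_find_opregion(const struct fake_igd *dev,
                                   const uint8_t *guest_ram, size_t ram_size)
{
    uint32_t gpa = asls_read(dev);
    if (gpa == 0 || gpa > ram_size || ram_size - gpa < OPREGION_SIG_LEN)
        return NULL;
    return opregion_signature_ok(guest_ram + gpa, ram_size - gpa)
               ? guest_ram + gpa
               : NULL;
}
```

The guest-side helper deliberately does not reserve bit0 as a "safe to
access" flag: it uses the full value from 0xFC as the address and relies
on the ID string alone, which is the behaviour argued for above.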