On 6/10/25 12:20, Alexey Kardashevskiy wrote:
On 31/5/25 02:23, Xu Yilun wrote:
On Fri, May 30, 2025 at 12:29:30PM +1000, Alexey Kardashevskiy wrote:
On 30/5/25 00:41, Xu Yilun wrote:
FLR to a bound device is absolutely fine, it just breaks the CC state.
Sometimes that is exactly what the host needs to stop CC immediately.
The problem is in VFIO's pre-FLR handling, so we need to patch VFIO,
not the PCI core.
What is the problem here exactly?
FLR by the host, which is equal to any other PCI error? The guest
may or may not be able to handle it; afaik it does not handle any
errors now, QEMU just stops the guest.
It is about TDX Connect.
According to the dmabuf patchset, the dmabuf needs to be revoked
before FLR. That means KVM unmaps MMIOs while the device is in the
LOCKED/RUN state. That is forbidden by the TDX Module and will crash
KVM.
FLR is something you tell the device to do; how/why would TDX know
about it?
I'm talking about FLR in the VFIO driver. The VFIO driver would zap
BARs before FLR, and the zapping would trigger KVM to unmap MMIOs.
See vfio_pci_zap_bars() for the legacy case, and see [1] for the
dmabuf case.
oh I did not know that we do this zapping, thanks for the pointer.
[1] https://lore.kernel.org/kvm/20250307052248.405803-4-vivek.kasire...@intel.com/
A pure FLR without zapping BARs is absolutely OK.
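
For reference, the problematic ordering, condensed into a rough C
sketch (not the literal upstream code; vfio_pci_zap_bars() is the
real helper in vfio_pci_core.c, but the surrounding flow is
simplified here):

/* Simplified sketch of the problematic pre-FLR ordering in VFIO. */
static int vfio_pci_flr_sketch(struct vfio_pci_core_device *vdev)
{
	/*
	 * Zap userspace BAR mappings before the reset. In the dmabuf
	 * case this revokes the exported MMIO, which makes KVM unmap
	 * it while the TDI is still in LOCKED/RUN state - exactly the
	 * unmap the TDX Module forbids.
	 */
	vfio_pci_zap_bars(vdev);

	/* Only then the actual FLR. */
	return pci_reset_function(vdev->pdev);
}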
Or does it check the TDI state on every map/unmap (unlikely)?
Yeah, the TDX Module would check the TDI state on every unmapping.
_every_? Reading the state from the DOE mailbox is not cheap enough
(imho) to do on every unmap.
Sorry for the confusion. The TDX firmware just checks whether the
STOP TDI firmware call has been executed; it will not check the real
device state via DOE. That means even if the device has physically
exited to UNLOCKED, the TDX host should still issue the STOP TDI
fwcall first, then unmap the MMIO.
So the safer way is
to unbind the TDI first, then revoke MMIOs, then do FLR.
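
Spelled out as a minimal sketch (tdi_stop_fwcall() and
revoke_mmio_dmabuf() are hypothetical placeholder names for the STOP
TDI fwcall and the dmabuf revoke step, not real interfaces):

/* Minimal sketch of the safe teardown ordering for TDX Connect. */
static int tdx_safe_flr(struct vfio_pci_core_device *vdev)
{
	int ret;

	/* 1. STOP TDI first: the TDX firmware only checks that this
	 *    fwcall has been issued, not the real device state. */
	ret = tdi_stop_fwcall(vdev);
	if (ret)
		return ret;

	/* 2. Now the MMIO unmap triggered by revoking the dmabuf is
	 *    permitted by the TDX Module. */
	revoke_mmio_dmabuf(vdev);

	/* 3. Finally, the FLR itself. */
	return pci_reset_function(vdev->pdev);
}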
I'm not sure whether AMD will have the same issue when p2p DMA is
involved.
On AMD, the host can "revoke" at any time; at worst it'll see RMP
events from the IOMMU. Thanks,
Is the RMP event first detected by the host or the guest? If by the host,
Host.
the host could fool the guest by just suppressing the event. The guest
thinks the DMA write succeeded when it did not, which may cause a
security issue.
An RMP event on the host is an indication that the RMP check has
failed and the DMA to the guest did not complete, so the guest won't
see new data. Same as other PCI errors, really. RMP acts like a
firewall: things behind it do not need to know if something was
dropped. Thanks,
Not really, the guest thinks the data has changed but it actually
hasn't, i.e. data integrity is broken.
I am not following, sorry. Integrity is broken when something untrusted
(== other than the SNP guest and the trusted device) manages to write to
the guest's encrypted memory successfully. If nothing is written, the
guest can easily see this and do... nothing? Devices have bugs or
spurious interrupts happen; the guest driver should be able to cope with
that.
Data integrity might not be the most accurate way to describe the
situation here. If I understand correctly, the MMIO mapping was
destroyed before the device was unbound (meaning the guest still sees
the device). When the guest issues a P2P write to the device's MMIO, it
will definitely fail, but the guest won't be aware of this failure.
Imagine this on a bare-metal system: if a P2P access targets a device's
MMIO but the device or platform considers it an illegal access, there
should be a bus error or machine check exception. Alternatively, if the
device supports out-of-band AER, the AER driver should then catch and
process these errors.
Therefore, unbinding the device before MMIO invalidation could generally
avoid this.
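
In other words, the fix would reorder the VFIO sketch from earlier in
this thread; hypothetically (tsm_tdi_unbind() is an illustrative name,
not an existing interface):

/* Hypothetical reordering of the VFIO reset path: unbind the TDI
 * (a guest-visible event) before invalidating MMIO, so a guest P2P
 * write can no longer silently fail against a still-bound device. */
static int vfio_pci_flr_fixed_sketch(struct vfio_pci_core_device *vdev)
{
	int ret;

	/* Unbind first: the guest sees the device leave the trusted
	 * state instead of seeing its writes silently dropped. */
	ret = tsm_tdi_unbind(vdev);
	if (ret)
		return ret;

	/* Only now invalidate the MMIO mappings... */
	vfio_pci_zap_bars(vdev);

	/* ...and do the FLR. */
	return pci_reset_function(vdev->pdev);
}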
Thanks,
baolu