> From: Andrew Cooper <andrew.coop...@citrix.com>
> Sent: Wednesday, April 27, 2022 1:52 AM
> 
> Hello,
> 
> Edvin has found a machine with some very weird properties.  It is an HP
> ProLiant BL460c Gen8 with:
> 
>  \-[0000:00]-+-00.0  Intel Corporation Xeon E5/Core i7 DMI2
>              +-01.0-[11]--
>              +-01.1-[02]--
>              +-02.0-[04]--+-00.0  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.1  Emulex Corporation OneConnect 10Gb NIC (be3)
>              |            +-00.2  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
>              |            \-00.3  Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3)
> 
> yet all 4 other functions on the device periodically hit IOMMU faults
> (~once every 5 mins, so definitely stats).
> 
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.4] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.5] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.6] fault addr bdf80000
> (XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.7] fault addr bdf80000
> 
> There are several RMRRs covering these devices, with:
> 
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
> (XEN) [VT-D] endpoint: 0000:03:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.0
> (XEN) [VT-D] endpoint: 0000:01:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.0
> (XEN) [VT-D] endpoint: 0000:04:00.1
> (XEN) [VT-D] endpoint: 0000:04:00.2
> (XEN) [VT-D] endpoint: 0000:04:00.3
> (XEN) [VT-D]dmar.c:608:   RMRR region: base_addr bdf8f000 end_addr bdf92fff
> 
> being the one relevant to these faults.  I've not manually decoded the
> DMAR table because device paths are horrible to follow but there are at
> least the correct number of endpoints.  The functions all have SR-IOV
> (disabled) and ARI (enabled).  None have any Phantom functions described.
> 
> Specifying pci-phantom=04:00,1 does appear to work around the faults,
> but it's not right, because functions 1 thru 3 aren't actually phantom.
> 
> Also, I don't see any logic which actually wires up phantom functions
> like this to share RMRRs/IVMDs in IO contexts.  The faults only
> disappear as a side effect of 04:00.0 and 04:00.4 being in dom0, as far
> as I can tell.
> 
> Simply giving the RMRR via rmrr= doesn't work (presumably because of no
> patching actual devices, but there's no warning), but it feels as if it
> ought to.
> 

What is the Xen version? Does it include Jan's change for per-device
quarantine?

Btw, it's odd that those NIC devices require an RMRR in the first place...
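
For reference, the mismatch between the faulting functions and the RMRR endpoint list can be cross-checked mechanically. Below is a minimal sketch in Python (it is not part of Xen). The regexes assume the log line format shown in the quoted output, and the helper names are hypothetical:

```python
import re

# Patterns assume the "(XEN) [VT-D]" log format quoted above.
BDF = r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]"
FAULT_RE = re.compile(r"Request device \[(" + BDF + r")\]")
ENDPOINT_RE = re.compile(r"endpoint: (" + BDF + r")")


def faulting_devices(log):
    """Return the sorted, de-duplicated BDFs named in DMAR fault lines."""
    return sorted(set(FAULT_RE.findall(log)))


def rmrr_endpoints(log):
    """Return the set of BDFs listed as RMRR endpoints."""
    return set(ENDPOINT_RE.findall(log))


def uncovered_devices(log):
    """Faulting BDFs that do not appear in any RMRR endpoint list."""
    endpoints = rmrr_endpoints(log)
    return [bdf for bdf in faulting_devices(log) if bdf not in endpoints]


# Log excerpt copied from the report above.
LOG = """\
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.4] fault addr bdf80000
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.5] fault addr bdf80000
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.6] fault addr bdf80000
(XEN) [VT-D]DMAR:[DMA Write] Request device [0000:04:00.7] fault addr bdf80000
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:03:00.0
(XEN) [VT-D] endpoint: 0000:01:00.0
(XEN) [VT-D] endpoint: 0000:01:00.2
(XEN) [VT-D] endpoint: 0000:04:00.0
(XEN) [VT-D] endpoint: 0000:04:00.1
(XEN) [VT-D] endpoint: 0000:04:00.2
(XEN) [VT-D] endpoint: 0000:04:00.3
"""

if __name__ == "__main__":
    for bdf in uncovered_devices(LOG):
        print("faulting device not listed in any RMRR:", bdf)
```

Run against the excerpt above, this reports functions 04:00.4 through 04:00.7, i.e. exactly the requester IDs that lspci does not enumerate and that the RMRR does not name.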