On Fri, Feb 28, 2025 at 03:25:52PM -0500, Jason Andryuk wrote:
> On 2025-02-28 04:36, Roger Pau Monné wrote:
> > On Thu, Feb 27, 2025 at 01:28:11PM -0500, Jason Andryuk wrote:
> > > On 2025-02-27 05:23, Roger Pau Monné wrote:
> > > > On Wed, Feb 26, 2025 at 04:11:25PM -0500, Jason Andryuk wrote:
> > > > > To work around this, we can, for per-device IRTs, program the hardware
> > > > > to use the guest data & associated IRTE.  The address doesn't matter
> > > > > since the IRTE handles that, and the Xen address & vector can be used
> > > > > as expected.
> > > >
> > > > All this works on AMD because when interrupt remapping is enabled all
> > > > MSIs are handled by the remapping table, while on Intel there's still
> > > > a bit in the MSI address field to signal whether the MSI is using a
> > > > remapping entry, or is using the "compatibility" format (iow: no
> > > > remapping).
> > >
> > > So, on Intel, if the guest hands the device the MSI address, it can
> > > decide to bypass remapping?
> > >
> > > Thanks for providing insight into the Intel inner workings.  That's why
> > > I am asking.
> >
> > Yes, sorry, I'm afraid I don't have any good solution for Intel, at
> > least not anything similar to what you propose to do on AMD-Vi.  I
> > guess we could take a partial solution for AMD-Vi only, but it's
> > sub-optimal from Xen's perspective to have a piece of hardware working
> > fine on AMD and not on Intel.
>
> I only need AMD to work ;)
>
> But yeah, I thought I should make an effort to get both working.
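
To make the Intel part of the above concrete: on VT-d the choice between
remapped and "compatibility" delivery is encoded in the MSI address itself,
via the "interrupt format" bit (address bit 4).  A device that has been
handed a raw address/data pair by the guest can generate messages with that
bit clear, which is what the bypass concern above is about.  A minimal
sketch of the distinction (illustrative only, not code taken from Xen or
Linux):

#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants for the VT-d MSI address layout. */
#define MSI_ADDR_BASE_MASK   0xfff00000u
#define MSI_ADDR_BASE        0xfee00000u  /* MSI messages target this window */
#define MSI_ADDR_REMAPPABLE  (1u << 4)    /* VT-d "interrupt format" bit */

/*
 * Return true if an MSI address uses the remappable format (translated
 * through an IRTE), false if it uses the compatibility format and hence
 * bypasses the interrupt remapping table.
 */
static bool msi_addr_is_remappable(uint32_t addr)
{
    return (addr & MSI_ADDR_BASE_MASK) == MSI_ADDR_BASE &&
           (addr & MSI_ADDR_REMAPPABLE);
}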

Kind of tangential to this approach: do you know which register(s) are used
to store the non-architectural MSI address and data fields?  I'm wondering
whether it would simply be easier to introduce a quirk for this device in
vPCI (and possibly QEMU?) that intercepts writes to the out-of-band MSI
registers.  That should work for both Intel and AMD, but would have the
side effect that Xen would need to intercept accesses to at least a full
page, and possibly forward accesses to adjacent registers.

> > > > > e.g. Replace amd_iommu_perdev_intremap with something generic.
> > > > >
> > > > > The ath11k device supports and tries to enable 32 MSIs.  Linux in PVH
> > > > > dom0 and HVM domU fails enabling 32 and falls back to just 1, so that
> > > > > is all that has been tested.
> > > >
> > > > DYK why it fails to enable 32?
> > >
> > > Not exactly - someone else had the card.  msi_capability_init() failed.
> > > If it ends up in arch_setup_msi_irqs(), only 1 MSI is supported.  But
> > > precisely where the multiple nvecs was denied was not tracked down.
> >
> > Does it also fail on native?  I'm mostly asking because it would be
> > good to get to the bottom of this, so that we don't come up with a
> > partial solution that will break if multi-msi is used later in Linux.
>
> My understanding is native and PV dom0 work with 32, and it's Linux
> deciding not to use multiple MSI.
>
> It might be this:
> static int xen_hvm_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> {
>         int irq, pirq;
>         struct msi_desc *msidesc;
>         struct msi_msg msg;
>
>         if (type == PCI_CAP_ID_MSI && nvec > 1)
>                 return 1;
>
> I'll have to look into this more.

That shouldn't apply to PVH, because XENFEAT_hvm_pirqs is never exposed to
PVH guests, and hence I would expect xen_hvm_setup_msi_irqs() to not get
used (otherwise we have a bug somewhere).

Thanks, Roger.
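
P.S. For reference, the relevant gating lives in arch/x86/pci/xen.c.  From
memory it looks roughly like the sketch below (an approximation, not a
verbatim copy of the kernel source): the Xen-specific MSI hooks, and with
them xen_hvm_setup_msi_irqs() and its nvec > 1 bail-out, are only installed
when XENFEAT_hvm_pirqs is advertised, which never happens for PVH.

/*
 * Rough sketch of the check in Linux's pci_xen_hvm_init() -- approximate,
 * reconstructed from memory rather than copied from the kernel tree.
 */
int __init pci_xen_hvm_init(void)
{
        /*
         * Without XENFEAT_hvm_pirqs (never exposed to PVH guests) the
         * Xen-specific MSI setup hooks are not registered at all, so
         * xen_hvm_setup_msi_irqs() should be unreachable there.
         */
        if (!xen_have_vector_callback || !xen_feature(XENFEAT_hvm_pirqs))
                return 0;

#ifdef CONFIG_PCI_MSI
        /* MSI ops get installed later, once the APIC has been set up. */
        x86_platform.apic_post_init = xen_hvm_msi_init;
#endif
        return 0;
}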