On Mon, 18 Dec 2017 20:04:23 +1100 Alexey Kardashevskiy <a...@ozlabs.ru> wrote:
> On 18/12/17 16:02, Alex Williamson wrote: > > With recently proposed kernel side vfio-pci changes, the MSI-X vector > > table area can be mmap'd from userspace, allowing direct access to > > non-MSI-X registers within the host page size of this area. However, > > we only get that direct access if QEMU isn't also emulating MSI-X > > within that same page. For x86/64 host, the system page size is 4K > > and the PCI spec recommends a minimum of 4K to 8K alignment to > > separate MSI-X from non-MSI-X registers, therefore only devices which > > don't honor this recommendation would see any improvement from this > > option. The real targets for this feature are hosts where the page > > size exceeds the PCI spec recommended alignment, such as ARM64 systems > > with 64K pages. > > > > This new x-msix-relocation option accepts the following options: > > > > off: Disable MSI-X relocation, use native device config (default) > > auto: Automaically relocate MSI-X MMIO to another BAR or offset > > based on minimum additional MMIO requirement > > bar0..bar5: Specify the target BAR, which will either be extended > > if the BAR exists or added if the BAR slot is available. > > > While I am digesting the patchset, here are some test results. Thanks for testing! > This is the device: > > 00:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 > PCI-Express Fusion-MPT SAS-3 (rev 02) BAR1: > Memory at 210000000000 (64-bit, non-prefetchable) [size=64K] BAR3: > Memory at 210000040000 (64-bit, non-prefetchable) [size=256K] > > Capabilities: [c0] MSI-X: Enable+ Count=96 Masked- > Vector table: BAR=1 offset=0000e000 > PBA: BAR=1 offset=0000f000 > > > Test #1: x-msix-relocation = "off": > > FlatView #1 > AS "memory", root: system > AS "cpu-memory", root: system > Root memory region: system > 0000000000000000-000000007fffffff (prio 0, ram): ppc_spapr.ram > 0000210000000000-000021000000dfff (prio 0, i/o): 0001:03:00.0 BAR 1 > 000021000000e000-000021000000e5ff (prio 0, i/o): msix-table > 000021000000e600-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1 > @000000000000e600 > 0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR 3 > mmaps[0] > > Ok, works. > > > Test #2: x-msix-relocation = "auto": > > FlatView #2 > AS "memory", root: system > AS "cpu-memory", root: system > Root memory region: system > 0000000000000000-000000007fffffff (prio 0, ram): ppc_spapr.ram > 0000200080000000-00002000800005ff (prio 0, i/o): msix-table > 0000200080000600-000020008000ffff (prio 1, i/o): 0001:03:00.0 base BAR 0 > @0000000000000600 > 0000210000000000-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1 > 0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR 3 > mmaps[0] > > > The guest fails probing because the first 64bit BAR is broken. > > lspci: > > Region 0: Memory at 200080000000 (32-bit, prefetchable) [size=64K] > Region 1: Memory at 210000000000 (64-bit, non-prefetchable) [size=64K] > Region 3: Memory at 210000040000 (64-bit, non-prefetchable) [size=256K] > > Capabilities: [c0] MSI-X: Enable- Count=96 Masked- > Vector table: BAR=0 offset=00000000 > PBA: BAR=0 offset=00000600 Why do you suppose it's broken? The added BAR0 is 32bit, it cannot be 64bit since BAR1 is implemented. I don't see anything fundamentally different between this and the working BAR5 test below. > Test #3: x-msix-relocation = "bar1" > > > FlatView #1 > AS "memory", root: system > AS "cpu-memory", root: system > Root memory region: system > 0000000000000000-000000007fffffff (prio 0, ram): ppc_spapr.ram > 0000210000000000-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1 > 0000210000010000-00002100000105ff (prio 0, i/o): msix-table > 0000210000010600-000021000001ffff (prio 1, i/o): 0001:03:00.0 base BAR 1 > @0000000000010600 > 0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR 3 > mmaps[0] > > Ok, works. BAR1 became 128K. However no part of BAR1 was mapped, i.e. > appear as "ramd" in flatview, should it have appeared? > > This is "mtree": > > memory-region: p...@800000020000000.mmio > 0000000000000000-ffffffffffffffff (prio 0, i/o): p...@800000020000000.mmio > 0000210000000000-000021000001ffff (prio 1, i/o): 0001:03:00.0 base BAR 1 > 0000210000000000-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1 > 0000210000010000-00002100000105ff (prio 0, i/o): msix-table > 0000210000010600-000021000001060f (prio 0, i/o): msix-pba [disabled] > 0000210000040000-000021000007ffff (prio 1, i/o): 0001:03:00.0 base BAR 3 > 0000210000040000-000021000007ffff (prio 0, i/o): 0001:03:00.0 BAR 3 > 0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR > 3 mmaps[0] Did you disable vfio_pci_fixup_msix_region() as noted in 0/5? This series doesn't do anything about consuming the new MSI-X mappable flag that you introduced in the kernel, so vfio_pci_fixup_msix_region() will continue to exclude mmap'ing the 64K page overlapping the actual BAR. > Test #4: x-msix-relocation = "bar5" > > The same net result as test #3: it works but BAR1 is not mapped: > > > Region 1: Memory at 210000000000 (64-bit, non-prefetchable) [size=64K] > Region 3: Memory at 210000040000 (64-bit, non-prefetchable) [size=256K] > Region 5: Memory at 200080000000 (32-bit, prefetchable) [size=64K] > > Capabilities: [c0] MSI-X: Enable+ Count=96 Masked- > Vector table: BAR=5 offset=00000000 > PBA: BAR=5 offset=00000600 > > FlatView #0 > AS "memory", root: system > AS "cpu-memory", root: system > Root memory region: system > 0000000000000000-000000007fffffff (prio 0, ram): ppc_spapr.ram > 0000200080000000-00002000800005ff (prio 0, i/o): msix-table > 0000200080000600-000020008000ffff (prio 1, i/o): 0001:03:00.0 base BAR 5 > @0000000000000600 > 0000210000000000-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1 > 0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR 3 > mmaps[0] > > > memory-region: p...@800000020000000.mmio > 0000000000000000-ffffffffffffffff (prio 0, i/o): p...@800000020000000.mmio > 0000000080000000-000000008000ffff (prio 1, i/o): 0001:03:00.0 base BAR 5 > 0000000080000000-00000000800005ff (prio 0, i/o): msix-table > 0000000080000600-000000008000060f (prio 0, i/o): msix-pba [disabled] > 0000210000000000-000021000000ffff (prio 1, i/o): 0001:03:00.0 base BAR 1 > 0000210000000000-000021000000ffff (prio 0, i/o): 0001:03:00.0 BAR 1 > 0000210000040000-000021000007ffff (prio 1, i/o): 0001:03:00.0 base BAR 3 > 0000210000040000-000021000007ffff (prio 0, i/o): 0001:03:00.0 BAR 3 > 0000210000040000-000021000007ffff (prio 0, ramd): 0001:03:00.0 BAR > 3 mmaps[0] As above, you won't get the mmap without disabling the implicit page exclusion. The real question for this case is why does it work while 'auto' came up with a nearly identical layout, swapping BAR5 for BAR0 and it did not work. The placement of the BARs is even the same. > and there is also one minor comment below. > > > > @@ -2845,13 +2942,14 @@ static void vfio_realize(PCIDevice *pdev, Error > > **errp) > > > > vfio_pci_size_rom(vdev); > > > > + vfio_bars_prepare(vdev); > > + > > vfio_msix_early_setup(vdev, &err); > > if (err) { > > error_propagate(errp, err); > > goto error; > > } > > > > - vfio_bars_prepare(vdev); > > > This could be in 2/5. It could, but 2/5 was attempting to add the base BAR MemoryRegion and split vfio_bars_setup() into vfio_bars_prepare() and vfio_bars_register() without otherwise changing the ordering. It's only when we want to modify BARs between prepare and register that we need to make this change, thus it's done here. Thanks, Alex