On 4/3/25 22:33, Farhan Ali wrote:
On 4/3/2025 11:05 AM, Alex Williamson wrote:
On Thu, 3 Apr 2025 10:33:52 -0700
Farhan Ali <al...@linux.ibm.com> wrote:
On 4/3/2025 9:27 AM, Alex Williamson wrote:
On Thu, 3 Apr 2025 11:44:42 -0400
Stefan Hajnoczi <stefa...@redhat.com> wrote:
On Thu, Apr 03, 2025 at 09:47:26AM +0200, Niklas Schnelle wrote:
On Wed, 2025-04-02 at 11:51 -0400, Stefan Hajnoczi wrote:
On Tue, Apr 01, 2025 at 10:22:43AM -0700, Farhan Ali wrote:
Hi,
Recently on s390x we have enabled mmap support for vfio-pci devices [1].
Hi Alex,
I wanted to bring this to your attention. Feel free to merge it through
the VFIO tree, otherwise I will merge it once you have taken a look.
Thanks,
Stefan
This allows us to take advantage and use userspace drivers on s390x. However,
on s390x we have special instructions for MMIO access. Starting with z15
(and newer platforms) we have new PCI Memory I/O (MIO) instructions which
operate on virtually mapped PCI memory spaces, and can be used from userspace.
On older platforms we would fallback to using existing system calls for MMIO
access.
This patch series introduces support the PCI MIO instructions, and enables s390x
support for the userspace NVMe driver on s390x. I would appreciate any
review/feedback
on the patches.
Thanks
Farhan
Hi Stefan,
the kernel patch actually made it into Linus' tree for v6.15 already as
commit aa9f168d55dc ("s390/pci: Support mmap() of PCI resources except
for ISM devices") plus prerequisites. This went via the PCI tree
because they included a change to struct pci_dev and also enabled
mmap() on PCI resource files. Alex reviewed an earlier version and was
the one who suggested to also enable mmap() on PCI resources.
The introduction of a new QEMU API for accessing MMIO BARs in this
series is something Alex might be interested in as QEMU VFIO maintainer.
That wouldn't have been part of the kernel patch review.
If he's aware of the new API he can encourage other VFIO users to use it
in the future so that you won't need to convert them to work on s390x
again.
I don't claim any jurisdiction over the vfio-nvme driver. In general
vfio users should be using either vfio_region_ops, ram_device_mem_ops,
or directly mapping MMIO into the VM address space. The first uses
pread/write through the region offset, irrespective of the type of
memory, the second provides the type of access used here where we're
dereferencing into an mmap, and the last if of course the preferred
mechanism where available.
It is curious that the proposal here doesn't include any changes to
ram_device_mem_ops for more generically enabling MMIO access on s390x.
Thanks,
Alex
Hi Alex,
From my understanding the ram_device_mem_ops sets up the BAR access for
a guest passthrough device. Unfortunately today an s390x KVM guest
doesn't use and have support for these MIO instructions. We wanted to
use this series as an initial test vehicle of the mmap support.
Right, ram_device_mem_ops is what we'll use to access a BAR that
supports mmap but for whatever reason we're accessing it directly
through the mmap. For instance if an overlapping quirk prevents the
page from being mapped to the VM or we have some back channel mechanism
where the VMM is interacting with the BAR.
I bring it up here because it's effectively the same kind of access
you're adding with these helpers and would need to be addressed if this
were generically enabling vfio mmap access on s390x.
On s390x the use of the MIO instructions is limited to only PCI access. So i am
not sure if we should generically apply this to all vfio mmap access (for non
PCI devices).
Prior to commit 2b8fe81b3c2e ("system/memory: use ldn_he_p/stn_he_p")
the mmio helpers here might have been a drop-in replacement for the
dereferencing of mmap offsets, but something would need to be done
about the explicit PCI assumption introduced here and the possibility
of unaligned accesses that the noted commit tries to resolve. Thanks,
Alex
AFAICT in qemu today the ram_device_mem_ops is used for non PCI vfio mmap
cases. For s390x these helpers should be restricted to PCI accesses. For the
unaligned accesses (thanks for pointing out that commmit!), are you suggesting
we use the ld*_he_p/st*_he_p functions in the helpers i defined? Though those
functions don't seem to be doing volatile accesses.
I think that's fine. We had the same problem to deal with the XIVE
ESB MMIO pages. See xive_esb_rw() in hw/intc/spapr_xive_kvm.c.
Thanks,
C.