On 24.02.25 14:41, Albert Esteve wrote:
On Mon, Feb 24, 2025 at 10:49 AM David Hildenbrand <da...@redhat.com> wrote:

On 24.02.25 10:35, Albert Esteve wrote:
On Mon, Feb 24, 2025 at 10:16 AM David Hildenbrand <da...@redhat.com> wrote:

On 24.02.25 09:54, Albert Esteve wrote:
On Mon, Feb 17, 2025 at 9:01 PM David Hildenbrand <da...@redhat.com> wrote:

On 17.02.25 17:40, Albert Esteve wrote:
Hi all,


Hi,

looks like our debugging session was successful :)

One question below.

v3->v4
- Change mmap strategy to use RAM blocks
      and subregions.
- Add new bitfield to the QMP feature map.
- Followed most review comments from the
      last iteration.
- Merged the documentation patch back into
      this one. Makes more sense to
      review them together after all.
- Add documentation for the MEM_READ/WRITE
      messages.

The goal of this patch is to support
dynamic fd-backed memory maps initiated
from vhost-user backends.
There are many devices that could already
benefit from this feature, e.g.,
virtiofs or virtio-gpu.

After receiving the SHMEM_MAP/UNMAP request,
the frontend creates the RAMBlock from the
fd and maps it by adding it as a subregion
of the shared memory region container.

The VIRTIO Shared Memory Region list is
declared in the `VirtIODevice` struct
to make it generic.

TODO: There was a conversation on the
previous version around adding tests
to the patch (which I have acknowledged).
However, given the numerous changes
that the patch already has, I have
decided to send it early and collect
some feedback while I work on the
tests for the next iteration.
Given that I have been able to
test the implementation with
my local setup, I am more or less
confident that, at least, the code
is in a relatively sane state
so that no reviewing time is
wasted on broken patches.

This patch also includes:
- SHMEM_CONFIG frontend request that is
specifically meant to allow generic
vhost-user-device frontend to be able to
query VIRTIO Shared Memory settings from the
backend (as this device is generic and agnostic
of the actual backend configuration).

- MEM_READ/WRITE backend requests are
added to deal with a potential issue when
multiple backends share a file descriptor.
When a backend calls SHMEM_MAP, accesses
to the mapped region fail for other
backends, as the mapping is missing from
their translation tables. These requests
are therefore a fallback for when
vhost-user memory translation fails.

Can you elaborate what the issue here is?

Why would SHMEM_MAP make accessing the region fail for other backends --
what makes this missing from their translation?

This issue was raised by Stefan Hajnoczi in one of the first
iterations of this patchset, based upon previous David Gilbert's work
on the virtiofs DAX Window.

Let me paste here some of his remarks:

"""
Other backends don't see these mappings. If the guest submits a vring
descriptor referencing a mapping to another backend, then that backend
won't be able to access this memory.
"""
[...]
"""
A bit more detail:

Device A has a VIRTIO Shared Memory Region. An application mmaps that
memory (examples: guest userspace driver using Linux VFIO, a guest
kernel driver that exposes the memory to userspace via mmap, or guest
kernel DAX). The application passes that memory as an I/O buffer to
device B (e.g. O_DIRECT disk I/O).

The result is that device B's vhost-user backend receives a vring
descriptor that points to a guest memory address in device A's VIRTIO
Shared Memory Region. Since device B does not have this memory in its
table, it cannot translate the address and the device breaks.
"""

I have not triggered the issue myself. So the idea is that the next
iteration will *definitely* include some testing for the commits that
I cannot verify with my local setup.

Hah! But isn't that exactly the problem that is now solved by our rework?

Whatever is mapped in the VIRTIO Shared Memory Region will be
communicated to all other vhost-user devices. So they should have that
memory in their map and should be able to access it.

You mean the SET_MEM_TABLE message that is sent to all vhost-user
devices after vhost_commit? I was not sure, as I was testing with a
single device. That would be great, and it would simplify the patch a lot.

Yes, all vhost-user devices should be updated.

Then, I think I agree with you, it would seem that this approach
naturally solved the issue with address translation among different
devices, as they all get the most up-to-date memory table after each
mmap.

WDYT, @Stefan Hajnoczi ?
If we are unsure, maybe we can leave the MEM_READ/WRITE support as a
later extension, and try to integrate the rest of this patch first.

As commented offline, maybe one would want the option to enable the alternative mode, where such updates (in the SHM region) are not sent to vhost-user devices. In such a configuration, the MEM_READ / MEM_WRITE would be unavoidable.

What comes to mind are vhost-user devices with limited number of supported memslots.

No idea how relevant that really is, and how many SHM regions we will see in practice.

I recently increased the number of supported memslots for rust-vmm and libvhost-user, but not sure about other devices (in particular, dpdk, spdk, and whether we really care about that here).

--
Cheers,

David / dhildenb

