On Mon, Feb 24, 2025 at 2:57 PM David Hildenbrand <da...@redhat.com> wrote:
>
> On 24.02.25 14:41, Albert Esteve wrote:
> > On Mon, Feb 24, 2025 at 10:49 AM David Hildenbrand <da...@redhat.com> wrote:
> >>
> >> On 24.02.25 10:35, Albert Esteve wrote:
> >>> On Mon, Feb 24, 2025 at 10:16 AM David Hildenbrand <da...@redhat.com> 
> >>> wrote:
> >>>>
> >>>> On 24.02.25 09:54, Albert Esteve wrote:
> >>>>> On Mon, Feb 17, 2025 at 9:01 PM David Hildenbrand <da...@redhat.com> 
> >>>>> wrote:
> >>>>>>
> >>>>>> On 17.02.25 17:40, Albert Esteve wrote:
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> looks like our debugging session was successful :)
> >>>>>>
> >>>>>> One question below.
> >>>>>>
> >>>>>>> v3->v4
> >>>>>>> - Change mmap strategy to use RAM blocks
> >>>>>>>       and subregions.
> >>>>>>> - Add new bitfield to qmp feature map
> >>>>>>> - Followed most review comments from
> >>>>>>>       last iteration.
> >>>>>>> - Merged documentation patch again with
> >>>>>>>       this one. Makes more sense to
> >>>>>>>       review them together after all.
> >>>>>>> - Add documentation for MEM_READ/WRITE
> >>>>>>>       messages.
> >>>>>>>
> >>>>>>> The goal of this patch is to support
> >>>>>>> dynamic fd-backed memory maps initiated
> >>>>>>> from vhost-user backends.
> >>>>>>> There are many devices that could already
> >>>>>>> benefit from this feature, e.g.,
> >>>>>>> virtiofs or virtio-gpu.
> >>>>>>>
> >>>>>>> After receiving the SHMEM_MAP/UNMAP request,
> >>>>>>> the frontend creates the RAMBlock from the
> >>>>>>> fd and maps it by adding it as a subregion
> >>>>>>> of the shared memory region container.
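(As an illustration only, a rough sketch of what this frontend-side handling could look like with QEMU's memory API; the function name, parameters and error handling below are placeholders, not the patch's actual code:)

/* Sketch only: map an fd-backed RAMBlock into the VIRTIO Shared
 * Memory Region container on a SHMEM_MAP request. */
static int handle_shmem_map(VirtIODevice *vdev, MemoryRegion *shmem_container,
                            int fd, uint64_t shm_offset, uint64_t fd_offset,
                            uint64_t len)
{
    MemoryRegion *mr = g_new0(MemoryRegion, 1);
    Error *local_err = NULL;

    /* Create a RAMBlock backed by the fd sent by the backend. */
    memory_region_init_ram_from_fd(mr, OBJECT(vdev), "vhost-user-shmem",
                                   len, RAM_SHARED, fd, fd_offset,
                                   &local_err);
    if (local_err) {
        error_report_err(local_err);
        g_free(mr);
        return -1;
    }

    /* Expose it as a subregion of the shared memory region container. */
    memory_region_add_subregion(shmem_container, shm_offset, mr);
    return 0;
}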
> >>>>>>>
> >>>>>>> The VIRTIO Shared Memory Region list is
> >>>>>>> declared in the `VirtIODevice` struct
> >>>>>>> to make it generic.
> >>>>>>>
> >>>>>>> TODO: There was a conversation on the
> >>>>>>> previous version around adding tests
> >>>>>>> to the patch (which I have acknowledged).
> >>>>>>> However, given the numerous changes
> >>>>>>> that the patch already has, I have
> >>>>>>> decided to send it early and collect
> >>>>>>> some feedback while I work on the
> >>>>>>> tests for the next iteration.
> >>>>>>> Given that I have been able to
> >>>>>>> test the implementation with
> >>>>>>> my local setup, I am more or less
> >>>>>>> confident that, at least, the code
> >>>>>>> is in a relatively sane state
> >>>>>>> so that no reviewing time is
> >>>>>>> wasted on broken patches.
> >>>>>>>
> >>>>>>> This patch also includes:
> >>>>>>> - SHMEM_CONFIG frontend request, specifically
> >>>>>>> meant to allow the generic vhost-user-device
> >>>>>>> frontend to query VIRTIO Shared Memory
> >>>>>>> settings from the backend (as this device is
> >>>>>>> generic and agnostic of the actual backend
> >>>>>>> configuration).
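(Again just for illustration, a generic frontend could consume the SHMEM_CONFIG reply roughly like this; the reply struct below is hypothetical, not the wire format defined by the patch:)

/* Hypothetical reply layout, for illustration only. */
#define MAX_SHMEM_REGIONS 8

typedef struct {
    uint32_t nregions;
    uint64_t region_size[MAX_SHMEM_REGIONS];
} ShmemConfigReply;

/* Create one container MemoryRegion per VIRTIO Shared Memory Region
 * advertised by the backend; later SHMEM_MAP requests would add
 * subregions into these containers. */
static void setup_shmem_regions(VirtIODevice *vdev,
                                const ShmemConfigReply *reply)
{
    for (uint32_t i = 0; i < reply->nregions; i++) {
        MemoryRegion *container = g_new0(MemoryRegion, 1);
        g_autofree char *name = g_strdup_printf("virtio-shmem-%u", i);

        memory_region_init(container, OBJECT(vdev), name,
                           reply->region_size[i]);
        /* The container would then be tracked in the VirtIODevice
         * shared memory region list mentioned above. */
    }
}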
> >>>>>>>
> >>>>>>> - MEM_READ/WRITE backend requests are
> >>>>>>> added to deal with a potential issue when
> >>>>>>> multiple backends share a file descriptor.
> >>>>>>> When one backend calls SHMEM_MAP, access to
> >>>>>>> the region fails for the other backends, as
> >>>>>>> it is missing from their translation table.
> >>>>>>> So these requests are a fallback for when
> >>>>>>> vhost-user memory translation fails.
> >>>>>>
> >>>>>> Can you elaborate what the issue here is?
> >>>>>>
> >>>>>> Why would SHMEM_MAP make accessing the region fail for other backends 
> >>>>>> --
> >>>>>> what makes this missing from their translation?
> >>>>>
> >>>>> This issue was raised by Stefan Hajnoczi in one of the first
> >>>>> iterations of this patchset, based on David Gilbert's previous work
> >>>>> on the virtiofs DAX Window.
> >>>>>
> >>>>> Let me paste here some of his remarks:
> >>>>>
> >>>>> """
> >>>>> Other backends don't see these mappings. If the guest submits a vring
> >>>>> descriptor referencing a mapping to another backend, then that backend
> >>>>> won't be able to access this memory.
> >>>>> """
> >>>>> [...]
> >>>>> """
> >>>>> A bit more detail:
> >>>>>
> >>>>> Device A has a VIRTIO Shared Memory Region. An application mmaps that
> >>>>> memory (examples: guest userspace driver using Linux VFIO, a guest
> >>>>> kernel driver that exposes the memory to userspace via mmap, or guest
> >>>>> kernel DAX). The application passes that memory as an I/O buffer to
> >>>>> device B (e.g. O_DIRECT disk I/O).
> >>>>>
> >>>>> The result is that device B's vhost-user backend receives a vring
> >>>>> descriptor that points to a guest memory address in device A's VIRTIO
> >>>>> Shared Memory Region. Since device B does not have this memory in its
> >>>>> table, it cannot translate the address and the device breaks.
> >>>>> """
> >>>>>
> >>>>> I have not triggered the issue myself. So the idea is that the next
> >>>>> patch will *definitely* include some testing for the commits that I
> >>>>> cannot verify with my local setup.
> >>>>
> >>>> Hah! But isn't that exactly the problem which is now solved by our rework?
> >>>>
> >>>> Whatever is mapped in the VIRTIO Shared Memory Region will be
> >>>> communicated to all other vhost-user devices. So they should have that
> >>>> memory in their map and should be able to access it.
> >>>
> >>> You mean the SET_MEM_TABLE message that is sent to all vhost-user
> >>> devices after the vhost_commit? I was not sure, as I was testing with
> >>> a single device. If so, that would be great and would simplify the
> >>> patch a lot.
> >>
> >> Yes, all vhost-user devices should be updated.
> >
> > Then I think I agree with you: it would seem that this approach
> > naturally solves the issue with address translation among different
> > devices, as they all get the most up-to-date memory table after each
> > mmap.
> >
> > WDYT, @Stefan Hajnoczi ?
> > If we are unsure, maybe we can leave the MEM_READ/WRITE support as a
> > later extension, and try to integrate the rest of this patch first.
>
> As commented offline, maybe one would want the option to enable the
> alternative mode, where such updates (in the SHM region) are not sent to
> vhost-user devices. In such a configuration, the MEM_READ / MEM_WRITE
> would be unavoidable.

At first, I remember we discussed two options: having update messages
sent to all devices (which was deemed potentially racy), or using
MEM_READ / MEM_WRITE messages. With this version of the patch there
is no option to avoid the mem_table update messages, which brings me
back to my point in the previous message: it may make sense to continue
with this patch without MEM_READ/WRITE support, and leave that, along
with the option to make mem_table updates optional, for a follow-up
patch?
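For reference, the fallback idea (if we do keep MEM_READ/WRITE) would look roughly like this on the backend side; vu_gpa_to_va() is libvhost-user's usual translation path, while send_mem_read_request() is just a placeholder for the new backend request, not an existing call:

/* Sketch only: read guest memory, falling back to asking the
 * frontend when the address is not in this backend's memory table. */
static int read_guest(VuDev *dev, uint64_t gpa, void *buf, uint64_t len)
{
    uint64_t plen = len;
    void *va = vu_gpa_to_va(dev, &plen, gpa);

    if (va && plen == len) {
        memcpy(buf, va, len);   /* fast path: address is in our map */
        return 0;
    }
    /* e.g. the address lives in another device's VIRTIO Shared Memory
     * Region that was never added to our memory table. */
    return send_mem_read_request(dev, gpa, buf, len);
}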

>
> What comes to mind are vhost-user devices with limited number of
> supported memslots.
>
> No idea how relevant that really is, and how many SHM regions we will
> see in practice.

In general, from what I see, they usually require 1 or 2 regions,
except for virtio-scmi, which requires >256.

>
> I recently increased the number of supported memslots for rust-vmm and
> libvhost-user, but not sure about other devices (in particular, dpdk,
> spdk, and whether we really care about that here).
>
> --
> Cheers,
>
> David / dhildenb
>

