On Tue, Oct 8, 2024 at 2:14 AM Si-Wei Liu <si-wei....@oracle.com> wrote:
>
>
>
> On 10/7/2024 6:51 AM, Eugenio Perez Martin wrote:
> > On Fri, Oct 4, 2024 at 8:48 PM Jonah Palmer <jonah.pal...@oracle.com> wrote:
> >>
> >>
> >> On 10/4/24 11:17 AM, Eugenio Perez Martin wrote:
> >>> On Fri, Oct 4, 2024 at 2:45 PM Jonah Palmer <jonah.pal...@oracle.com> 
> >>> wrote:
> >>>> Implements the IOVA->GPA tree for handling mapping, unmapping, and
> >>>> translations for guest memory regions.
> >>>>
> >>>> When the guest has overlapping memory regions, an HVA to IOVA translation
> >>>> may return an incorrect IOVA when searching the IOVA->HVA tree. This is
> >>>> due to one HVA range being contained (overlapping) in another HVA range
> >>>> in the IOVA->HVA tree. By creating an IOVA->GPA tree, we can use GPAs to
> >>>> translate and find the correct IOVA for guest memory regions.
> >>>>
> >>> Yes, this first patch is super close to what I meant, just one issue
> >>> and a pair of nits here and there.
> >>>
> >>> I'd leave the second patch as an optimization on top, if the numbers
> >>> prove that adding the code is worth it.
> >>>
> >> Ah okay, gotcha. I also wasn't sure whether you wanted what you
> >> mentioned below on the previous series implemented as well, or if
> >> those would also be optimizations on top.
> >>
> >> [Adding code to the vhost_iova_tree layer for handling multiple buffers
> >> returned from translation for the memory area where each iovec covers]:
> >> -----------------------------------------------------------------------
> >> "Let's say that SVQ wants to translate the HVA range
> >> 0xfeda0000-0xfedd0000. So it makes available for the device two
> >> chained buffers: one with addr=0x1000 len=0x20000, and the other one
> >> with addr=0x20000c1000 len=0x10000.
> >>
> >> The VirtIO device should be able to translate these two buffers in
> >> isolation and chain them. Not optimal but it helps to keep QEMU source
> >> clean, as the device already must support it."
> >>
> > This is 100% in the device and QEMU is already able to split the
> > buffers that way, so we don't need any change in QEMU.
> Note that when working with the full HVA tree directly, the internal
> iova tree's linear iterator, iova_tree_find_address_iterator(), does
> not guarantee that the returned IOVA range covers the entire length of
> the iov. An aliased range smaller than the requested one may be hit
> first in the linear search and returned, and neither the VirtIO device
> nor the DMA API mentioned above can guard against that fragmentation.
>
> The second patch in this series mitigates this side effect by
> resolving the backing ram_block with qemu_ram_block_from_host() when
> the iov is backed by guest memory, so it doesn't rely on
> vhost_iova_gpa_tree_find_iova() to find the matching IOVA; instead it
> implicitly avoids the fragmentation side effect described above. I'm
> not saying I like how it is implemented, but I wanted to point out the
> implication if the second patch has to be removed: either add special
> handling to the iova-tree iterator to size the range (the same way
> range selection based on permission will be done), or add special code
> in the SVQ layer to deal with IOVA ranges fragmented by aliasing.
>

This special code is already there in SVQ. And it will be needed even
if we look up the buffers by GPA instead of by vaddr, the same way
virtqueue_map_desc() needs to handle it even though it works with GPA.
Contiguous GPA does not imply contiguous vaddr.
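That last point can be sketched in the same toy style. ToyRegion and toy_map_gpa are hypothetical names, not the virtqueue_map_desc() code itself; the sketch only shows why a mapper must loop region by region, since one GPA-contiguous range can land in host buffers that are not adjacent:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Toy GPA->host mapping; hypothetical, for illustration only. */
typedef struct {
    uint64_t gpa;
    uint64_t len;
    void *host;
} ToyRegion;

/* Fill iovecs for [gpa, gpa+len), one entry per region crossed, in the
 * spirit of virtqueue_map_desc(): chunk by chunk, never assuming that
 * contiguous GPA implies contiguous host virtual addresses. */
static int toy_map_gpa(const ToyRegion *regs, size_t nregs,
                       uint64_t gpa, uint64_t len,
                       struct iovec *iov, int max)
{
    int n = 0;

    while (len > 0) {
        const ToyRegion *r = NULL;
        uint64_t chunk;
        size_t i;

        for (i = 0; i < nregs; i++) {
            if (gpa >= regs[i].gpa && gpa < regs[i].gpa + regs[i].len) {
                r = &regs[i];
                break;
            }
        }
        if (!r || n == max) {
            return -1; /* unmapped GPA, or iov array exhausted */
        }
        chunk = r->gpa + r->len - gpa; /* room left in this region */
        if (chunk > len) {
            chunk = len;
        }
        iov[n].iov_base = (char *)r->host + (gpa - r->gpa);
        iov[n].iov_len = chunk;
        n++;
        gpa += chunk;
        len -= chunk;
    }
    return n;
}
```

E.g. with two adjacent 4 KiB GPA regions backed by two unrelated host buffers, a single request that crosses the boundary comes back as two iovec entries whose iov_base pointers are not adjacent.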

