On Thu, 29 Feb 2024 11:22:24 +0100 Heinrich Schuchardt <heinrich.schucha...@canonical.com> wrote:
> On 29.02.24 02:11, Peter Xu wrote:
> > On Wed, Feb 28, 2024 at 08:07:47PM +0100, Heinrich Schuchardt wrote:
> >> On 28.02.24 19:39, Peter Maydell wrote:
> >>> On Wed, 28 Feb 2024 at 18:28, Heinrich Schuchardt
> >>> <heinrich.schucha...@canonical.com> wrote:
> >>>>
> >>>> On 28.02.24 16:06, Philippe Mathieu-Daudé wrote:
> >>>>> Hi Heinrich,
> >>>>>
> >>>>> On 28/2/24 13:59, Heinrich Schuchardt wrote:
> >>>>>> virtqueue_map_desc() is called with values of sz that may exceed
> >>>>>> TARGET_PAGE_SIZE. sz = 0x2800 has been observed.
> >
> > Pure (and possibly stupid) question: why would virtqueue_map_desc() map
> > to !direct mem? Shouldn't those buffers normally be allocated from
> > guest RAM?
> >
> >>>>>>
> >>>>>> We only support a single bounce buffer. We have to avoid
> >>>>>> virtqueue_map_desc() calling address_space_map() multiple times.
> >>>>>> Otherwise we see an error
> >>>>>>
> >>>>>>     qemu: virtio: bogus descriptor or out of resources
> >>>>>>
> >>>>>> Increase the minimum size of the bounce buffer to 0x10000, which
> >>>>>> matches the largest value of TARGET_PAGE_SIZE across all
> >>>>>> architectures.
> >>>>>>
> >>>>>> Signed-off-by: Heinrich Schuchardt <heinrich.schucha...@canonical.com>
> >>>>>> ---
> >>>>>> v2:
> >>>>>>     remove unrelated change
> >>>>>> ---
> >>>>>>   system/physmem.c | 8 ++++++--
> >>>>>>   1 file changed, 6 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/system/physmem.c b/system/physmem.c
> >>>>>> index e3ebc19eef..3c82da1c86 100644
> >>>>>> --- a/system/physmem.c
> >>>>>> +++ b/system/physmem.c
> >>>>>> @@ -3151,8 +3151,12 @@ void *address_space_map(AddressSpace *as,
> >>>>>>               *plen = 0;
> >>>>>>               return NULL;
> >>>>>>           }
> >>>>>> -        /* Avoid unbounded allocations */
> >>>>>> -        l = MIN(l, TARGET_PAGE_SIZE);
> >>>>>> +        /*
> >>>>>> +         * There is only one bounce buffer. The largest occurring
> >>>>>> +         * value of parameter sz of virtqueue_map_desc() must fit
> >>>>>> +         * into the bounce buffer.
> >>>>>> +         */
> >>>>>> +        l = MIN(l, 0x10000);
> >>>>>
> >>>>> Please define this magic value. Maybe ANY_TARGET_PAGE_SIZE or
> >>>>> TARGETS_BIGGEST_PAGE_SIZE?
> >>>>>
> >>>>> Then along:
> >>>>>    QEMU_BUILD_BUG_ON(TARGET_PAGE_SIZE > TARGETS_BIGGEST_PAGE_SIZE);
> >>>>
> >>>> Thank you Philippe for reviewing.
> >>>>
> >>>> TARGETS_BIGGEST_PAGE_SIZE does not fit, as the value is not driven
> >>>> by the page size.
> >>>> How about MIN_BOUNCE_BUFFER_SIZE?
> >>>> Is include/exec/memory.h the right include for the constant?
> >>>>
> >>>> I don't think that TARGET_PAGE_SIZE has any relevance for setting
> >>>> the bounce buffer size. I only mentioned it to say that we are not
> >>>> decreasing the value on any existing architecture.
> >>>>
> >>>> I don't know why TARGET_PAGE_SIZE ever got into this piece of code.
> >>>> e3127ae0cdcd ("exec: reorganize address_space_map") does not provide
> >>>> a reason for this choice. Maybe Paolo remembers.
> >>>
> >>> The limitation to a page dates back to commit 6d16c2f88f2a in 2009,
> >>> which was the first implementation of this function. I don't think
> >>> there's a particular reason for that value beyond that it was
> >>> probably a convenient value that was assumed to be likely "big
> >>> enough".
> >>>
> >>> I think the idea with this bounce-buffer has always been that this
> >>> isn't really a code path we expected to end up in very often --
> >>> it's supposed to be for when devices are doing DMA, which they
> >>> will typically be doing to memory (backed by host RAM), not to
> >>> devices (backed by MMIO and needing a bounce buffer). So the
> >>> whole mechanism is a bit of a "last fallback to stop things
> >>> breaking entirely".
> >>>
> >>> The address_space_map() API says that it's allowed to return
> >>> a subset of the range you ask for, so if the virtio code doesn't
> >>> cope with the minimum being set to TARGET_PAGE_SIZE then either
> >>> we need to fix that virtio code or we need to change the API
> >>> of this function. (But I think you will also get a reduced
> >>> range if you try to use it across a boundary between normal
> >>> host-memory-backed RAM and a device MemoryRegion.)
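For anyone following along who hasn't looked at the code: the scheme Peter
describes reduces to roughly the sketch below. This is a simplified,
self-contained model, not the verbatim system/physmem.c code; the names
map_indirect()/unmap_indirect() are made up for illustration, and
TARGET_PAGE_SIZE is hard-coded here although it is target-dependent in QEMU.

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define TARGET_PAGE_SIZE 4096   /* illustrative; target-dependent in QEMU */

    /* One global buffer guarded by an atomic flag: only a single
     * indirect mapping may be outstanding at any time. */
    static struct {
        void *buffer;
        atomic_bool in_use;
    } bounce;

    /* Indirect (MMIO-backed) path of address_space_map(), reduced to its
     * core: a second concurrent caller gets NULL, and the length is
     * clamped to one page, which is why a 0x2800-byte virtio descriptor
     * cannot be mapped in one go. */
    static void *map_indirect(uint64_t *plen)
    {
        if (atomic_exchange(&bounce.in_use, true)) {
            *plen = 0;          /* buffer already taken: caller must back off */
            return NULL;
        }
        *plen = *plen < TARGET_PAGE_SIZE ? *plen : TARGET_PAGE_SIZE;
        bounce.buffer = malloc(*plen);
        return bounce.buffer;
    }

    static void unmap_indirect(void)
    {
        free(bounce.buffer);
        bounce.buffer = NULL;
        atomic_store(&bounce.in_use, false);
    }

Seen this way, both failure modes in this thread fall out directly: the
clamp produces the shortened mapping, and the in_use flag produces the
"out of resources" collision.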
> >> If we allow a bounce buffer only to be used once (via the in_use
> >> flag), why do we allow only a single bounce buffer?
> >>
> >> Could address_space_map() allocate a new bounce buffer on every call
> >> and address_space_unmap() deallocate it?
> >>
> >> Isn't the design with a single bounce buffer bound to fail with a
> >> multi-threaded client, as collisions can be expected?
> >
> > See:
> >
> > https://lore.kernel.org/r/20240212080617.2559498-1-mniss...@rivosinc.com
> >
> > For some reason that series didn't land, but it seems to be helpful in
> > this case too, e.g. if there can be multiple such devices.
> >
> > Thanks,
>
> Hello Peter Xu,
>
> thanks for pointing to your series. What I like about it is that it
> removes the limit of a single bounce buffer per AddressSpace.
>
> Unfortunately it does not solve my problem. You limit the sum of all of
> the allocations for a single AddressSpace to
> DEFAULT_MAX_BOUNCE_BUFFER_SIZE = 4096, which is too small for my use case.

Agreed. For CXL memory (which indeed hits this, with virtio-blk-pci for
example) I'm carrying a far from subtle patch [1] on top of the series
linked above (which I would very much like to see land asap!)

Note I also need:
https://lore.kernel.org/qemu-devel/20240215142817.1904-1-jonathan.came...@huawei.com/#t
[PATCH 0/3] physmem: Fix MemoryRegion for second access to cached MMIO
Address Space
otherwise the second 8-byte read comes from the wrong address space.

That used to accidentally trigger very large reads, hence the rather
excessive size set below. I haven't checked yet whether a smaller size is
fine, as obviously something as hacky as the patch below isn't going
anywhere! Perhaps we need a rather more global version of
x-max-bounce-buffers from Mattias' series?

[1]

diff --git a/system/physmem.c b/system/physmem.c
index 0a838f1b33..10e9d8da86 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2554,6 +2554,7 @@ static void memory_map_init(void)
     memory_region_init(system_memory, NULL, "system", UINT64_MAX);
     address_space_init(&address_space_memory, system_memory, "memory");
+    address_space_memory.max_bounce_buffer_size = 1024 * 1024 * 1024;

     system_io = g_malloc(sizeof(*system_io));
     memory_region_init_io(system_io, NULL, &unassigned_io_ops, NULL, "io",
                           65536);

> Why do we need a limit?
> Why is it so tiny?
>
> Best regards
>
> Heinrich
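On the "why a limit" question: as I understand Mattias' series, the cap is
simple accounting, presumably so that a guest driving MMIO DMA cannot pin
unbounded host memory through outstanding bounce buffers. A rough model of
that accounting is sketched below; the names AddressSpaceModel,
bounce_alloc() and bounce_free() are illustrative and not the exact
identifiers from the series.

    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
        _Atomic uint64_t bounce_buffer_size; /* bytes currently outstanding */
        uint64_t max_bounce_buffer_size;     /* configurable cap per AddressSpace */
    } AddressSpaceModel;

    /* Reserve len bytes against the cap; fail rather than block, since
     * address_space_map() must not sleep. */
    static void *bounce_alloc(AddressSpaceModel *as, uint64_t len)
    {
        uint64_t before = atomic_fetch_add(&as->bounce_buffer_size, len);
        if (before + len > as->max_bounce_buffer_size) {
            atomic_fetch_sub(&as->bounce_buffer_size, len);
            return NULL;   /* caller sees a failed or shortened mapping */
        }
        return malloc(len);
    }

    static void bounce_free(AddressSpaceModel *as, void *buf, uint64_t len)
    {
        free(buf);
        atomic_fetch_sub(&as->bounce_buffer_size, len);
    }

With that shape, "why is it so tiny" becomes purely a question of the
default for max_bounce_buffer_size, which is exactly what the hack in [1]
bumps and what a more global x-max-bounce-buffers knob would make
configurable.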