On 2025/06/07 0:02, Akihiko Odaki wrote:


On 2025/06/06 19:16, Alex Bennée wrote:
Akihiko Odaki <akihiko.od...@daynix.com> writes:

On 2025/06/05 20:57, Alex Bennée wrote:
Akihiko Odaki <akihiko.od...@daynix.com> writes:

On 2025/06/03 20:01, Alex Bennée wrote:
QOM objects can be embedded in other QOM objects and managed as part
of their lifetime but this isn't the case for
virtio_gpu_virgl_hostmem_region. However before we can split it out we
need some other way of associating the wider data structure with the
memory region.
Fortunately MemoryRegion has an opaque pointer. This is passed down
to
MemoryRegionOps for device type regions but is unused in the
memory_region_init_ram_ptr() case. Use the opaque to carry the
reference and allow the final MemoryRegion object to be reaped when
its reference count is cleared.
Signed-off-by: Manos Pitsidianakis <manos.pitsidiana...@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <phi...@linaro.org>
Signed-off-by: Alex Bennée <alex.ben...@linaro.org>
Message-Id: <20250410122643.1747913-2-manos.pitsidiana...@linaro.org>
Cc: qemu-sta...@nongnu.org
---
    include/system/memory.h       |  1 +
    hw/display/virtio-gpu-virgl.c | 23 ++++++++---------------
    2 files changed, 9 insertions(+), 15 deletions(-)
diff --git a/include/system/memory.h b/include/system/memory.h
index fc35a0dcad..90715ff44a 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -784,6 +784,7 @@ struct MemoryRegion {
        DeviceState *dev;
          const MemoryRegionOps *ops;
+    /* opaque data, used by backends like @ops */
        void *opaque;
        MemoryRegion *container;
        int mapped_via_alias; /* Mapped via an alias, container might be NULL */ diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio- gpu-virgl.c
index 145a0b3879..71a7500de9 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -52,17 +52,11 @@ virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
      #if VIRGL_VERSION_MAJOR >= 1
    struct virtio_gpu_virgl_hostmem_region {
-    MemoryRegion mr;
+    MemoryRegion *mr;
        struct VirtIOGPU *g;
        bool finish_unmapping;
    };
    -static struct virtio_gpu_virgl_hostmem_region *
-to_hostmem_region(MemoryRegion *mr)
-{
-    return container_of(mr, struct virtio_gpu_virgl_hostmem_region, mr);
-}
-
    static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
    {
        VirtIOGPU *g = opaque;
@@ -73,14 +67,12 @@ static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
    static void virtio_gpu_virgl_hostmem_region_free(void *obj)
    {
        MemoryRegion *mr = MEMORY_REGION(obj);
-    struct virtio_gpu_virgl_hostmem_region *vmr;
+    struct virtio_gpu_virgl_hostmem_region *vmr = mr->opaque;
        VirtIOGPUBase *b;
        VirtIOGPUGL *gl;
    -    vmr = to_hostmem_region(mr);
-    vmr->finish_unmapping = true;
-
        b = VIRTIO_GPU_BASE(vmr->g);
+    vmr->finish_unmapping = true;
        b->renderer_blocked--;
          /*
@@ -118,8 +110,8 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
          vmr = g_new0(struct virtio_gpu_virgl_hostmem_region, 1);
        vmr->g = g;
+    mr = g_new0(MemoryRegion, 1);

This patch does nothing more than adding a separate allocation for
MemoryRegion. Besides there is no corresponding g_free(). This patch
can be simply dropped.
As the patch says the MemoryRegion is now free'd when it is
de-referenced. Do you have a test case showing it leaking?

"De-referenced" is confusing and sounds like pointer dereferencing.

OBJECT(mr)->free, which has virtio_gpu_virgl_hostmem_region_free() as
its value, will be called to free mr when the references of mr are
removed. This patch however does not add a corresponding g_free() call
to virtio_gpu_virgl_hostmem_region_free(), leaking mr.

AddressSanitizer will catch the memory leak.

Example invocation?

I ran the AddressSantizier against all the virtio-gpu tests yesterday
and it did not complain.

The following command line triggered the memory leak. The image is a clean Debian 12 installation. I booted the installation, and shut down it by pressing the button on the booted GDM:

build/qemu-system-x86_64 -drive file=debian12.qcow2 -m 8G -smp 8 -device virtio-vga-gl,blob=on,hostmem=1G -display egl-headless,gl=on -vnc :0 -M q35,accel=kvm ==361968==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==361968==WARNING: ASan is ignoring requested __asan_handle_no_return: stack type: default top: 0x7bf41d2b8380; bottom 0x7bf2e0f4c000; size: 0x00013c36c380 (5305189248)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189

=================================================================
==361968==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 816 byte(s) in 3 object(s) allocated from:
    #0 0x7ff640f50a43 in calloc (/lib64/libasan.so.8+0xe6a43) (BuildId: 6a82bb83b1f19d3f3a2118085acf79daa3b52371)     #1 0x7ff64077c901 in g_malloc0 (/lib64/libglib-2.0.so.0+0x48901) (BuildId: 6827394d759bc44f207f57e7ab5f8e6b17e82c1c)     #2 0x557fa8080dc5 in virtio_gpu_virgl_map_resource_blob ../hw/ display/virtio-gpu-virgl.c:113     #3 0x557fa8080dc5 in virgl_cmd_resource_map_blob ../hw/display/ virtio-gpu-virgl.c:772     #4 0x557fa8080dc5 in virtio_gpu_virgl_process_cmd ../hw/display/ virtio-gpu-virgl.c:952

SUMMARY: AddressSanitizer: 816 byte(s) leaked in 3 allocation(s).

The following command line also reproduced the issue:

LIBGL_ALWAYS_SOFTWARE=1 \
LSAN_OPTIONS=suppressions=<(echo leak:fontconfig) \
build/qemu-system-aarch64 -M virt,accel=kvm -cpu host -smp 8 -m 8G \
-device virtio-gpu-gl,blob=on,hostmem=256M -display gtk,gl=on \
-drive file=Fedora-Workstation-Live-42-1.1.aarch64.iso,format=raw \
-bios /usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw \
-serial mon:stdio

Fedora's Live disk allows you to test without installation.

By the way, I tried TCG to reproduce the hang, but I couldn't. I'd appreciate if you tell:

- how to reproduce the issue
- whether the CPU usage saturates with 100 %
  (i.e., if the hang is a busy-loop or not).
- the stack traces of all the threads when the hang happens.

Hopefully they will provide more insights into the problem and your fix.

Regards,
Akihiko Odaki

Reply via email to