Dmitry Osipenko <dmitry.osipe...@collabora.com> writes: > On 4/28/25 15:59, Alex Bennée wrote: >> From: Manos Pitsidianakis <manos.pitsidiana...@linaro.org> >> >> This commit fixes an indefinite hang when using VIRTIO GPU blob objects >> under TCG in certain conditions. >> >> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a >> MemoryRegion and attaches it to an offset on a PCI BAR of the >> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps >> it. >> >> Because virglrenderer commands are not thread-safe they are only >> called on the main context and QEMU performs the cleanup in three steps >> to prevent a use-after-free scenario where the guest can access the >> region after it’s unmapped: >> >> 1. From the main context, the region’s field finish_unmapping is false >> by default, so it sets a variable cmd_suspended, increases the >> renderer_blocked variable, deletes the blob subregion, and unparents >> the blob subregion causing its reference count to decrement. >> >> 2. From an RCU context, the MemoryView gets freed, the FlatView gets >> recalculated, the free callback of the blob region >> virtio_gpu_virgl_hostmem_region_free is called which sets the >> region’s field finish_unmapping to true, allowing the main thread >> context to finish replying to the command >> >> 3. From the main context, the command is processed again, but this time >> finish_unmapping is true, so virgl_renderer_resource_unmap can be >> called and a response is sent to the guest. >> >> It happens so that under TCG, if the guest has no timers configured (and >> thus no interrupt will cause the CPU to exit), the RCU thread does not >> have enough time to grab the locks and recalculate the FlatView. >> >> That’s not a big problem in practice since most guests will assume a >> response will happen later in time and go on to do different things, >> potentially triggering interrupts and allowing the RCU context to run. >> If the guest waits for the unmap command to complete though, it blocks >> indefinitely. Attaching to the QEMU monitor and force quitting the guest >> allows the cleanup to continue. >> >> There's no reason why the FlatView recalculation can't occur right away >> when we delete the blob subregion, however. It does not, because when we >> create the subregion we set the object as its own parent: >> >> memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); >> >> The extra reference is what prevents freeing the memory region object in >> the memory transaction of deleting the subregion. >> >> This commit changes the owner object to the device, which removes the >> extra owner reference in the memory region and causes the MR to be >> freed right away in the main context. >> >> Acked-by: Michael S. Tsirkin <m...@redhat.com> >> Signed-off-by: Manos Pitsidianakis <manos.pitsidiana...@linaro.org> >> Reviewed-by: Alex Bennée <alex.ben...@linaro.org> >> Tested-by: Alex Bennée <alex.ben...@linaro.org> >> Message-Id: <20250410122643.1747913-3-manos.pitsidiana...@linaro.org> >> Cc: qemu-sta...@nongnu.org >> --- >> hw/display/virtio-gpu-virgl.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c >> index 71a7500de9..8fbe4e70cc 100644 >> --- a/hw/display/virtio-gpu-virgl.c >> +++ b/hw/display/virtio-gpu-virgl.c >> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g, >> vmr->g = g; >> mr = g_new0(MemoryRegion, 1); >> >> - memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); >> + memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data); >> memory_region_add_subregion(&b->hostmem, offset, mr); >> memory_region_set_enabled(mr, true); >> > > This change makes QEMU to crash.
What is your command line to cause the crash? > > AFAICT, it effectively reverts code to old bugged version [1] that was > rejected in the past. > > +Akihiko Odaki > > [1] > https://lore.kernel.org/qemu-devel/20230915111130.24064-10-ray.hu...@amd.com/ -- Alex Bennée Virtualisation Tech Lead @ Linaro