On Tue, 19 Dec 2023 at 10:12, Kevin Wolf <kw...@redhat.com> wrote:
>
> On 04.12.2023 at 17:42, Stefan Hajnoczi wrote:
> > Stop depending on the AioContext lock and instead access
> > SCSIDevice->requests from only one thread at a time:
> > - When the VM is running only the BlockBackend's AioContext may access
> >   the requests list.
> > - When the VM is stopped only the main loop may access the requests
> >   list.
> >
> > These constraints protect the requests list without the need for locking
> > in the I/O code path.
> >
> > Note that multiple IOThreads are not supported yet because the code
> > assumes all SCSIRequests are executed from a single AioContext. Leave
> > that as future work.
> >
> > Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> > Reviewed-by: Eric Blake <ebl...@redhat.com>
>
> This makes qemu-iotests 238 240 245 307 fail for me (tested with qcow2).
>
> The crashes are segfaults and look like below. Maybe the device has gone
> away before the BH was executed? Though in theory we still hold a
> reference to the object.
>
> Kevin
>
>
> (gdb) bt
> #0  scsi_device_for_each_req_async_bh (opaque=0x558b4a2f6e90) at ../hw/scsi/scsi-bus.c:128
> #1  0x0000558b47e1c8e6 in aio_bh_poll (ctx=ctx@entry=0x558b4a518ef0) at ../util/async.c:216
> #2  0x0000558b47e0764a in aio_poll (ctx=0x558b4a518ef0, blocking=blocking@entry=true) at ../util/aio-posix.c:722
> #3  0x0000558b47cb1cd6 in iothread_run (opaque=opaque@entry=0x558b49822a60) at ../iothread.c:63
> #4  0x0000558b47e0a6e8 in qemu_thread_start (args=0x558b4a58d5b0) at ../util/qemu-thread-posix.c:541
> #5  0x00007f992f0ae947 in start_thread () at /lib64/libc.so.6
> #6  0x00007f992f134860 in clone3 () at /lib64/libc.so.6
> (gdb) l
> 123      * If the AioContext changed before this BH was called then reschedule into
> 124      * the new AioContext before accessing ->requests. This can happen when
> 125      * scsi_device_for_each_req_async() is called and then the AioContext is
> 126      * changed before BHs are run.
> 127      */
> 128     ctx = blk_get_aio_context(s->conf.blk);
> 129     if (ctx != qemu_get_current_aio_context()) {
> 130         aio_bh_schedule_oneshot(ctx, scsi_device_for_each_req_async_bh, data);

I forgot that data is g_autofree. This crash can be fixed by:

  aio_bh_schedule_oneshot(ctx, scsi_device_for_each_req_async_bh,
                          g_steal_pointer(&data));

> 131         return;
> 132     }
> (gdb) p s
> $1 = (SCSIDevice *) 0x558b4a2f6
> (gdb) p *s
> Cannot access memory at address 0x558b4a2f6
>
>
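
To spell out why the reschedule path crashes: a g_autofree variable is freed when it goes out of scope, so passing it on to a callback that runs later hands that callback a dangling pointer unless ownership is given up first. Below is a minimal sketch of that pattern, not the QEMU code. It assumes GCC or Clang (for the cleanup attribute). AUTOFREE and steal_pointer() are stand-ins I wrote for GLib's g_autofree and g_steal_pointer(), and Data, pending, schedule() and handler() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for g_autofree: free the pointed-to variable at scope exit. */
static void auto_free_cleanup(void *p)
{
    free(*(void **)p);
}
#define AUTOFREE __attribute__((cleanup(auto_free_cleanup)))

/* Stand-in for g_steal_pointer(): return the pointer and clear the source. */
static void *steal_pointer(void *pp)
{
    void **ptr = pp;
    void *stolen = *ptr;

    *ptr = NULL;
    return stolen;
}

/* Hypothetical payload, playing the role of the BH's opaque data. */
typedef struct {
    int value;
} Data;

/* Hypothetical one-slot "scheduler": remembers the opaque for a later call. */
Data *pending;

void schedule(Data *d)
{
    pending = d;
}

/* Returns 1 if the work was rescheduled, 0 if it ran to completion. */
int handler(Data *opaque, int wrong_context)
{
    AUTOFREE Data *data = opaque;

    assert(data != NULL);
    if (wrong_context) {
        /*
         * Passing plain `data` here would leave `pending` dangling, because
         * the cleanup handler frees it on return. Stealing the pointer sets
         * `data` to NULL, so the cleanup becomes free(NULL), a no-op.
         */
        schedule(steal_pointer(&data));
        return 1;
    }
    return 0; /* `data` is freed here by the cleanup handler */
}
```

Calling handler(d, 1) leaves d alive in pending, and a later handler(pending, 0) frees it exactly once. Replacing steal_pointer(&data) with data reproduces the use-after-free shape seen in the backtrace above.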