Roman Kagan <rka...@virtuozzo.com> writes:

> I came across the following AB-BA deadlock:
>
>     vCPU thread                             main thread
>     -----------                             -----------
> async_safe_run_on_cpu(self,
>                       async_synic_update)
> ...                                         [cpu hot-add]
> process_queued_cpu_work()
>   qemu_mutex_unlock_iothread()
>                                             [grab BQL]
>   start_exclusive()                         cpu_list_add()
>   async_synic_update()                        finish_safe_work()
>     qemu_mutex_lock_iothread()                  cpu_exec_start()
>
> ATM async_synic_update seems to be the only async safe work item that
> grabs BQL.  However it isn't quite obvious that it shouldn't; in the
> past there were more examples of this (e.g.
> memory_region_do_invalidate_mmio_ptr).
>
> It looks like the problem is generally in the lack of the nesting rule
> for cpu-exclusive sections against BQL, so I thought I would try to
> address that.  This patchset is my feeble attempt at this; I'm not sure
> I fully comprehend all the consequences (rather, I'm sure I don't) hence
> RFC.

Hmm, I think this is an area touched by:

  Subject: [PATCH v7 00/73] per-CPU locks
  Date: Mon, 4 Mar 2019 13:17:00 -0500
  Message-Id: <20190304181813.8075-1-c...@braap.org>

which has stalled on its path into the tree. Last time I checked, it
explicitly handled the concept of work that needed the BQL and work that
didn't.

How do you trigger your deadlock? Just hot-plugging CPUs?

>
> Roman Kagan (2):
>   cpus-common: nuke finish_safe_work
>   cpus-common: assert BQL nesting within cpu-exclusive sections
>
>  cpus-common.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)

-- 
Alex Bennée