In this series, we want to remove the AioContext lock and instead use the already existent job_mutex to protect the job structures and list. This is part of the work to get rid of AioContext lock usage in favour of smaller granularity locks.
In patches 1-3-5-6-7, we split the job API in two headers: job-driver.h and job-monitor.h. As explained in job.c, job-monitor are the functions mainly used by the monitor, and require consistency between the search of a specific job (job_get) and the actual operation/action on it (e.g. job_user_cancel). Therefore job-monitor API assume that the job mutex lock is always held by the caller. job-driver, on the other side, is the collection of functions that are used by the job drivers or core block layer. These functions are not aware of the job mutex, and delegate the locking to the callee instead. We also have job-common.h contains the job struct definition and common functions that are not part of monitor or driver APIs. job.h is left for legacy and to avoid changing all files that include it. After we split the job API, we have a couple of patches that try to prepare and simplify the aiocontext lock replacement with job lock, namely patch 10 introduces AIO_WAIT_WHILE_UNLOCKED (same as the original AIO_WAIT_WHILE but does not release and re-acquires the aiocontext lock if provided). Patch 12 uses job_lock/unlock to protect the job fields and shared structures. Note that this patch just adds the job locks, and in order to simplify the rewiew, does remove any aiocontext lock, creating deadlocks. Patch 13 takes care of the unit tests. The reason why it does not handle possible deadlocks is because doing so would just add additional job_lock/unlock that would still be deleted in next patches together with the aiocontext lock. RFC: not sure if the current patch organization is correct. Bisecting in patches in between this serie would cause tests to deadlock. Example: currently patch 12 has this code static void job_exit(void *opaque) { Job *job = (Job *)opaque; + job_lock(); job_ref(); // assumes the lock held aio_context_acquire(); and then on patch 15: static void job_exit(void *opaque) { Job *job = (Job *)opaque; job_lock(); - job_ref(); // assumes the lock held - aio_context_acquire(); This is not ideal in patch 12, because taking the aiocontext lock after the job lock is already held can cause deadlocks (in the worst case we might want the opposite order), but in order to keep every patch clean we would need to do 2 patches: static void job_exit(void *opaque) { Job *job = (Job *)opaque; + job_lock(); job_ref(); // assumes the lock held + job_unlock(); aio_context_acquire(); + job_lock(); and once the aiocontext is removed: static void job_exit(void *opaque) { Job *job = (Job *)opaque; job_lock(); - job_ref(); // assumes the lock held - job_unlock(); - aio_context_acquire(); - job_lock(); which seems unnecessary. Patch 14 instead starts replacing some aiocontext with job_locks, and patch 15 takes care of completely eradicating them from the job API (with the exception of driver->free()). Again the two patches are kept separated to simplify the life of the reviewer. Tested this series by running unit tests, qemu-iotests and qtests (x86_64). This serie is based on my previous series "block layer: split block APIs in global state and I/O". Based-on: <20211025101735.2060852-1-eespo...@redhat.com> Signed-off-by: Emanuele Giuseppe Esposito <eespo...@redhat.com> Emanuele Giuseppe Esposito (15): jobs: add job-common.h job.c: make job_lock/unlock public job-common.h: categorize fields in struct Job jobs: add job-monitor.h job-monitor.h: define the job monitor API jobs: add job-driver.h job-driver.h: add helper functions job.c: minor adjustments in preparation to job-driver job.c: move inner aiocontext lock in callbacks aio-wait.h: introduce AIO_WAIT_WHILE_UNLOCKED jobs: remove aiocontext locks since the functions are under BQL jobs: protect jobs with job_lock/unlock jobs: use job locks and helpers also in the unit tests jobs: add missing job locks to replace aiocontext lock jobs: remove all unnecessary AioContext locks include/block/aio-wait.h | 15 +- include/qemu/job-common.h | 336 ++++++++++++++++++ include/qemu/job-driver.h | 173 ++++++++++ include/qemu/job-monitor.h | 275 +++++++++++++++ include/qemu/job.h | 569 +------------------------------ block.c | 6 + block/mirror.c | 8 +- block/replication.c | 6 + blockdev.c | 88 ++--- blockjob.c | 62 ++-- job-qmp.c | 54 ++- job.c | 413 ++++++++++++++++------ monitor/qmp-cmds.c | 2 + qemu-img.c | 8 +- tests/unit/test-bdrv-drain.c | 44 +-- tests/unit/test-block-iothread.c | 6 +- tests/unit/test-blockjob-txn.c | 10 + tests/unit/test-blockjob.c | 68 ++-- 18 files changed, 1303 insertions(+), 840 deletions(-) create mode 100644 include/qemu/job-common.h create mode 100644 include/qemu/job-driver.h create mode 100644 include/qemu/job-monitor.h -- 2.27.0