The completion bottom half was scheduled within the pool->lock critical section. That actually results in worse performance, because the worker thread can run its own small critical section and go to sleep before the bottom half starts running.
Note that this simple change does not produce an improvement without changing the thread pool QemuSemaphore to a condition variable. Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> --- util/thread-pool.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/util/thread-pool.c b/util/thread-pool.c index 196835b4d3..4979f30ca3 100644 --- a/util/thread-pool.c +++ b/util/thread-pool.c @@ -127,9 +127,8 @@ static void *worker_thread(void *opaque) smp_wmb(); req->state = THREAD_DONE; - qemu_mutex_lock(&pool->lock); - qemu_bh_schedule(pool->completion_bh); + qemu_mutex_lock(&pool->lock); } pool->cur_threads--; -- 2.36.0