Together, the first two patches fix the performance regression induced by QemuSemaphore; neither one fixes it on its own.
The third patch is a small cleanup on top, made possible by the recent introduction of the min_threads/max_threads knobs for the thread pool.

6.2:
   iops        : min=58051, max=62260, avg=60282.57, stdev=1081.18, samples=30
    clat percentiles (usec):   1.00th=[  490],   99.99th=[  775]
   iops        : min=59401, max=61290, avg=60651.27, stdev=468.24, samples=30
    clat percentiles (usec):   1.00th=[  490],   99.99th=[  717]
   iops        : min=59583, max=60816, avg=60353.43, stdev=282.69, samples=30
    clat percentiles (usec):   1.00th=[  490],   99.99th=[  701]
   iops        : min=58099, max=60713, avg=59739.53, stdev=755.49, samples=30
    clat percentiles (usec):   1.00th=[  494],   99.99th=[  717]

patched:
   iops        : min=60616, max=62522, avg=61654.37, stdev=555.67, samples=30
    clat percentiles (usec):   1.00th=[  474],   99.99th=[ 1303]
   iops        : min=61841, max=63600, avg=62878.47, stdev=442.40, samples=30
    clat percentiles (usec):   1.00th=[  465],   99.99th=[  685]
   iops        : min=62976, max=63910, avg=63531.60, stdev=261.05, samples=30
    clat percentiles (usec):   1.00th=[  461],   99.99th=[  693]
   iops        : min=60803, max=63623, avg=62653.37, stdev=808.76, samples=30
    clat percentiles (usec):   1.00th=[  465],   99.99th=[  685]

Paolo

v1->v2: support min_threads/max_threads

Paolo Bonzini (3):
  thread-pool: optimize scheduling of completion bottom half
  thread-pool: replace semaphore with condition variable
  thread-pool: remove stopping variable

 util/thread-pool.c | 70 +++++++++++++++++----------------------------
 1 file changed, 26 insertions(+), 44 deletions(-)

--
2.36.0