This series adds a synchronization profiler, similar to the one described in https://www.usenix.org/system/files/conference/atc12/atc12-final237.pdf , although without using perf counters.
The profiler lets us measure wait times on locks and condvars and see where they come from. This is very useful for identifying scalability bottlenecks imposed by locks, particularly the BQL. I have patches (currently out of tree) to switch the BQL for per-CPU locks to keep track of CPU state; the profiler was really useful when doing that work.

The profiler is disabled by default, and can be enabled by configuring with --enable-sync-profiler. Overhead when enabled is pretty low, though; see patch 1's commit log for details.

You can fetch the patches from:
  https://github.com/cota/qemu/tree/sync-profiler

Note that checkpatch gives some warnings, but they are false positives.

Thanks,

		Emilio