With this set we finally remove tb_lock. The performance gains when booting a guest are compelling at low core counts. However, beyond 8 cores performance doesn't improve due to unrelated contention--see results in the last patch of the series ("tcg: remove tb_lock").
I have another series that greatly reduces this other contention by using per-CPU locks instead of the BQL to keep track of a subset of CPUState. But that series is pretty large so let's deal with this first.

You can fetch the patches from:
  https://github.com/cota/qemu/tree/tb-lock-removal-redux-v1

Thanks,

		Emilio