On 09/10/2017 17:15, Waiman Long wrote: > On 10/09/2017 09:39 AM, Paolo Bonzini wrote: >> On 06/10/2017 23:52, Eduardo Habkost wrote: >>> This series enables kvm_pv_unhalt by default on pc-*-2.11 and >>> newer. >> >> I've discussed PV spinlocks with some folks at Microsoft for a few weeks >> now, and I'm not 100% sure about enabling kvm_pv_unhalt by default. >> It's probably a good idea overall, but it does come with some caveats. >> >> The problem is that there were two different implementations of fair >> spinlocks in Linux, ticket spinlocks and queued spinlocks. When >> kvm_pv_unhalt is disabled, ticket spinlocks sucked badly indeed; queued >> spinlocks however simply revert to unfair spinlocks, which loses the >> fairness but has the best performance. See virt_spin_lock in >> arch/x86/include/asm/qspinlock.h. >> >> Now, fair spinlocks are only really needed for large NUMA machines. >> With a single NUMA node, for example, test-and-set spinlocks work well >> enough; there's not _much_ need for fairness in practice, and the >> paravirtualization does introduce some overhead. >> >> Therefore, the best performance would be achieved with kvm_pv_unhalt >> disabled on small VMs, and enabled on large VMs spanning multiple host >> NUMA nodes. >> >> Waiman, Davidlohr, do you have an opinion on this as well? > > I agree. Using unfair lock in a small VM is good for performance. The > only problem I see is how do we define what is small. Now, even a > machine with a single socket can have up to 28 cores, 56 threads. If a > VM has up to 56 vCPUs, we may still want pvqspinlock to be used.
Yes. Another possibility is to enable it when there is >1 NUMA node in the guest. We generally don't do this kind of magic but higher layers (oVirt/OpenStack) do. It also depends on how worse performance is with PV qspinlocks, especially since now there's also vcpu_is_preempted that has to be thrown in the mix. A lot more testing is needed. > Is the kvm_pv_unhalt flag a global one for all VMs within a machine? Or > can it be different for each VM? We may want to have this flag > dynamically determined depending on the configuration of the VM. It's per-VM, so that's easy. Paolo