I thought I'd share a trick for improving performance on Windows 7 guests. The tl;dr version is: add <feature policy='disable' name='hypervisor'/> to the <cpu> section of your libvirt XML, like so:

<cpu mode='host-passthrough'>
  <topology sockets='1' cores='3' threads='1'/>
  <feature policy='disable' name='hypervisor'/>
</cpu>

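If you would rather script the change than edit the XML by hand, something like the following might work. It is only a sketch: the function name disable_hypervisor_flag is made up for illustration, and it assumes the domain XML has already been dumped to a string (for example with virsh dumpxml) and that a <cpu> element is already present.

```python
import xml.etree.ElementTree as ET


def disable_hypervisor_flag(domain_xml: str) -> str:
    """Return the domain XML with the 'hypervisor' CPU feature disabled.

    Hypothetical helper, not part of libvirt. It only touches the
    <cpu> element directly under the root <domain> element.
    """
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        raise ValueError("domain XML has no <cpu> element")

    # Reuse an existing <feature name='hypervisor'> entry if there is one,
    # otherwise append a new one.
    for feature in cpu.findall("feature"):
        if feature.get("name") == "hypervisor":
            feature.set("policy", "disable")
            break
    else:
        ET.SubElement(cpu, "feature", {"policy": "disable", "name": "hypervisor"})

    return ET.tostring(root, encoding="unicode")
```

The result can be written to a file and loaded back with virsh define. ElementTree may rewrite namespace prefixes (such as qemu:) in the output, so it is worth diffing the result against the original before defining it.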
The long story is that, according to Microsoft's documentation, "On systems where the TSC is not suitable for timekeeping, Windows automatically selects a platform counter (either the HPET timer or the ACPI PM timer) as the basis for QPC." QPC = QueryPerformanceCounter(), which is a Windows API for getting timing info. Some Red Hat documentation says: "Windows 7 do not use the TSC as a time source if the hypervisor-present bit is set". Instead it falls back on acpi_pm, or on hpet if hpet is enabled in the XML.

The hypervisor-present bit is a fake CPUID flag that QEMU and other hypervisors inject to show the guest that it's running under a hypervisor. This is different from the KVM signature that can be hidden with <kvm><hidden state='on'>. With the hypervisor flag disabled in the libvirt XML, Windows 7 started using TSC as its timing source for me.

Nvidia has a "Timer Function Performance" benchmark on their web page to measure overhead from timers. With acpi_pm the timer query took 3,605 ns on average, and with TSC 12.52 ns. PassMark's CPU floating point performance benchmark, which queries timers 265,000 times/sec, went from 3952 points with acpi_pm to 5594 points with TSC. TSC is so much faster because both acpi_pm and hpet are emulated by QEMU in userspace, while TSC is handled by KVM in kernel space.

All games I've tested query the timer at least 25,000 times/sec. I'm guessing it's the graphics drivers doing that. Some games, like Outlast, query the timer ~275,000 times/sec. The performance of those games is basically limited by how fast the host can do context switches, so I expect the improvement with TSC to be large in those games.

Unfortunately 3DMark's Fire Strike benchmark still does 25,000 queries/sec to acpi_pm even with the hypervisor flag hidden. There must be some other Windows API for using the "platform counter", as Microsoft calls it, but most games don't use it. Unless you are using Windows 7 you'll probably not benefit from this.
Windows 10 is probably using the hypervclock instead.

That Red Hat documentation talking about the hypervisor bit was actually a guide for how to turn off TSC to "resolve guest timing issues". I don't experience any problems myself, but if you have one of those "clocksource tsc unstable" systems this might not work so well.
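
For a quick check inside the guest without the Nvidia tool, a rough sketch along these lines might do. It relies on Python's time.perf_counter_ns(), which is backed by QueryPerformanceCounter() on Windows. The function names are made up for illustration, and the loop includes interpreter overhead, so it will not reproduce a figure like 12.52 ns; a microsecond-range result should still stand out clearly as an emulated timer.

```python
import time


def avg_timer_query_ns(samples: int = 200_000) -> float:
    """Average wall-clock cost of one timer query, in nanoseconds.

    Includes Python loop and call overhead, so treat the result as an
    upper bound rather than the raw cost of the underlying counter.
    """
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()
    elapsed = time.perf_counter_ns() - start
    return elapsed / samples


def share_of_one_second(queries_per_sec: float, ns_per_query: float) -> float:
    """Fraction of each second spent in timer queries.

    Assumes the queries are serialized on a single thread.
    """
    return queries_per_sec * ns_per_query / 1e9


if __name__ == "__main__":
    print(f"average timer query: {avg_timer_query_ns():.1f} ns")
    print(f"acpi_pm at 275,000/sec: {share_of_one_second(275_000, 3605):.4f}")
    print(f"TSC at 275,000/sec: {share_of_one_second(275_000, 12.52):.6f}")
```

Plugging in the numbers from above: 275,000 queries/sec at 3,605 ns each comes to about 0.99 seconds of timer queries per second, against roughly 3.4 ms per second at 12.52 ns.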