On Tue, Jun 23, 2015 at 05:32:42PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2015 at 12:08:28PM -0300, Eduardo Habkost wrote:
> > On Tue, Jun 23, 2015 at 02:32:00PM +0200, Andreas Färber wrote:
> > > On 08.06.2015 at 22:18, Jiri Denemark wrote:
> > > >> To help libvirt in the transition, an x86-cpu-model-dump script is
> > > >> provided, that will generate a config file that can be loaded using
> > > >> -readconfig, based on the -cpu and -machine options provided in the
> > > >> command-line.
> > > >
> > > > Thanks Eduardo, I never was a big fan of moving (or copying) all the
> > > > CPU configuration data to libvirt, but now I think it actually makes
> > > > sense. We already have a partial copy of CPU model definitions in
> > > > libvirt anyway, but as QEMU changes some CPU models in some machine
> > > > types (and libvirt does not do that) we have no real control over the
> > > > guest CPU configuration, while what we really want is full control to
> > > > enforce a stable guest ABI.
> > >
> > > That sounds like FUD to me. Any concrete data points where QEMU does
> > > not have a stable ABI for x86 CPUs? That's what we have the pc*-x.y
> > > machines for.
> >
> > What Jiri is saying is that the CPUs change depending on -machine, not
> > that the ABI is broken by a given machine.
> >
> > The problem here is that libvirt needs to provide CPU models whose
> > runnability does not depend on the machine-type. If users have a VM that
> > is running in a host and the VM machine-type changes,
>
> How does it change, and why?

Sometimes we add features to a CPU model because they were not emulated
by KVM before and now they are. Sometimes we remove or add features or
change other fields because we are fixing previous mistakes. Recently we
were going to remove features from models because of an Intel CPU
erratum, but then decided to create a new CPU model name instead. See
some examples at the end of this message.

> > the VM should be
> > still runnable in that host. QEMU doesn't provide that, our CPU models
> > may change when we introduce new machine-types, so we are giving them a
> > mechanism that allows libvirt to implement the policy they need.
>
> I don't mind wrt CPU specifically, but we absolutely do change guest ABI
> in many ways when we change machine types.

All the other ABI changes we introduce in QEMU don't affect runnability
of the VM in a given host; that's the problem we are trying to address
here. ABI changes are expected when changing to a new machine-type,
runnability changes aren't.
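
To make the runnability point concrete, here is a rough sketch. It
assumes the pc-i440fx-2.2/2.3 machine-type names and a host whose TSX
microcode update removed HLE/RTM; the actual change is the
Haswell/Broadwell commit quoted below:

  # pc-*-2.2 Haswell still includes hle/rtm, so "enforce" may refuse
  # to start on a host where the microcode update disabled TSX:
  qemu-system-x86_64 -enable-kvm -machine pc-i440fx-2.2 -cpu Haswell,enforce

  # pc-*-2.3 drops hle/rtm from the same model, so this is expected
  # to start on that same host:
  qemu-system-x86_64 -enable-kvm -machine pc-i440fx-2.3 -cpu Haswell,enforce

  # users who really want the features can still ask for them explicitly:
  qemu-system-x86_64 -enable-kvm -machine pc-i440fx-2.3 -cpu Haswell,+hle,+rtm

The guest ABI differing between machine-types is expected; the point is
that whether the VM can start at all on a given host also changes with
the machine-type.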

Examples of commits changing CPU models:

commit 726a8ff68677d8d5fba17eb0ffb85076bfb598dc
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Fri Apr 10 14:45:00 2015 -0300

    target-i386: Remove AMD feature flag aliases from CPU model table

    When CPU vendor is AMD, the AMD feature alias bits on
    CPUID[0x80000001].EDX are already automatically copied from
    CPUID[1].EDX on x86_cpu_realizefn(). When CPU vendor is Intel, those
    bits are reserved and should be zero. In either case, those bits
    shouldn't be set in the CPU model table.

    Reviewed-by: Igor Mammedov <imamm...@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>

commit 13704e4c455770d500d6b87b117e32f0d01252c9
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Thu Jan 22 17:22:54 2015 -0200

    target-i386: Disable HLE and RTM on Haswell & Broadwell

    All Haswell CPUs and some Broadwell CPUs were updated by Intel to
    have the HLE and RTM features disabled. This will prevent
    "-cpu Haswell,enforce" and "-cpu Broadwell,enforce" from running out
    of the box on those CPUs.

    Disable those features by default on Broadwell and Haswell CPU
    models, starting on pc-*-2.3. Users who want to use those features
    can enable them explicitly on the command-line.

    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

commit 78a611f1936b3eac8ed78a2be2146a742a85212c
Author: Paolo Bonzini <pbonz...@redhat.com>
Date: Fri Dec 5 10:52:46 2014 +0100

    target-i386: add f16c and rdrand to Haswell and Broadwell

    Both were added in Ivy Bridge (for which we do not have a CPU model
    yet!).

    Reviewed-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

commit b3a4f0b1a072a467d003755ca0e55c5be38387cb
Author: Paolo Bonzini <pbonz...@redhat.com>
Date: Wed Dec 10 14:12:41 2014 -0200

    target-i386: add VME to all CPUs

    vm86 mode extensions date back to the 486. All models should have
    them.

    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

commit 0bb0b2d2fe7f645ddaf1f0ff40ac669c9feb4aa1
Author: Paolo Bonzini <pbonz...@redhat.com>
Date: Mon Nov 24 15:54:43 2014 +0100

    target-i386: add feature flags for CPUID[EAX=0xd,ECX=1]

    These represent xsave-related capabilities of the processor, and KVM
    may or may not support them. Add feature bits so that they are
    considered by "-cpu ...,enforce", and use the new feature words
    instead of calling kvm_arch_get_supported_cpuid.

    Bit 3 (XSAVES) is not migratable because it requires saving
    MSR_IA32_XSS. Neither KVM nor any commonly available hardware
    supports it anyway.

    Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

commit e93abc147fa628650bdbe7fd57f27462ca40a3c2
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Fri Oct 3 16:39:50 2014 -0300

    target-i386: Don't enable nested VMX by default

    TCG doesn't support VMX, and nested VMX is not enabled by default in
    the KVM kernel module. So, there's no reason to have VMX enabled by
    default on the core2duo and coreduo CPU models, today. Even the
    newer Intel CPU model definitions don't have it enabled.

    In this case, we need machine-type compat code, as people may be
    running the older machine-types on hosts that had VMX nesting
    enabled.

    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit f8e6a11aecc96e9d8a84f17d7c07019471714e20
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Tue Sep 10 17:48:59 2013 -0300

    target-i386: Set model=6 on qemu64 & qemu32 CPU models

    There's no Intel CPU with family=6,model=2, and Linux and Windows
    guests disable SEP when seeing that combination due to Pentium Pro
    erratum #82. In addition to just having SEP ignored by guests, Skype
    (and maybe other applications) runs sysenter directly without
    passing through ntdll on Windows, and crashes because Windows
    ignored the SEP CPUID bit.

    So, having model > 2 is a better default on qemu64 and qemu32 for
    two reasons: making SEP really available for guests, and avoiding
    crashing applications that work on bare metal.

    model=3 would fix the problem, but it causes CPU enumeration
    problems for Windows guests[1]. So let's set model=6, which matches
    "Athlon (PM core)" on AMD and "P2 with on-die L2 cache" on Intel,
    and allows Windows to use all CPUs as well as fixing sysenter.

    [1] https://bugzilla.redhat.com/show_bug.cgi?id=508623

    Cc: Andrea Arcangeli <aarca...@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Reviewed-by: Igor Mammedov <imamm...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit 6b11322e0f724eb0649fdc324a44288b783023ad
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Mon May 27 17:23:55 2013 -0300

    target-i386: Set level=4 on Conroe/Penryn/Nehalem

    The CPUID level values on Conroe, Penryn, and Nehalem are too low.
    This causes at least one known problem: the -smp "threads" option
    doesn't work as expected if level is < 4, because thread count
    information is provided to the guest on CPUID[EAX=4,ECX=2].EAX.

    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Reviewed-by: Igor Mammedov <imamm...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit ffce9ebbb69363dfe7605585cdad58ea3847edf4
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Mon May 27 17:23:54 2013 -0300

    target-i386: Update model values on Conroe/Penryn/Nehalem CPU models

    The CPUID model values on Conroe, Penryn, and Nehalem are too
    conservative and don't reflect the values found on real Conroe,
    Penryn, and Nehalem CPUs.

    This causes at least one known problem: Windows XP disables sysenter
    when (family == 6 && model <= 2), but Skype tries to use the
    sysenter instruction anyway because it is reported as available on
    CPUID, making it crash.

    This patch sets appropriate model values that correspond to real
    Conroe, Penryn, and Nehalem CPUs.

    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Reviewed-by: Igor Mammedov <imamm...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit b2a856d99281f2fee60a4313d204205bcd2c4269
Author: Andreas Färber <afaer...@suse.de>
Date: Wed May 1 17:30:51 2013 +0200

    target-i386: Change CPUID model of 486 to 8

    This changes the model number of the 486 to 8 (DX4), which matches
    the feature set presented and actually has the CPUID instruction.

    This adds a compatibility property to keep model=0 on pc-*-1.4 and
    older.

    Signed-off-by: H. Peter Anvin <h...@zytor.com>
    [AF: Add compat_props entry]
    Tested-by: Eduardo Habkost <ehabk...@redhat.com>
    Reviewed-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

KVM-specific changes:

commit 75d373ef9729bd22fbc46bfd8dcd158cbf6d9777
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Fri Oct 3 16:39:51 2014 -0300

    target-i386: Disable SVM by default in KVM mode

    Make SVM be disabled by default on all CPU models when in KVM mode.
    Nested SVM is enabled by default in the KVM kernel module, but it is
    probably less stable than nested VMX (which is already disabled by
    default).

    Add a new compat function, x86_cpu_compat_kvm_no_autodisable(), to
    keep compatibility on previous machine-types.

    Suggested-by: Paolo Bonzini <pbonz...@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit 864867b91b48d38e2bfc7b225197901e6f7d8216
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Fri Oct 3 16:39:48 2014 -0300

    target-i386: Disable CPUID_ACPI by default in KVM mode

    KVM never supported the CPUID_ACPI flag, so it doesn't make sense to
    have it enabled by default when KVM is enabled.

    The motivation here is exactly the same we had for the MONITOR flag
    (disabled by commit 136a7e9a85d7047461f8153f7d12c514a3d68f69). And
    like in the MONITOR flag case, we don't need machine-type compat
    code because it is currently impossible to run a KVM VM with the
    ACPI flag set.

    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit 136a7e9a85d7047461f8153f7d12c514a3d68f69
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Wed Apr 30 13:48:28 2014 -0300

    target-i386: kvm: Don't enable MONITOR by default on any CPU model

    KVM never supported the MONITOR flag, so it doesn't make sense to
    have it enabled by default when KVM is enabled.

    The rationale here is similar to the cases where it makes sense to
    have a feature enabled by default on all CPU models when on KVM mode
    (e.g. x2apic). In this case we are having a feature disabled by
    default for the same reasons.

    In this case we don't need machine-type compat code because it is
    currently impossible to run a KVM VM with the MONITOR flag set.

    Reviewed-by: Paolo Bonzini <pbonz...@redhat.com>
    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit ef02ef5f4536dba090b12360a6c862ef0e57e3bc
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Wed Feb 19 11:58:12 2014 -0300

    target-i386: Enable x2apic by default on KVM

    When on KVM mode, enable x2apic by default on all CPU models.

    Normally we try to keep the CPU model definitions as close to the
    real CPUs as possible, but x2apic can be emulated by KVM without
    host CPU support for x2apic, and it improves performance by reducing
    APIC access overhead.

    x2apic emulation has been available on KVM since 2009 (Linux
    2.6.32-rc1); there's no reason for not enabling x2apic by default
    when running KVM.

    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Acked-by: Michael S. Tsirkin <m...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit 6a4784ce6b95b013a13504ead9ab62975faf6eff
Author: Eduardo Habkost <ehabk...@redhat.com>
Date: Mon Jan 7 16:20:44 2013 -0200

    target-i386: Disable kvm_mmu by default

    KVM_CAP_PV_MMU capability reporting was removed from the kernel
    since v2.6.33 (see commit a68a6a7282373), and was completely removed
    from the kernel since v3.3 (see commit fb92045843). It doesn't make
    sense to keep it enabled by default, as it would cause unnecessary
    hassle when using the "enforce" flag.

    This disables kvm_mmu on all machine-types. With this fix, the
    possible scenarios when migrating from QEMU <= 1.3 to QEMU 1.4 are:

    ------------+----------+----------------------------------------------------
     src kernel | dst kern.| Result
    ------------+----------+----------------------------------------------------
     >= 2.6.33  | any      | kvm_mmu was already disabled and will stay disabled
     <= 2.6.32  | >= 3.3   | correct live migration is impossible
     <= 2.6.32  | <= 3.2   | kvm_mmu will be disabled on next guest reboot *
    ------------+----------+----------------------------------------------------

    * If they are running kernel <= 2.6.32 and want kvm_mmu to be kept
      enabled on guest reboot, they can explicitly add +kvm_mmu to the
      QEMU command-line. Using 2.6.33 and higher, it is not possible to
      enable kvm_mmu explicitly anymore.

    Signed-off-by: Eduardo Habkost <ehabk...@redhat.com>
    Reviewed-by: Gleb Natapov <g...@redhat.com>
    Signed-off-by: Andreas Färber <afaer...@suse.de>

commit dc59944bc9a5ad784572eea57610de60e4a2f4e5
Author: Michael S. Tsirkin <m...@redhat.com>
Date: Thu Oct 18 00:15:48 2012 +0200

    qemu: enable PV EOI for qemu 1.3

    Enable KVM PV EOI by default. You can still disable it with the
    -kvm_pv_eoi cpu flag. To avoid breaking cross-version migration,
    enable it only for the qemu 1.3 (or, in the future, newer) machine
    type.

    Signed-off-by: Michael S. Tsirkin <m...@redhat.com>

commit ef8621b1a3b199c348606c0a11a77d8e8bf135f1
Author: Anthony Liguori <aligu...@us.ibm.com>
Date: Wed Aug 29 09:32:41 2012 -0500

    target-i386: disable pv eoi to fix migration across QEMU versions

    We have a problem with how we handle migration with KVM paravirt
    features. We unconditionally enable paravirt features regardless of
    whether we know how to migrate them. We also don't tie paravirt
    features to specific machine types, so an old QEMU on a new kernel
    would expose features that never existed.

    The 1.2 cycle is over and, as things stand, migration is broken.
    Michael has another series that adds support for migrating PV EOI
    and attempts to make it work correctly for different machine types.

    After speaking with Michael on IRC, we agreed to take this patch
    plus 1 & 4 from his series. This makes sure QEMU can migrate PV EOI
    if it's enabled, but does not enable it by default. This also means
    that we won't unconditionally enable new features for guests,
    future-proofing us from this happening again.

    Signed-off-by: Anthony Liguori <aligu...@us.ibm.com>

-- 
Eduardo