On 26.09.2017 14:45, Christian Borntraeger wrote: > > > On 09/26/2017 02:26 PM, David Hildenbrand wrote: >> On 22.09.2017 10:38, Christian Borntraeger wrote: >>> With newer kernels that do support the ais feature (4.13) a qemu 2.11 >>> will not only enable the ais feature for the 2.11 machine, but also >>> for a <=2.10 compat machine. As this feature is not available in >>> QEMU <=2.9 (and QEMU 2.10.1), this guest will fail to migrate >>> back to an older qemu like 2.9 with: >>> >>> _snip_ >>> error while loading state for instance 0x0 of device 's390-flic' >>> _snip_ >>> >>> making the whole compat machine dis-functional. As a permanent fix, we >>> need to fence the ais feature for machines <= 2.10 >>> >>> Due to ais being enabled on 2.10.0 (fixed in 2.10.1) this will prevent >>> migration of ais-enabled guests from 2.10.0 with >>> >>> _snip_ >>> qemu-system-s390x: Failed to load s390-flic/ais:tmp >>> qemu-system-s390x: error while loading state for instance 0x0 of device >>> 's390-flic' >>> qemu-system-s390x: load of migration failed: Function not implemented >>> _snip_ >>> >>> Signed-off-by: Christian Borntraeger <borntrae...@de.ibm.com> >>> Cc: Yi Min Zhao <zyi...@linux.vnet.ibm.com> >>> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com> >>> --- >>> hw/intc/s390_flic_kvm.c | 4 +++- >>> hw/s390x/s390-virtio-ccw.c | 10 ++++++++++ >>> include/hw/s390x/s390-virtio-ccw.h | 3 +++ >>> 3 files changed, 16 insertions(+), 1 deletion(-) >>> >>> diff --git a/hw/intc/s390_flic_kvm.c b/hw/intc/s390_flic_kvm.c >>> index a655567..2a94bfc 100644 >>> --- a/hw/intc/s390_flic_kvm.c >>> +++ b/hw/intc/s390_flic_kvm.c >>> @@ -22,6 +22,7 @@ >>> #include "hw/s390x/s390_flic.h" >>> #include "hw/s390x/adapter.h" >>> #include "hw/s390x/css.h" >>> +#include "hw/s390x/s390-virtio-ccw.h" >>> #include "trace.h" >>> >>> #define FLIC_SAVE_INITIAL_SIZE getpagesize() >>> @@ -559,7 +560,8 @@ static void kvm_s390_flic_realize(DeviceState *dev, >>> Error **errp) >>> KVM_HAS_DEVICE_ATTR, >>> test_attr); >>> /* try enable the AIS facility */ >>> test_attr.group = KVM_DEV_FLIC_AISM_ALL; >>> - if (!ioctl(flic_state->fd, KVM_HAS_DEVICE_ATTR, test_attr)) { >>> + if (ais_allowed() && >>> + !ioctl(flic_state->fd, KVM_HAS_DEVICE_ATTR, test_attr)) { >>> kvm_vm_enable_cap(kvm_state, KVM_CAP_S390_AIS, 0); >>> } >>> >> >> Wondering if this is really necessary. Shouldn't the CPU model feature >> make sure that migration works? > > older QEMUs complain like when migrating a 2.9 machine to 2.9 > > qemu-system-s390x: Failed to load s390-flic/ais:tmp > qemu-system-s390x: error while loading state for instance 0x0 of device > 's390-flic' > qemu-system-s390x: load of migration failed: Function not implemented > > e.g. when using the host model. >
Wonder when we can finally let go of these hacks. The host cpu model is not migration safe, therefore such things are expected to not work. Just think about trying to migrate from a kernel without AIS support to a kernel with AIS support. It is broken. AIS is just one feature that actually tells you that you are currently doing something evil. Other CPU features you lose on the way simply don't result in an error, but still migration could break silently afterwards, when the guest assumes it has certain CPU features. The main problem is that we have machines <= 2.7 that had no CPU model support. For these machines, migration should work just fine, as s390_has_feat(S390_FEAT_ADAPTER_INT_SUPPRESSION) will always return false. However, the guest will be presented the AIS bit, as soon as the capability is enabled (as we then don't manually set the stfle bitmap from QEMU), which is bad. So my point would be: don't turn on these new facilities if the cpu model is not allowed (<=2.7), but don't add ais_allowed() compat handling for any newer machines. If they use the host model, they have to assume that migration can break. I think using cpu_model_allowed() would be just fine to be used instead of ais_allowed(). -- Thanks, David