On 10/25/2017 08:13 PM, Jason J. Herne wrote: > On 10/20/2017 10:54 AM, Christian Borntraeger wrote: >> Starting a guest with >> <os> >> <type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type> >> </os> >> <cpu mode='host-model'/> >> >> on an IBM z14 results in >> >> "qemu-system-s390x: Some features requested in the CPU model are not >> available in the configuration: gs" >> >> This is because guarded storage is fenced for compat machines that did not >> have >> guarded storage support, but libvirt expands the cpu model according to the >> latest available machine. >> >> While this prevents future migration abort (by not starting the guest at >> all), >> not being able to start a "host-model" guest is very much unexpected. As it >> turns out, even if we would modify libvirt to not expand the cpu model to >> contain "gs" for compat machines, it cannot guarantee that a migration will >> succeed. For example if the kernel changes its features (or the user has >> nested=1 on one host but not on the other) the migration will fail >> nevertheless. So instead of fencing "gs" for machines <= 2.9 lets allow it >> for >> all machine types that support the CPU model. This will make "host-model" >> runnable all the time, while relying on the CPU model to reject invalid >> migration attempts. > ... >> - if (gs_allowed()) { >> + if (cpu_model_allowed()) { >> if (kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) == 0) { >> cap_gs = 1;
@Jason Hi Jason, I don't have access to a z14 at the moment, and since you do, I would like to try out something. I will first describe my concern, and then the test scenario. The last line above, cap_gs = 1, has the side effect of returning true ever after. int kvm_s390_get_gs(void) { return cap_gs; } Now considering static bool gscb_needed(void *opaque) { return kvm_s390_get_gs(); } const VMStateDescription vmstate_gscb = { .name = "cpu/gscb", .version_id = 1, .minimum_version_id = 1, .needed = gscb_needed, .fields = (VMStateField[]) { VMSTATE_UINT64_ARRAY(env.gscb, S390CPU, 4), VMSTATE_END_OF_LIST() } }; const VMStateDescription vmstate_s390_cpu = { .name = "cpu", .post_load = cpu_post_load, .pre_save = cpu_pre_save, .version_id = 4, .minimum_version_id = 3, .fields = (VMStateField[]) { VMSTATE_UINT64_ARRAY(env.regs, S390CPU, 16), VMSTATE_UINT64(env.psw.mask, S390CPU), VMSTATE_UINT64(env.psw.addr, S390CPU), VMSTATE_UINT64(env.psa, S390CPU), VMSTATE_UINT32(env.todpr, S390CPU), VMSTATE_UINT64(env.pfault_token, S390CPU), VMSTATE_UINT64(env.pfault_compare, S390CPU), VMSTATE_UINT64(env.pfault_select, S390CPU), VMSTATE_UINT64(env.cputm, S390CPU), VMSTATE_UINT64(env.ckc, S390CPU), VMSTATE_UINT64(env.gbea, S390CPU), VMSTATE_UINT64(env.pp, S390CPU), VMSTATE_UINT32_ARRAY(env.aregs, S390CPU, 16), VMSTATE_UINT64_ARRAY(env.cregs, S390CPU, 16), VMSTATE_UINT8(env.cpu_state, S390CPU), VMSTATE_UINT8(env.sigp_order, S390CPU), VMSTATE_UINT32_V(irqstate_saved_size, S390CPU, 4), VMSTATE_VBUFFER_UINT32(irqstate, S390CPU, 4, NULL, irqstate_saved_size), VMSTATE_END_OF_LIST() }, .subsections = (const VMStateDescription*[]) { &vmstate_fpu, &vmstate_vregs, &vmstate_riccb, &vmstate_exval, &vmstate_gscb, NULL }, }; I would expect the vmstate_gscb subsection being sent, even if gs is disabled via cpu-model if kernel and possibly machine has gs support (and qemu has cpu-models). So the test scenario I want you to play trough is the following. Take the latest-greatest qemu with this patch applied. Make sure gs works (is provided to the guest) with a 2.9 machine version, and a fully specified cpu-model. Now disable gs explicitly. Try to migrate this to another machine having a 2.9 binary. I expect the migration failing because, the subsection is going to be sent by the latest-greatest binary, but is unknown to the 2.9 binary. Notice this is despite the fact that gs is explicitly disabled. Now that I think about it, maybe the 2.9 binary is going to reject the explicit gs flag altogether, because it's unknown. Isn't this a problem? I'm afraid like this the only migration-safe variant is -base, but that would essentially make adding features incrementally impossible. But I hypothesize trying to migrate with z13 or even z13-base would also trigger the unknown subsection problem. Unfortunately I can't test this because my kernel never makes kvm_vm_enable_cap(s, KVM_CAP_S390_GS, 0) return 0, because I lack HW support in the host. Regards, Halil > > Ok, honestly, I dislike this idea because it means we are effectively lying > now. We will happily accept a +gs cpu model with a 2.9 compat machine when > the point of compat machines is to mimick the older version of Qemu which > does not support GS. Yes, model checking will prevent the worst side > effects, namely, exploding migrations. > > I'd far prefer a solution that makes host-model dependent on the machine > type. But I understand some of the backlash on that idea. Still, it seems > like the cleanest approach even if it will be more work. > > With all of that said, I know we want this fixed and your patch seems to fix > the problem. So if you need an R-b... > > Reviewed-by: Jason J. Herne <jjhe...@linux.vnet.ibm.com> > > Can we have a new tag? :-D > Reviewed-by-with-reservations: Jason J. Herne <jjhe...@linux.vnet.ibm.com> >