> On 28-Aug-2018, at 10:57 PM, Dr. David Alan Gilbert <dgilb...@redhat.com> wrote:
>
> (Cc'ing in Eric, Drew, and Peter for ARM stuff)
>
Thanks,
> * Jaggi, Manish (manish.ja...@cavium.com) wrote:
>>
>>
>>> On 23-Aug-2018, at 7:59 PM, Juan Quintela <quint...@redhat.com> wrote:
>>>
>>> "Jaggi, Manish" <manish.ja...@cavium.com> wrote:
>>>> Hi,
>>>
>>> Hi
>>>
>>> [Note that I was confused about what do you mean with problems with
>>> processorID. There is no processorID on the migration stream, so I
>>> didn't understand what you were talking about. Until I realized that
>>> you were trying to migrate from different cpu types]
>>>
>>>> Posting again with my cavium ID and CCing relevant folks
>>>
>>> It will be good to give What architecture are we talking about? MIPS,
>>> ARM, anything else?
>>>
>> arm64
>>
>>> Why? Because we do this continuously on x86_64 world. How do we do
>>> this? We emulate the _processor_ capabilities, so "in general" you can
>>> always migrate from a processor to another with a superset of the
>>> features. If you look at the output of:
>>>
>>> qemu-system-x86_64 -cpu ?
>>>
>>> You can see that we have lots of cpu types that we emulate and cpuid
>>> (features really). Migration intel<->amd is tricky. But from "intel
>>> with less features" to "intel with more features" (or the same with AMD)
>>> it is a common thing to do. Once told that, it is a lot of work, simple
>>> things like that processors run at different clock speeds imply that you
>>> need to be careful during migration with timers and anything that
>>> depends on frequencies.
>>>
>>> I don't know enough about other architectures to know how to do it, or
>>> how feasible is.
>>
>> For arm64 qemu/kvm throws an error when processorID does not match.
>>>
>>>> Live Migration between machines with different processorIds
>>>>
>>>> VM Migration between machines with different processorId values throws
>>>> error in qemu/kvm. Though this check is appropriate but is overkill where
>>>> two machines are of same SoC/arch family and have same core/gic but
>>>> delta could be in other parts of SoC which have no effect on VM
>>>> operation.
>>>
>>> Then you need to do the whole process of:
>>>
>>> Lets call both processors A1 and A2. You need to do the whole process
>>> of:
>>>
>>> a- defining cpu A1
>>> b- make sure that when you run qemu/kvm on processor A2, the
>>>    features/behaviours that the guest sees. This is not trivial at
>>>    all.
>>> c- when migration comes, you can see that you need to adjust to whatever
>>>    is the architecture of the destination.
>>>
>>>> There could be two ways to address this issue by ignoring the
>>>> comparison of processorIDs and so need feedback from the
>>>> community on this.
>>>>
>>>> a) Maintain a whitelist in qemu:
>>>>
>>>> This will be a set of all processorIds which are compatible and migration
>>>> can happen between any of the machines with the Ids from this set. This
>>>> set can be statically built within qemu binary.
>>>
>>> In general, I prefer whitelists over blacklists.
>>>
>>>> b) Provide an extra option with migrate command
>>>>
>>>> migrate tcp:<ip>:<port>:<dest_processor_id>
>>>>
>>>> This is to fake the src_processor_id as dest_processor_id, so the qemu
>>>> running on destination machine will not complain. The overhead with this
>>>> approach is that the destination machines Id need to be known beforehand.
>>>
>>> Please, don't even think about this:
>>> a- migration commands are architecture agnostic
>>> b- in general it is _much_, _much_ easier to fix things on destination
>>>    that on source.
>>>
>>>> If there is some better way… please suggest.
>>>
>>> Look at how it is done on x86_64. But be aware that "doing it right"
>>> takes a lot of work. To give you one idea:
>>> - upstream, i.e. qemu, "guarantee" that migration of:
>>>   qemu-X -M machine-type-X -> qemu-Y -M machine-type-X
>>>   works when X < Y.
>>>
>>> - downstream (i.e. redhat on my case, but I am sure that others also
>>>   "suffer" this) allow also:
>>>
>>>   qemu-Y -M machine-type-X -> qemu-X -M machine-type-X (Y > X)
>>>
>>>   in general it is a very complicated problem, so we limit _what_ you
>>>   can do. Basically we only support our machine-types, do a lot of
>>>   testing, and are very careful when we add new features. I.e. be
>>>   prepared to do a lot of testing and a lot of fixing.
>>
>> At this point I am targeting a simpler case where Machine A1 and A2 has a
>> core from the same SoC family.
>> For example Cavium ThunderX2 Core incremental versions which has identical
>> core and GIC and may have some errata fixes.
>> In that case Y=X since migration only takes care of PV devices.
>>
>> In that case a whitelist could be an easier option?
>>
>> How to provide the whitelist to qemu in a platform agnostic way?
>> - I will look into intel model as you have suggested, does intel keeps a
>> whitelist or masks off some bits of processorID
>> How does intel does it
>
> Purely based on features rather than IDs.
>
> If it's an Intel processor and it's got that set of CPU features
> migration to it will normally work.
> (There are some gotchas that we hit from time to time, but
> the basic idea holds)
>
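For reference, the feature-based rule Dave and Juan describe for x86 (the destination must offer at least the features the guest was started with) can be sketched as a simple subset test. This is only an illustration under stated assumptions: the `FEAT_*` bits and `dest_can_host_guest()` are hypothetical names made up for this example, not real CPUID flags or QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only (not real CPUID flags). */
#define FEAT_A (UINT64_C(1) << 0)
#define FEAT_B (UINT64_C(1) << 1)
#define FEAT_C (UINT64_C(1) << 2)

/*
 * Hypothetical check: the guest can move to the destination only if every
 * feature exposed to the guest is also available on the destination, i.e.
 * the guest's feature set is a subset of the destination's.
 */
static bool dest_can_host_guest(uint64_t guest_features, uint64_t dest_features)
{
    /* Any bit set in the guest but missing on the destination blocks it. */
    return (guest_features & ~dest_features) == 0;
}
```

With these placeholder bits, a guest exposing `FEAT_A | FEAT_B` would pass against a destination offering `FEAT_A | FEAT_B | FEAT_C`, while a guest exposing `FEAT_C` would be rejected by a destination that lacks it.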
Just to add what happens in the ARM64 case: qemu running on Machine A sends
cpu state information to Machine B. This state contains the MIDR value, and so
the Processor ID value is compared in KVM and not in qemu (correcting myself).

IIRC, Peter/Eric please point out if there is something incorrect in the below
flow...

(Machine B)
target/arm/machine.c: cpu_post_load()
        - updates cpu->cpreg_values[i] : which includes MIDR (processor ID register)
        - calls write_list_to_kvmstate(cpu, KVM_PUT_FULL_STATE)

target/arm/kvm.c: write_list_to_kvmstate
        - calls => kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &r);
        => and it eventually lands up IIRC in Linux code in
        => arch/arm64/kvm/sys_regs.c : set_invariant_sys_reg(u64 id, void __user *uaddr)

                /* This is what we mean by invariant: you can't change it. */
                if (r->val != val)
                        return -EINVAL;

Note: MIDR_EL1 is an invariant register.
Result: Migration fails on Machine B.

A few points:
- qemu on arm64 is invoked with -machine virt and -cpu as host. So we don't
  explicitly define which cpu.
- In case Machine A and Machine B have almost the same Core and the delta may
  not have any effect on qemu operation, migration should work by just looking
  into a whitelist. The whitelist can be given as a parameter for qemu on
  Machine B.

        qemu-system-aarch64 -whitelist <ids separated by commas>

  (This is my proposal)

- So in cpu_post_load (Machine B) qemu can look up the whitelist and replace
  the MIDR with the one at Machine B.

  Sounds good?

- Juan raised a point about clock speed. I am not sure it will have any effect
  on arm since qemu is run with the -cpu host param.
  I could be wrong here, Peter/Eric can you please correct me...

-Thanks
Manish

> Dave
>> - is providing a -mirate-compat-whitelist <file> option for arm only looks
>> good?
>> this option can be added in A1/A2 qemu command, so it would be upstream /
>> downstream agnostic.
>
>>>
>>> I am sorry to not be able to tell you that this is an easy problem.
>>>
>>> Later, Juan.
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
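
The whitelist proposal above (look up the incoming MIDR on Machine B and, if it is listed, substitute Machine B's own MIDR before the value reaches KVM) could be sketched roughly as follows. This is a standalone illustration, not QEMU code: `midr_in_list()` and `fixup_incoming_midr()` are hypothetical helpers, the whitelist is assumed to name the source IDs that Machine B is willing to accept, and any values passed in are placeholders rather than real MIDR encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if `midr` appears in the user-supplied list. */
static bool midr_in_list(const uint64_t *list, size_t n, uint64_t midr)
{
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (list[i] == midr) {
            return true;
        }
    }
    return false;
}

/*
 * Hypothetical fix-up for the destination's post-load step.
 * If the source MIDR differs from the host's but is whitelisted, return the
 * host MIDR so that the later KVM_SET_ONE_REG write matches the invariant
 * register. Otherwise return the source value unchanged, in which case the
 * kernel would still reject a mismatch with -EINVAL as described above.
 */
static uint64_t fixup_incoming_midr(const uint64_t *list, size_t n,
                                    uint64_t src_midr, uint64_t host_midr)
{
    if (src_midr != host_midr && midr_in_list(list, n, src_midr)) {
        return host_midr;
    }
    return src_midr;
}
```

Doing the substitution on the destination keeps the migrate command itself architecture agnostic, which is in line with Juan's remark that it is easier to fix things on the destination than on the source.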