On Fri, Aug 14, 2015 at 09:18:01AM +0000, Li, Liang Z wrote: > > Subject: Re: [Qemu-devel] about the patch kvmclock Ensure proper env->tsc > > value for kvmclock_current_nsec calculation > > > > > Thanks for your reply, I have read the thread in your email, what's > > > the > > mean of 'switching from old to new disk', could give a detail description? > > > > The test case was like that (using libvirt): > > > > 1. Get VM running (linux, using kvmclock), 2. Use blockcopy to copy disk > > data > > from one location to another, 3. Issue blockjob --pivot (to finish > > mirroring) > > > > From what I remember, at point 3, VM is momentarily paused and resumed, > > so kvm state change handler is called twice. Without this patch, the VM > > hanged because its time goes backwards (or qemu crashed if assertion was > > not compiled out). > > > > -- > > mg > > So, the problem is cause by stop_vm(RUN_STATE_PAUSED), in this case the > env->tsc is not updated, which lead to the issue. > Is that right? If the cpu_clean_all_dirty() is needed just for the APIC > status reason, I think we can do the cpu_synchronize_all_states() in > do_vm_stop > and after vm_state_notify() when the RUN_STATE_PAUSED is hit, at this point > all the device models is stopped, there is no outdated APIC status. > > I want to write a patch to fix this issue in another way, could help to > verify it in you environment, very appreciate if you could. > > Thanks. > > Liang
Liang, The problem is this: Think two hosts, source and destination. The code is running in the destination host. The clock which the source host maintains (CLOCK_MONOTONIC), which is what KVM_GET_CLOCK ioctl uses, is corrected by NTP. The clock which the guest uses can be based on TSC, which can drift relative to UTC. So at the moment the guest is stopped at the source, KVM_GET_CLOCK clock (s->clock in kvmclock.c) might be behind of TSC based clock (which means that setting that clock value at the destination would case the guest to see time going backwards which is fatal). What the code does is to, on the destination host, read the clock value from memory rather than to use what has been retrieved in the source host with KVM_GET_CLOCK. For that, it needs uptodated env->tsc value relative to the destination host (TSC+TSC_OFFSET of destination host). Yes i can test your patch.