On Wed, Jul 16, 2014 at 12:38:51PM +0400, Andrey Korolyov wrote: > On Wed, Jul 16, 2014 at 5:16 AM, Marcelo Tosatti <mtosa...@redhat.com> wrote: > > On Wed, Jul 16, 2014 at 03:40:47AM +0400, Andrey Korolyov wrote: > >> On Wed, Jul 16, 2014 at 2:01 AM, Paolo Bonzini <pbonz...@redhat.com> wrote: > >> > Il 15/07/2014 23:25, Andrey Korolyov ha scritto: > >> > > >> >> On Wed, Jul 16, 2014 at 1:09 AM, Marcelo Tosatti <mtosa...@redhat.com> > >> >> wrote: > >> >>> > >> >>> On Tue, Jul 15, 2014 at 06:01:08PM +0400, Andrey Korolyov wrote: > >> >>>> > >> >>>> On Tue, Jul 15, 2014 at 10:52 AM, Andrey Korolyov <and...@xdel.ru> > >> >>>> wrote: > >> >>>>> > >> >>>>> On Tue, Jul 15, 2014 at 9:03 AM, Amit Shah <amit.s...@redhat.com> > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> On (Sun) 13 Jul 2014 [16:28:56], Andrey Korolyov wrote: > >> >>>>>>> > >> >>>>>>> Hello, > >> >>>>>>> > >> >>>>>>> the issue is not specific to the iothread code because generic > >> >>>>>>> virtio-blk also hangs up: > >> >>>>>> > >> >>>>>> > >> >>>>>> Do you know which version works well? If you could bisect, that'll > >> >>>>>> help a lot. > >> >>>>>> > >> >>>>>> Thanks, > >> >>>>>> Amit > >> >>>>> > >> >>>>> > >> >>>>> Hi, > >> >>>>> > >> >>>>> 2.0 works definitely well. I`ll try to finish bisection today, though > >> >>>>> every step takes about 10 minutes to complete. > >> >>>> > >> >>>> > >> >>>> Yay! It is even outside of virtio-blk. > >> >>>> > >> >>>> commit 9b1786829aefb83f37a8f3135e3ea91c56001b56 > >> >>>> Author: Marcelo Tosatti <mtosa...@redhat.com> > >> >>>> Date: Tue Jun 3 13:34:48 2014 -0300 > >> >>>> > >> >>>> kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec > >> >>>> calculation > >> >>>> > >> >>>> Ensure proper env->tsc value for kvmclock_current_nsec > >> >>>> calculation. > >> >>>> > >> >>>> Reported-by: Marcin Gibuła <m.gib...@beyond.pl> > >> >>>> Cc: qemu-sta...@nongnu.org > >> >>>> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com> > >> >>>> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> > >> >>> > >> >>> > >> >>> Andrey, > >> >>> > >> >>> Can you please provide instructions on how to create reproducible > >> >>> environment? > >> >>> > >> >>> The following patch is equivalent to the original patch, for > >> >>> the purposes of fixing the kvmclock problem. > >> >>> > >> >>> Perhaps it becomes easier to spot the reason for the hang you are > >> >>> experiencing. > >> >>> > >> >>> > >> >>> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c > >> >>> index 272a88a..feb5fc5 100644 > >> >>> --- a/hw/i386/kvm/clock.c > >> >>> +++ b/hw/i386/kvm/clock.c > >> >>> @@ -17,7 +17,6 @@ > >> >>> #include "qemu/host-utils.h" > >> >>> #include "sysemu/sysemu.h" > >> >>> #include "sysemu/kvm.h" > >> >>> -#include "sysemu/cpus.h" > >> >>> #include "hw/sysbus.h" > >> >>> #include "hw/kvm/clock.h" > >> >>> > >> >>> @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState > >> >>> *s) > >> >>> > >> >>> cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time)); > >> >>> > >> >>> - assert(time.tsc_timestamp <= migration_tsc); > >> >>> delta = migration_tsc - time.tsc_timestamp; > >> >>> if (time.tsc_shift < 0) { > >> >>> delta >>= -time.tsc_shift; > >> >>> @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, > >> >>> int running, > >> >>> if (s->clock_valid) { > >> >>> return; > >> >>> } > >> >>> - > >> >>> - cpu_synchronize_all_states(); > >> >>> ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); > >> >>> if (ret < 0) { > >> >>> fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", > >> >>> strerror(ret)); > >> >>> diff --git a/migration.c b/migration.c > >> >>> index 8d675b3..34f2325 100644 > >> >>> --- a/migration.c > >> >>> +++ b/migration.c > >> >>> @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque) > >> >>> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); > >> >>> old_vm_running = runstate_is_running(); > >> >>> > >> >>> + cpu_synchronize_all_states(); > >> >>> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); > >> >>> if (ret >= 0) { > >> >>> qemu_file_set_rate_limit(s->file, INT64_MAX); > >> > > >> > > >> > It could also be useful to apply the above patch _and_ revert > >> > a096b3a6732f846ec57dc28b47ee9435aa0609bf, then try to reproduce. > >> > > >> > Paolo > >> > >> Yes, it solved the issue for me! (it took much time to check because > >> most of country` debian mirrors went inconsistent by some reason) > >> > >> Also trivial addition: > >> > >> diff --git a/migration.c b/migration.c > >> index 34f2325..65d1c88 100644 > >> --- a/migration.c > >> +++ b/migration.c > >> @@ -25,6 +25,7 @@ > >> #include "qemu/thread.h" > >> #include "qmp-commands.h" > >> #include "trace.h" > >> +#include "sysemu/cpus.h" > > > > And what about not reverting a096b3a6732f846ec57dc28b47ee9435aa0609bf ? > > > > That is, test with a stock qemu.git tree and the patch sent today, > > on this thread, to move cpu_synchronize_all_states ? > > > > > > The main reason for things to work for me is a revert of > 9b1786829aefb83f37a8f3135e3ea91c56001b56 on top, not adding any other > patches. I had tested two cases, with Alexander`s patch completely > reverted plus suggestion from Marcelo and only with revert 9b178682 > plug same suggestion. The difference is that the until Alexander` > patch is not reverted, live migration is always failing by the timeout > value, and when reverted migration always succeeds in 8-10 seconds. > Appropriate diffs are attached for the reference.
Andrey, Can you please apply only the following attached patch to an upstream QEMU git tree (move_synchronize_all_states.patch), plus the necessary header file corrections, and attempt to reproduce? When you reproduce, please provide a backtrace and version of the QEMU git tree, and instructions on how to reproduce: 1) full QEMU command line 2) steps to reproduce
Move cpu_synchronize_all_states to migration.c. diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c index 272a88a..feb5fc5 100644 --- a/hw/i386/kvm/clock.c +++ b/hw/i386/kvm/clock.c @@ -17,7 +17,6 @@ #include "qemu/host-utils.h" #include "sysemu/sysemu.h" #include "sysemu/kvm.h" -#include "sysemu/cpus.h" #include "hw/sysbus.h" #include "hw/kvm/clock.h" @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s) cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time)); - assert(time.tsc_timestamp <= migration_tsc); delta = migration_tsc - time.tsc_timestamp; if (time.tsc_shift < 0) { delta >>= -time.tsc_shift; @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int running, if (s->clock_valid) { return; } - - cpu_synchronize_all_states(); ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); if (ret < 0) { fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret)); diff --git a/migration.c b/migration.c index 8d675b3..34f2325 100644 --- a/migration.c +++ b/migration.c @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque) qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); old_vm_running = runstate_is_running(); + cpu_synchronize_all_states(); ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); if (ret >= 0) { qemu_file_set_rate_limit(s->file, INT64_MAX);