From: Fernando Luis Vázquez Cao <fernando...@lab.ntt.co.jp> VCPU TSC is not cleared by a warm reset (*), which leaves some types of Linux guests (non-pvops guests and those with the kernel parameter no-kvmclock set) vulnerable to the overflow in cyc2ns_offset fixed by upstream commit 9993bc635d01a6ee7f6b833b4ee65ce7c06350b1 ("sched/x86: Fix overflow in cyc2ns_offset").
To put it in a nutshell, if such a Linux guest without the patch above applied has been up more than 208 days and attempts a warm reset chances are that the newly booted kernel will panic or hang. (*) Intel Xeon E5 processors show the same broken behavior due to the errata "TSC is Not Affected by Warm Reset" (Intel® Xeon® Processor E5 Family Specification Update - August 2013): "The TSC (Time Stamp Counter MSR 10H) should be cleared on reset. Due to this erratum the TSC is not affected by warm reset." Cc: Will Auld <will.a...@intel.com> Cc: Marcelo Tosatti <mtosa...@redhat.com> Signed-off-by: Fernando Luis Vazquez Cao <ferna...@oss.ntt.co.jp> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> Signed-off-by: Fernando Luis Vázquez Cao <fernando...@lab.ntt.co.jp> --- target-i386/cpu.c | 3 +++ target-i386/kvm.c | 4 +--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 5076a94..bc4cb9d 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -2450,6 +2450,9 @@ static void x86_cpu_reset(CPUState *s) cpu_breakpoint_remove_all(env, BP_CPU); cpu_watchpoint_remove_all(env, BP_CPU); + env->tsc_adjust = 0; + env->tsc = 0; + #if !defined(CONFIG_USER_ONLY) /* We hard-wire the BSP to the first CPU. */ if (s->cpu_index == 0) { diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 312a46b..285e1a3 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1150,14 +1150,12 @@ static int kvm_put_msrs(X86CPU *cpu, int level) kvm_msr_entry_set(&msrs[n++], MSR_LSTAR, env->lstar); } #endif - if (level == KVM_PUT_FULL_STATE) { - kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc); - } /* * The following MSRs have side effects on the guest or are too heavy * for normal writeback. Limit them to reset or full state updates. */ if (level >= KVM_PUT_RESET_STATE) { + kvm_msr_entry_set(&msrs[n++], MSR_IA32_TSC, env->tsc); kvm_msr_entry_set(&msrs[n++], MSR_KVM_SYSTEM_TIME, env->system_time_msr); kvm_msr_entry_set(&msrs[n++], MSR_KVM_WALL_CLOCK, env->wall_clock_msr); -- 1.8.3.1