Component: Xen Hypervisor (x86 / time.c)
Versions affected: Potentially 4.17-4.21 and unstable (tested on 4.18 with high vCPU density)
Description:
In high-load scenarios (24+ cores, heavy Dom0 load, and frequent VM pauses via DRAKVUF/VMI), Windows guests experience Desktop Window Manager (dwm.exe) crashes with error 0x8898009b. The root cause is an integer overflow in the time scaling logic, triggered when time calibration occurs at the same time as a snapshot reversion or RDTSC(P) instruction emulation.
Technical Analysis:
The get_s_time_fixed function (xen/arch/x86/time.c) accepts at_tsc as an argument. If at_tsc is less than local_tsc, the subtraction produces a negative delta, which scale_delta handles incorrectly: the result is an extremely large number instead of a backward offset. The same can happen when at_tsc is zero: the function then reads the current ticks via rdtsc_ordered, and if time calibration runs after that read, local_tsc may become larger than the tick value just obtained.

This is guaranteed to be reproducible in hvm_load_cpu_ctxt (xen/arch/x86/hvm/hvm.c), because sync_tsc is less than local_tsc after time calibration. It can also potentially occur when RDTSC(P) emulation runs concurrently with time_calibration_rendezvous_tail (xen/arch/x86/time.c). Windows DWM, which is sensitive to QueryPerformanceCounter jumps, fails catastrophically when it receives an essentially infinite timestamp delta.
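The wraparound can be illustrated with a small standalone model. This is only a sketch, not the Xen source: the struct, the function names ending in _model, and the 32.32 fixed-point multiplier (0x80000000, i.e. 0.5 ns per tick, as for a 2 GHz TSC) are assumptions made for illustration, and the real scale_delta also applies a shift that is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-CPU time stamp. */
struct stamp_model {
    uint64_t local_tsc;    /* TSC value recorded at the last calibration */
    uint64_t local_stime;  /* system time (ns) recorded at the last calibration */
    uint32_t mul_frac;     /* 32.32 fixed-point ns-per-tick multiplier */
};

/* ns = (delta * mul_frac) >> 32, computed without a 128-bit type. */
static uint64_t scale_delta_model(uint64_t delta, uint32_t mul_frac)
{
    uint64_t hi = delta >> 32, lo = delta & 0xffffffffu;

    return hi * mul_frac + ((lo * mul_frac) >> 32);
}

/* Mirrors the unsigned subtraction described above. */
static uint64_t get_s_time_model(const struct stamp_model *t, uint64_t at_tsc)
{
    uint64_t delta = at_tsc - t->local_tsc;  /* wraps if at_tsc < local_tsc */

    return t->local_stime + scale_delta_model(delta, t->mul_frac);
}

static void wraparound_example(void)
{
    /* Calibration stamp taken at TSC 1100, system time 1,000,000 ns. */
    struct stamp_model t = { 1100, 1000000, 0x80000000u };

    /* at_tsc is 100 ticks (50 ns) *before* the stamp. */
    uint64_t now = get_s_time_model(&t, 1000);

    /* Expected 999,950 ns; the model returns 2^63 + 999,950 ns instead. */
    assert(now == 1000000 + (UINT64_C(1) << 63) - 50);
}
```

In this model a 100-tick backward offset turns into a forward jump of roughly 2^63 ns, which matches the failure mode described above: a tiny negative delta is scaled as a huge positive one.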

Steps to Reproduce:

      Set up a host with a high core count (e.g., 24+ cores).

      Run a high density of Windows 10 DomUs (20 domains with 4 vCPUs each).

      Apply heavy load on Dom0 (e.g., DRAKVUF monitoring).

      Frequently pause/resume or revert snapshots of the DomUs.

      Observe dwm.exe crashes in the guests with MILERR_QPC_TIME_WENT_BACKWARD (0x8898009b).

Currently, the lack of sign-awareness in the delta scaling path allows a nanosecond-scale race condition to turn into a multi-millennium time jump.
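One way to make the scaling path sign-aware is to scale the magnitude of the delta and then add or subtract it. The sketch below is an assumption about what such handling could look like, not a proposed patch: the function names are hypothetical, the multiplier is the same simplified 32.32 fixed-point value (0.5 ns per tick) used purely for illustration, and whether a backward offset should be subtracted or clamped is a decision for the maintainers.

```c
#include <assert.h>
#include <stdint.h>

/* ns = (ticks * mul_frac) >> 32; hypothetical simplified scaling helper. */
static uint64_t scale_ticks_to_ns(uint64_t ticks, uint32_t mul_frac)
{
    uint64_t hi = ticks >> 32, lo = ticks & 0xffffffffu;

    return hi * mul_frac + ((lo * mul_frac) >> 32);
}

/*
 * Sign-aware variant: compare first, scale the absolute difference,
 * then move forward or backward from the calibration stamp.
 */
static uint64_t stime_at_tsc_signed(uint64_t local_stime, uint64_t local_tsc,
                                    uint64_t at_tsc, uint32_t mul_frac)
{
    if (at_tsc >= local_tsc)
        return local_stime + scale_ticks_to_ns(at_tsc - local_tsc, mul_frac);

    return local_stime - scale_ticks_to_ns(local_tsc - at_tsc, mul_frac);
}

static void signed_example(void)
{
    /* 100 ticks before the stamp: a 50 ns backward offset, not a huge jump. */
    assert(stime_at_tsc_signed(1000000, 1100, 1000, 0x80000000u) == 999950);

    /* 200 ticks after the stamp: the usual 100 ns forward offset. */
    assert(stime_at_tsc_signed(1000000, 1100, 1300, 0x80000000u) == 1000100);
}
```

With this shape, a TSC value slightly older than the calibration stamp yields a timestamp slightly earlier than local_stime, so the error stays on the nanosecond scale of the race itself.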

Environment:

      CPU: 24 cores (Intel Xeon with Invariant TSC)

      Dom0: High vCPU count (24)

      Feature: tsc_mode="always_emulate", timer_mode="no_delay_for_missed_ticks"

      Guest: Windows 10/11 with TSC as the time source
