On Tue, Apr 07, 2026, Sean Christopherson wrote:
> +Michael

Let's try that again.  Email address #1 bounced.

> On Tue, Apr 07, 2026, Vitaly Kuznetsov wrote:
> > Thomas Lefebvre <[email protected]> writes:
> >
> > > Under Hyper-V, raw RDTSC values are not consistent across vCPUs.
> > > The hypervisor corrects them only through the TSC page scale/offset.
> > > If pvclock_update_vm_gtod_copy() runs on CPU 0 and __get_kvmclock()
> > > later runs on CPU 1 where the raw TSC is lower, the unsigned
> > > subtraction wraps.
> > >
> >
> > According to the TLFS, reference TSC page is partition wide:
> >
> > "The hypervisor provides a partition-wide virtual reference TSC page
> > which is overlaid on the partition’s GPA space. A partition’s reference
> > time stamp counter page is accessed through the Reference TSC MSR."
> >
> > so if as you say RAW rdtsc value is inconsistent across vCPUs, I can
> > hardly see how we can use this time source at all, even without
> > KVM. scale/offset are the same for all vCPUs.
> >
> > I think the fix here is to avoid setting up Hyper-V TSC page clocksource
> > in L1. Unfortunately, with unsynchronized TSCs this will leave us the
> > only choice for a sane clocksource: raw HV_X64_MSR_TIME_REF_COUNT MSR
> > reads.
>
> This feels like either a Hyper-V bug or a Linux-as-a-guest bug.  For
> "Reference Counter"[1]:
>
>   The hypervisor maintains a per-partition reference time counter. It has the
>   characteristic that successive accesses to it return strictly monotonically
>   increasing (time) values as seen by any and all virtual processors of a
>   partition. Furthermore, the reference counter is rate constant and unaffected
>   by processor or bus speed transitions or deep processor power savings states. A
>   partition’s reference time counter is initialized to zero when the partition is
>   created. The reference counter for all partitions count at the same rate, but
>   at any time, their absolute values will typically differ because partitions
>   will have different creation times.
>
>   The reference counter continues to count up as long as at least one virtual
>   processor is not explicitly suspended.
>
> And then "Partition Reference Time Enlightenment"[2]:
>
>   The partition reference time enlightenment presents a reference time source to
>   a partition which does not require an intercept into the hypervisor. This
>   enlightenment is available only when the underlying platform provides support
>   of an invariant processor Time Stamp Counter (TSC), or iTSC. In such platforms,
>   the processor TSC frequency remains constant irrespective of changes in the
>   processor’s clock frequency due to the use of power management states such as
>   ACPI processor performance states, processor idle sleep states (ACPI C-states),
>   etc.
>
>   The partition reference time enlightenment uses a virtual TSC value, an offset
>   and a multiplier to enable a guest partition to compute the normalized
>   reference time since partition creation, in 100nS units. The mechanism also
>   allows a guest partition to atomically compute the reference time when the
>   guest partition is migrated to a platform with a different TSC rate, and
>   provides a fallback mechanism to support migration to platforms without the
>   constant rate TSC feature.
>
> My read of "Partition Reference Time Enlightenment" is that it should only be
> advertised if the TSC is synchronized and constant.  I can't figure out where
> that feature is actually advertised though, because IIUC it's not the same as
> HV_ACCESS_TSC_INVARIANT, which says that the virtual TSC is guaranteed to be
> invariant even across live migration.  And it's not HV_MSR_REFERENCE_TSC_AVAILABLE,
> because I'm pretty sure that just says HV_MSR_REFERENCE_TSC is available.
>
> Michael, help?
>
> [1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/timers#reference-counter
> [2] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/timers#partition-reference-time-enlightenment
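
For anyone following along, the failure mode described at the top of the
thread can be sketched in a few lines of C.  This is only an illustration
under stated assumptions, not the KVM or Hyper-V clocksource code:
tsc_delta() and hv_ref_time() are hypothetical helper names, the sample
values are made up, and the conversion is the TLFS formula
ReferenceTime = ((VirtualTsc * TscScale) >> 64) + TscOffset.  Note that
unsigned __int128 is a GCC/Clang extension.

```c
#include <stdint.h>

/*
 * Hypothetical helper: elapsed cycles computed with unsigned arithmetic.
 * If 'now' was read on a vCPU whose raw TSC is behind the vCPU that
 * captured 'base', the result wraps to a huge value instead of going
 * negative.
 */
static uint64_t tsc_delta(uint64_t now, uint64_t base)
{
	return now - base;
}

/*
 * Hypothetical helper: TLFS reference time in 100ns units,
 * ((tsc * scale) >> 64) + offset.  The scale and offset come from the
 * partition-wide reference TSC page, so every vCPU applies the same pair.
 */
static uint64_t hv_ref_time(uint64_t tsc, uint64_t scale, int64_t offset)
{
	uint64_t scaled = (uint64_t)(((unsigned __int128)tsc * scale) >> 64);

	return scaled + (uint64_t)offset;
}
```

With base = 1000 captured on one vCPU and now = 400 read on another,
tsc_delta() returns 2^64 - 600.  And because scale/offset are shared, the
same two raw values map to different reference times (505 vs. 205 with
scale = 2^63 and offset = 5), i.e. a partition-wide scale/offset cannot hide
per-vCPU skew in the raw TSC.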

