This is v5 of the series to clean up the KVM clock, rebased onto
tip/timers/ptp (which now includes Thomas's ktime snapshot series and
the read_snapshot patches for hyperv, kvmclock, and vmclock).

The KVM clock has historically suffered from three problems:

 1. Imprecision: get_kvmclock_ns() computed the clock from the *host*
    TSC without applying guest TSC scaling, causing systematic drift from
    the values the guest computes from its own TSC.

 2. Unnecessary discontinuities: gratuitous KVM_REQ_MASTERCLOCK_UPDATE
    requests caused the master clock reference point to be re-snapshotted,
    yanking the guest's clock due to arithmetic precision differences.

 3. No precise migration API: the existing KVM_[GS]ET_CLOCK only allows
    setting the clock at a given UTC reference time, which is necessarily
    imprecise. There was no way to preserve the exact arithmetic
    relationship between guest TSC and KVM clock across live migration.
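
For readers less familiar with the arithmetic behind problem 1, here is
a minimal userspace sketch of the two steps involved: scaling the host
TSC to a guest TSC, then applying the pvclock mul/shift conversion to
the guest TSC delta. The field names follow the pvclock ABI, but the
struct and function names (pvclock_sketch, scale_tsc_sketch,
pvclock_sketch_ns) are illustrative only and are not the code in this
series. The number of fractional bits in the scaling ratio is passed in
as a parameter because it depends on the hardware, and the sketch
assumes a compiler providing unsigned __int128 (GCC or Clang on x86-64).

```c
#include <stdint.h>

/* Illustrative subset of the pvclock ABI fields; the struct name is made up. */
struct pvclock_sketch {
	uint64_t tsc_timestamp;     /* guest TSC at the reference point */
	uint64_t system_time;       /* KVM clock in ns at the reference point */
	uint32_t tsc_to_system_mul; /* 32-bit fractional multiplier */
	int8_t   tsc_shift;         /* shift applied to the TSC delta first */
};

/* Scale a host TSC value by a fixed-point ratio with frac_bits fraction bits. */
static uint64_t scale_tsc_sketch(uint64_t host_tsc, uint64_t ratio,
				 unsigned int frac_bits)
{
	return (uint64_t)(((unsigned __int128)host_tsc * ratio) >> frac_bits);
}

/* Compute the KVM clock the way a guest does, from its own (scaled) TSC. */
static uint64_t pvclock_sketch_ns(const struct pvclock_sketch *pv,
				  uint64_t guest_tsc)
{
	uint64_t delta = guest_tsc - pv->tsc_timestamp;

	if (pv->tsc_shift < 0)
		delta >>= -pv->tsc_shift;
	else
		delta <<= pv->tsc_shift;

	/* (delta * mul) >> 32, using a 128-bit intermediate to avoid overflow */
	return pv->system_time +
	       (uint64_t)(((unsigned __int128)delta * pv->tsc_to_system_mul) >> 32);
}
```

Each step truncates, so a host-side calculation that skips the scaling
step and converts the host TSC directly will not in general agree with
what the guest computes from its scaled TSC.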

This series addresses all three, and adds new APIs for precise clock
migration and TSC frequency reporting. As an added bonus, it now rips
out the whole pvclock_gtod_data hack which was shadowing the kernel's
timekeeping, and uses ktime snapshots as $DEITY (well, Thomas) intended.

Changes since v4:
 - Rebased onto tip/timers/ptp (includes ktime snapshot infrastructure)
 - Dropped "WARN if kvm_get_walltime_and_clockread() fails" — the WARN
   was spurious during clocksource transitions
 - Dropped guest-side "Obtain TSC frequency from CPUID" patches (adopted
   by Sean for a separate series)
 - Dropped KVM_VCPU_TSC_EFFECTIVE_FREQ
 - Fixed false re-enabling of master clock when a single vCPU syncs
   multiple times at a mismatched frequency: introduced per-vCPU
   cur_tsc_freq_generation counter so each vCPU is counted exactly once
 - Unified nr_vcpus_matched_tsc and nr_vcpus_matched_freq to use the
   same counting convention (1-based, >= online_vcpus threshold)
 - "Avoid gratuitous global clock updates": kept global update in
   non-master-clock mode on vCPU load (CLOCK_MONOTONIC_RAW means there is
   no NTP drift, but this preserves the existing safety); only optimize
   master clock
 - "Xen runstate negative time": refined to update state but not account
   time on backwards clock, always update last_steal and guest shared page
 - Added "Activate master clock immediately on vCPU creation" to avoid
   unnecessary non-master-clock window during VM setup
 - New final patches: use ktime_get_snapshot_id() for master clock
   reference, then remove pvclock_gtod_data entirely (replaced by direct
   ktime_get_raw() + offs_boot computation)
 - Added masterclock_offset_test selftest (verifies kvmclock consistency
   across vCPUs with different TSC offsets)
 - Added xen_cpuid_timing_test selftest
 - Added pvclock_migration_test selftest
 - Addressed AI reviewer (Sashiko) feedback throughout:
   - get_kvmclock(): goto fallback on clock read failure instead of
     using uninitialized data; single #ifdef CONFIG_X86_64 block
   - kvm_synchronize_tsc(): changed ns to s64 to match function
     signature; moved time reads inside tsc_write_lock
   - Kill last_tsc fields: use kvm_scale_tsc() subtraction for
     backwards TSC instead of zeroing cur_tsc_write
   - KVM_[GS]ET_CLOCK_GUEST: validate padding fields, bounds-check
     tsc_shift
   - pvclock selftest: seqcount loop for torn-read safety, per-vCPU
     pvclock addresses, graceful skip when caps unavailable
   - KVM_VCPU_TSC_SCALE: return -ENXIO when !has_tsc_control
   - UAPI pvclock-abi: added -D__KERNEL__ to xen-hypercalls.sh
   - VMX: also clear SECONDARY_EXEC_TSC_SCALING from vmcs_config
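
As background for the "seqcount loop for torn-read safety" item above,
the general pattern is sketched below: read the version, copy the
fields, then re-read the version and retry if it changed or was odd
(odd meaning an update was in progress). This is a simplified userspace
illustration, not the selftest code; the struct and function names
(pvclock_snap, pvclock_read_stable) are hypothetical, and the compiler
fences stand in for whatever barriers the real code uses.

```c
#include <stdint.h>

/* Hypothetical, reduced view of a pvclock page: version plus two fields. */
struct pvclock_snap {
	uint32_t version;
	uint64_t tsc_timestamp;
	uint64_t system_time;
};

/* Copy a consistent snapshot out of src, retrying on concurrent updates. */
static void pvclock_read_stable(const volatile struct pvclock_snap *src,
				struct pvclock_snap *out)
{
	uint32_t version;

	do {
		/* Masking the low bit makes an odd (in-progress) version mismatch below */
		version = src->version & ~1u;
		__atomic_thread_fence(__ATOMIC_ACQUIRE);

		out->tsc_timestamp = src->tsc_timestamp;
		out->system_time = src->system_time;

		__atomic_thread_fence(__ATOMIC_ACQUIRE);
	} while (version != src->version);

	out->version = version;
}
```

If the writer bumps the version to an odd value before updating and to
the next even value afterwards, a reader using this loop never returns
fields taken from two different updates.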

David Woodhouse (31):
  KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init()
  KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force
  KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC
  KVM: x86: Activate master clock immediately on vCPU creation
  KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration
  KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host
  KVM: x86: Fold __get_kvmclock() into get_kvmclock()
  KVM: x86: Restructure get_kvmclock()
  KVM: x86: Fix KVM clock precision in get_kvmclock() with TSC scaling
  KVM: x86: Use get_kvmclock() in kvm_get_wall_clock_epoch()
  KVM: x86: Fix compute_guest_tsc() to handle negative time deltas
  KVM: x86: Restructure kvm_guest_time_update() for TSC upscaling
  KVM: x86: Simplify and comment kvm_get_time_scale()
  KVM: x86: Remove implicit rdtsc() from kvm_compute_l1_tsc_offset()
  KVM: x86: Improve synchronization in kvm_synchronize_tsc()
  KVM: x86: Kill last_tsc_{nsec,write,offset} fields
  KVM: x86: Replace nr_vcpus_matched_tsc count with all_vcpus_matched_tsc bool
  KVM: x86: Allow KVM master clock mode when TSCs are offset from each other
  KVM: x86: Factor out kvm_use_master_clock()
  KVM: x86: Avoid gratuitous global clock updates
  KVM: x86/xen: Prevent runstate times from becoming negative
  KVM: x86: Avoid redundant masterclock updates from multiple vCPUs
  KVM: x86: Remove runtime Xen TSC frequency CPUID update
  KVM: x86: Re-synchronize TSC after KVM_SET_TSC_KHZ
  KVM: x86: Use ktime_get_snapshot_id() for master clock
  KVM: x86: Compute kvmclock base without pvclock_gtod_data
  KVM: x86: Replace pvclock_gtod_data vclock_mode with boolean
  KVM: x86: Remove pvclock_gtod_data and private timekeeping code
  KVM: selftests: Add master clock offset test
  KVM: selftests: Add Xen/generic CPUID timing leaf test
  KVM: selftests: Add Xen runstate migration test

Jack Allister (3):
  UAPI: x86: Move pvclock-abi to UAPI for x86 platforms
  KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration
  KVM: selftests: Add KVM/PV clock selftest to prove timer correction

 Documentation/virt/kvm/api.rst                     |   37 +
 Documentation/virt/kvm/devices/vcpu.rst            |  119 ++-
 MAINTAINERS                                        |    4 +-
 arch/x86/include/asm/kvm_host.h                    |   16 +-
 arch/x86/include/uapi/asm/kvm.h                    |    6 +
 arch/x86/include/{ => uapi}/asm/pvclock-abi.h      |   27 +-
 arch/x86/kvm/cpuid.c                               |   16 -
 arch/x86/kvm/svm/svm.c                             |    3 +-
 arch/x86/kvm/vmx/vmx.c                             |    4 +-
 arch/x86/kvm/x86.c                                 | 1039 ++++++++++++--------
 arch/x86/kvm/xen.c                                 |   30 +-
 arch/x86/kvm/xen.h                                 |   13 -
 include/uapi/linux/kvm.h                           |    3 +
 scripts/xen-hypercalls.sh                          |    2 +-
 tools/testing/selftests/kvm/Makefile.kvm           |    4 +
 .../selftests/kvm/x86/masterclock_offset_test.c    |  180 ++++
 .../selftests/kvm/x86/pvclock_migration_test.c     |  382 +++++++
 tools/testing/selftests/kvm/x86/pvclock_test.c     |  441 +++++++++
 .../selftests/kvm/x86/xen_cpuid_timing_test.c      |  230 +++++
 .../testing/selftests/kvm/x86/xen_migration_test.c |  194 ++++
 20 files changed, 2263 insertions(+), 487 deletions(-)

base-commit: bc484a5096732cd858771cccd3164ec985bdc03d

