On Tue, May 19, 2026, David Woodhouse wrote:
> On Fri, 2026-05-15 at 12:19 -0700, Sean Christopherson wrote:
> > Dave/Thomas/Peter/Boris, what's the going rate for bribes to take something
> > like this through the tip tree?
> >
> > The bulk of the changes are in kvmclock and TSC, but pretty much every
> > hypervisor's guest-side code gets touched at some point. I am reasonably
> > confident in the correctness of the KVM changes. Michael tested Hyper-V in
> > v2, and while there were conflicts when rebasing, they were largely
> > superficial (and I've just jinxed myself). For all other hypervisors, assume
> > the code is compile-tested only, but those changes are all quite small and
> > straightforward.
> >
> > The only changes that are questionable/contentious are the last two patches,
> > which have KVM-as-a-guest use CPUID 0x16 to get the CPU frequency, even on
> > AMD (that's the dubious part). I very deliberately put them last, so that
> > they can be dropped at will (I don't care terribly if those patches land).
> > To merge them, I would want explicit Acks from Paolo and David W.
> >
> > So, except for the last two patches, to get the stuff I really care about
> > landed, I think/hope it's just the TSC and guest-side CoCo changes that need
> > reviews/acks?
> >
> > The primary goal of this series is (or at least was, when I started) to
> > fix flaws with SNP and TDX guests where a PV clock provided by the untrusted
> > hypervisor is used instead of the secure/trusted TSC that is controlled by
> > trusted firmware.
> >
> > The secondary goal is to draft off of the SNP and TDX changes to slightly
> > modernize running under KVM. Currently, KVM guests will use TSC for
> > clocksource, but not sched_clock. And they ignore Intel's CPUID-based TSC
> > and CPU frequency enumeration, even when using the TSC instead of kvmclock.
> > And if the host provides the core crystal frequency in CPUID.0x15, then KVM
> > guests can use that for the APIC timer period instead of manually calibrating
> > the frequency.
> >
> > The tertiary goal is to clean up all of the PV clock code to deduplicate logic
> > across hypervisors, and to hopefully make it all easier to maintain going
> > forward.
>
> I booted this in qemu with -cpu host,+invtsc,+vmware-cpuid-freq
>
> I was expecting to see it eschew the kvmclock and use *only* the TSC.
> Is there even any need for 'tsc-early' given that it's *told* the TSC
> frequency in CPUID? Shouldn't it have detected that the TSC is known
> before init_tsc_clocksource() runs?
>
> And then it even spent some time at boot actually using the kvmclock as
> clocksource... when ideally I don't think it would even have *enabled*
> it at all?

Yeah, that's definitely the ideal state. And I had all the same expectations
and observations as you when digging in and testing this. But unless this
series makes things worse, I want to punt on achieving the ideal state for the
moment, as it's proving to be a big lift just to get to a not-awful state.

> [ 0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [ 0.000000] tsc: Detected 2400.000 MHz processor
> [ 0.008205] TSC deadline timer available
> [ 0.008270] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
> [ 0.159085] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
> [ 0.164074] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x22983777dd9, max_idle_ns: 440795300422 ns
> [ 0.229087] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
> [ 0.337095] clocksource: Switched to clocksource kvm-clock
> [ 0.345246] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
> [ 0.356201] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x22983777dd9, max_idle_ns: 440795300422 ns
> [ 0.360560] clocksource: Switched to clocksource tsc
>
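
For anyone following along, here is a rough sketch of the CPUID.0x15 arithmetic
referred to above. It is an illustration only, not code from this series: the
helper name tsc_hz_from_cpuid15() is made up, and the caller is assumed to have
already read the leaf (EAX = ratio denominator, EBX = ratio numerator, ECX =
core crystal frequency in Hz, any of which may be reported as 0).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the series): derive the TSC frequency in Hz
 * from the three CPUID.0x15 outputs.
 *
 *   denominator = CPUID.0x15:EAX
 *   numerator   = CPUID.0x15:EBX
 *   crystal_hz  = CPUID.0x15:ECX
 *
 * Returns 0 if any field is not enumerated, in which case the caller would
 * have to fall back to some other source (e.g. calibration).
 */
static uint64_t tsc_hz_from_cpuid15(uint32_t denominator, uint32_t numerator,
				    uint32_t crystal_hz)
{
	if (!denominator || !numerator || !crystal_hz)
		return 0;

	/* Widen before multiplying so the product cannot overflow 32 bits. */
	return (uint64_t)crystal_hz * numerator / denominator;
}
```

E.g. a 24 MHz crystal with a 200/2 ratio works out to 2,400,000,000 Hz. Those
inputs are picked purely for the example; they are not taken from the boot log
above.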

