rg/all/zojf2dmbgw%2fzv...@google.com
>
> > 2. The sched_clock.
> >
> > The scheduling is impacted if there is big drift.
>
> ...
>
> > Unfortunately, the "no-kvmclock" kernel parameter disables all pv clock
> > operations (not only sched_clock
On Thu, Oct 19, 2023, David Woodhouse wrote:
> On Thu, 2023-10-19 at 08:40 -0700, Sean Christopherson wrote:
> > > If for some 'historical reasons' we can't revoke features we can always
> > > introduce a new PV feature bit saying that TSC is preferred.
>
On Tue, Jun 04, 2024, Kirill A. Shutemov wrote:
> On Mon, Jun 03, 2024 at 06:37:45AM -0700, Dave Hansen wrote:
> > On 6/2/24 04:54, Kirill A. Shutemov wrote:
> > > Sean observed that the compiler is generating inefficient code to clear
> > > the tdx_module_args struct for TDCALL and SEAMCALL wrappe
On Fri, Aug 02, 2024, David Woodhouse wrote:
> On Thu, 2024-08-01 at 20:54 +0200, Thomas Gleixner wrote:
> > On Thu, Aug 01 2024 at 16:14, Michael Kelley wrote:
> > > I don't have a convenient way to test my sequence on KVM.
> >
> > But still fails in KVM
>
> By KVM you mean the in-kernel one tha
On Fri, Aug 02, 2024, David Woodhouse wrote:
> On Fri, 2024-08-02 at 07:55 -0700, Sean Christopherson wrote:
> > On Fri, Aug 02, 2024, David Woodhouse wrote:
> > > On Thu, 2024-08-01 at 20:54 +0200, Thomas Gleixner wrote:
> > > > On Thu, Aug 01 2024 at 16:14, Mic
On Wed, Aug 28, 2024, Rick P Edgecombe wrote:
> On Tue, 2024-06-04 at 12:34 -0700, Sean Christopherson wrote:
> >
> > If we're willing to suffer a few gnarly macros, I think we get a
> > satisfactory mix of standardized arguments and explicit operands, and
> > gene
Dropping a few people/lists whose emails are bouncing.
On Fri, Jan 31, 2025, Sean Christopherson wrote:
> @@ -369,6 +369,11 @@ void __init kvmclock_init(void)
> #ifdef CONFIG_X86_LOCAL_APIC
> x86_cpuinit.early_percpu_clock_init = kvm_setup_secondary_clock;
On Sat, Feb 08, 2025, Michael Kelley wrote:
> From: Sean Christopherson Sent: Friday, February 7, 2025
> 9:23 AM
> >
> > Dropping a few people/lists whose emails are bouncing.
> >
> > On Fri, Jan 31, 2025, Sean Christopherson wrote:
> > > @@ -369,6 +3
On Tue, Feb 11, 2025, Borislav Petkov wrote:
> On Fri, Jan 31, 2025 at 06:17:05PM -0800, Sean Christopherson wrote:
>
> > Add a TODO to call out that AMD_MEM_ENCRYPT is a mess and doesn't depend on
> > HYPERVISOR_GUEST because it gates both guest and host code.
>
> W
On Wed, Feb 12, 2025, Michael Kelley wrote:
> From: Sean Christopherson Sent: Monday, February 10, 2025
> 8:22 AM
> > On Sat, Feb 08, 2025, Michael Kelley wrote:
> > > But I would be good with some restructuring so that setting the sched
> > > clock
> > >
warts created by kvmclock.
- Fix more bugs in kvmclock's suspend/resume handling.
- Try to harden kvmclock against future bugs.
v1: https://lore.kernel.org/all/20250201021718.699411-1-sea...@google.com
Sean Christopherson (38):
x86/tsc: Add standalone helpers for getting TSC info
e TSC
frequency based on CPUID.0x16 when the core crystal frequency isn't known.
Opportunistically drop native_calibrate_tsc()'s "== 0" and "!= 0" checks
in favor of the kernel's preferred style.
No functional change intended.
Signed-off-by: Sean Christophers
functional change intended.
Signed-off-by: Sean Christopherson
---
arch/x86/include/asm/tsc.h | 1 +
arch/x86/kernel/tsc.c | 37 +++--
2 files changed, 24 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/t
guard against unwanted
usage. Add a TODO to call out that AMD_MEM_ENCRYPT is a mess and doesn't
depend on HYPERVISOR_GUEST because it gates both guest and host code.
No functional change intended.
Signed-off-by: Sean Christopherson
---
arch/x86/coco/sev/core.c | 4 ++--
arch/x86/i
dacky
Reviewed-by: Nikunj A Dadhania
Signed-off-by: Sean Christopherson
---
arch/x86/coco/sev/core.c | 3 ---
arch/x86/kernel/tsc.c| 3 ++-
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index e6ce4ca72465..dab386f782ce 1
explaining that CoCo TSC initialization
needs to come after hypervisor specific initialization.
Cc: Kirill A. Shutemov
Signed-off-by: Sean Christopherson
---
arch/x86/coco/tdx/tdx.c| 30 +++---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/kernel/tsc.c | 8
Mark the TSC frequency as known when using ACRN's PV CPUID information.
Per commit 81a71f51b89e ("x86/acrn: Set up timekeeping") and common sense,
the TSC freq is explicitly provided by the hypervisor.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/cpu/acrn.c | 1 +
1
hich will be reduced in future cleanups) doesn't
meaningfully pollute generic code.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/cpu/mshyperv.c | 58 --
drivers/clocksource/hyperv_timer.c | 50 ++
2 files changed, 50 insertions(
Now that all of the Hyper-V timer sched_clock code is located in a single
file, drop the superfluous wrappers for the save/restore flows.
No functional change intended.
Signed-off-by: Sean Christopherson
---
drivers/clocksource/hyperv_timer.c | 34 +-
include
functional change intended.
Signed-off-by: Sean Christopherson
---
arch/x86/include/asm/paravirt.h | 7 ++-
arch/x86/kernel/kvmclock.c | 5 +
arch/x86/kernel/paravirt.c | 6 +-
3 files changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/paravirt.h b
PIC
got carried forward unnecessarily.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index b898b95a7d50..80d1a06609c8 100644
--- a/arch/x86/kernel/k
re was, it belongs in common PV code).
Signed-off-by: Sean Christopherson
---
drivers/clocksource/hyperv_timer.c | 10 --
1 file changed, 10 deletions(-)
diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index 5a52d0acf31f..4a21874e91b9 100644
---
to kvmclock's suspend/resume logic.
Fixes: c02027b5742b ("x86/kvm: Disable kvmclock on all CPUs on shutdown")
Cc: sta...@vger.kernel.org
Signed-off-by: Sean Christopherson
---
arch/x86/include/asm/kvm_para.h | 8 +++-
arch/x86/kernel/kvm.c | 15 +++
Move kvmclock's sched_clock save/restore helper "up" so that they can
(eventually) be referenced by kvm_sched_clock_init().
No functional change intended.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c | 108 ++---
1 f
Nullify the x86_platform sched_clock save/restore hooks when setting up
Xen's PV clock to make it somewhat obvious the hooks aren't used when
running as a Xen guest (Xen uses a paravirtualized suspend/resume flow).
Signed-off-by: Sean Christopherson
---
arch/x86/xen/time.c | 6
's safe/correct for VMware guests to
do nothing on suspend/resume, but that's a pre-existing problem. Leave it
for a VMware expert to sort out.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/cpu/vmware.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86
Now that all PV clocksources override the sched_clock save/restore hooks
when overriding sched_clock, WARN if the "default" TSC hooks are invoked
when using a PV sched_clock, e.g. to guard against regressions.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/tsc.c | 4 ++--
1 fi
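The hardening idea described above (warn when the default TSC save/restore hooks run even though a PV sched_clock is in use) can be illustrated with a small user-space sketch. Everything here is hypothetical: the struct, the function names, and the warning counter are stand-ins chosen for illustration, not the kernel's x86_platform hooks or its WARN machinery.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: set once a PV clock takes over sched_clock. */
static bool using_pv_sched_clock;

/* Counts how often the sketch "warned"; the kernel would WARN instead. */
static unsigned int warn_count;

/* Stand-in for the default TSC save hook. */
static void tsc_save_state_sketch(void)
{
	/* The default hook should never run once a PV sched_clock is active. */
	if (using_pv_sched_clock)
		warn_count++;
}

/* Stand-in for a PV clock's own save hook (nothing to do in this sketch). */
static void pv_save_state_sketch(void)
{
}

/* Hypothetical hook table, initialized to the default TSC hook. */
struct sched_clock_hooks_sketch {
	void (*save)(void);
};

static struct sched_clock_hooks_sketch hooks = {
	.save = tsc_save_state_sketch,
};

/*
 * Register a PV sched_clock. Passing NULL models a buggy PV clock that
 * overrides sched_clock but forgets to override the save hook.
 */
static void register_pv_sched_clock_sketch(void (*save)(void))
{
	using_pv_sched_clock = true;
	if (save)
		hooks.save = save;
}
```

With this shape, a PV clock that forgets to install its own hook is caught the first time the default hook runs, rather than silently saving or restoring the wrong state.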
that has
access to a "secure" TSC.
No functional change intended.
Signed-off-by: Sean Christopherson
---
arch/x86/include/asm/paravirt.h| 8 +---
arch/x86/kernel/cpu/vmware.c | 7 ++-
arch/x86/kernel/kvmclock.c | 5 ++---
arch/x86/kernel/paravirt.c
int to inline a
single-use function, and an extra CALL+RET pair during boot is a complete
non-issue. And, if the compiler ignores the hint and does NOT inline the
function, the resulting code may not get discarded after boot due to lack of
an __init annotation.
No functional change intended.
Signed-off
Now that Xen PV clock and kvmclock explicitly do setup only during init,
tag the common PV clock flags/vsyscall variables and their mutators with
__init.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/pvclock.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is set could very, very theoretically
result in a change in behavior. In practice, the kernel only supports a
single PV clock.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c | 15 +++
1 file changed, 11 insertions(+), 4 deletions
WARN if the common PV clock valid_flags are overwritten; all PV clocks
expect that they are the one and only PV clock, i.e. don't guard against
another PV clock having modified the flags.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/pvclock.c | 1 +
1 file changed, 1 inse
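A minimal user-space sketch of the check described above, assuming a single shared flags variable that only one PV clock is expected to set. The names pvclock_set_flags_sketch() and warn_on() are illustrative stand-ins, not the kernel's helpers, and the real patch may place the check differently.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the common PV clock valid_flags variable. */
static uint8_t valid_flags;

/* Counts how often the sketch "warned". */
static unsigned int warn_count;

/* User-space stand-in for a kernel-style WARN; returns the condition. */
static int warn_on(int cond, const char *what)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Set the flags, warning if some other PV clock already set them. The
 * flags are still overwritten, i.e. the check reports the problem and
 * does not try to guard against it.
 */
static void pvclock_set_flags_sketch(uint8_t flags)
{
	warn_on(valid_flags != 0, "PV clock valid_flags overwritten");
	valid_flags = flags;
}
```

The first caller sets the flags silently; a second caller triggers the warning, which is the "one and only PV clock" assumption made visible.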
Annotate xen_setup_vsyscall_time_info() as being used only during kernel
initialization; it's called only by xen_time_init(), which is already
tagged __init.
Signed-off-by: Sean Christopherson
---
arch/x86/xen/time.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arc
ource supported by the kernel
depends on a persistent clock.
Signed-off-by: Sean Christopherson
---
kernel/time/timekeeping.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 1e67d076f195..332d053fa9ce 100644
Save/restore kvmclock across suspend/resume via clocksource hooks when
kvmclock isn't being used for sched_clock. This will allow using kvmclock
as a clocksource (or for wallclock!) without also using it for sched_clock.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c
WARN if kvmclock is still suspended when its wallclock is read, i.e. when
the kernel reads its persistent clock. The wallclock subtly depends on
the BSP's kvmclock being enabled, and returns garbage if kvmclock is
disabled.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.
TSC_RELIABLE when overriding the TSC calibration routine.
Cc: Tom Lendacky
Reviewed-by: Nikunj A Dadhania
Signed-off-by: Sean Christopherson
---
arch/x86/coco/sev/core.c | 2 ++
arch/x86/mm/mem_encrypt_amd.c | 3 ---
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86
rse" routine.
As a bonus, the flags also give developers working on new PV code a heads
up that they should at least mark the TSC as having a known frequency.
Signed-off-by: Sean Christopherson
---
arch/x86/coco/sev/core.c | 6 ++
arch/x86/coco/tdx/tdx.c| 7 ++-
s are invoked depends entirely on when a subsystem is
initialized and thus registers its hooks.
Opportunistically make the registration messages more precise to help
debug issues where kvmclock is enabled too late.
Opportunistically WARN in kvmclock_{suspend,resume}() to harden against
future b
igned-off-by: Sean Christopherson
---
arch/x86/include/asm/paravirt.h | 9 +
arch/x86/kernel/paravirt.c | 4 ++--
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index dc26a3c26527..e6d5e77753c4 100644
extra" beyond
simply registering itself as sched_clock, i.e. is the only caller that
needs to check the new return value.
Signed-off-by: Sean Christopherson
---
arch/x86/include/asm/paravirt.h | 6 +++---
arch/x86/kernel/kvmclock.c | 7 +--
arch/x86/kernel/paravirt.c | 5 +++-
it's nonsensical, especially if the
hypervisor explicitly enumerates the CPU frequency.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c | 16 +++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmcl
TSC on such platforms is faster than any PV clock, and sched_clock
is all about speed.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/paravirt.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index a3a1359cfc26
rusted source of
information (hardware/firmware) is being discarded in favor of a less
trusted source (hypervisor).
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/tsc.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index be58df
ksource is usually
emulated in host userspace, i.e. reading the clock incurs a roundtrip
cost of thousands of cycles.
Marking the TSC reliable addresses a flaw where the TSC will occasionally
be marked unstable if the host is under moderate/heavy load.
Signed-off-by: Sean Christopherson
---
arch/
. acknowledging PVCLOCK_GUEST_STOPPED
needs to be decoupled from sched_clock() no matter what.
Link: https://lore.kernel.org/all/z4hdk27ov7wk5...@google.com
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/
printk.
Rearranging the hook doesn't exactly reduce complexity; arguably it does
the opposite. But as-is, it's practically impossible to understand *why*
kvmclock needs to do early configuration.
Signed-off-by: Sean Christopherson
---
arch/x86/include/asm/paravirt.h | 10 --
using CPUID.0x15 will allow stuffing the local APIC timer
frequency based on the core crystal frequency, i.e. will allow skipping
APIC timer calibration.
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c | 15 ++-
1 file changed, 10 insertions(+), 5 deletions(-)
diff
frequency
is enumerated via CPUID.0x15.
The APIC timer frequency will be the processor’s bus clock or core
crystal clock frequency (when TSC/core crystal clock ratio is enumerated
in CPUID leaf 0x15).
Signed-off-by: Sean Christopherson
---
arch/x86/kernel/kvmclock.c | 12 +++-
1 fi
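As a rough illustration of how the CPUID.0x15 ratio ties the TSC frequency to the core crystal frequency, here is a user-space sketch. It takes the raw register values as inputs instead of executing CPUID. The struct and function names are hypothetical. The fallback that derives the crystal frequency from the CPUID.0x16 base frequency is an assumption modeled on the behavior the series describes for native_calibrate_tsc(), not a copy of the kernel code.

```c
#include <stdint.h>

/* Hypothetical container for raw CPUID values; field names are illustrative. */
struct cpuid_tsc_info {
	uint32_t denominator;	/* CPUID.0x15 EAX: ratio denominator */
	uint32_t numerator;	/* CPUID.0x15 EBX: ratio numerator */
	uint32_t crystal_hz;	/* CPUID.0x15 ECX: crystal Hz, 0 if unknown */
	uint32_t base_mhz;	/* CPUID.0x16 EAX: base MHz, 0 if unknown */
};

/* Core crystal frequency in kHz, or 0 if it cannot be determined. */
static uint32_t crystal_khz_from_cpuid(const struct cpuid_tsc_info *info)
{
	/* Without a valid TSC/crystal ratio, nothing can be derived. */
	if (!info->denominator || !info->numerator)
		return 0;

	/* Preferred: the crystal frequency is enumerated directly. */
	if (info->crystal_hz)
		return info->crystal_hz / 1000;

	/* Assumed fallback: back-compute the crystal from the base frequency. */
	if (info->base_mhz)
		return (uint32_t)((uint64_t)info->base_mhz * 1000 *
				  info->denominator / info->numerator);

	return 0;
}

/* TSC frequency in kHz: crystal * numerator / denominator, or 0 if unknown. */
static uint32_t tsc_khz_from_cpuid(const struct cpuid_tsc_info *info)
{
	uint32_t crystal_khz = crystal_khz_from_cpuid(info);

	if (!crystal_khz)
		return 0;

	return (uint32_t)((uint64_t)crystal_khz * info->numerator /
			  info->denominator);
}
```

In this sketch the value returned by crystal_khz_from_cpuid() is what would serve as the APIC timer frequency when the ratio is enumerated, which is why knowing it allows skipping APIC timer calibration.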
On Fri, Feb 28, 2025, David Woodhouse wrote:
> On Wed, 2025-02-26 at 18:18 -0800, Sean Christopherson wrote:
> > This... snowballed a bit.
> >
> > The bulk of the changes are in kvmclock and TSC, but pretty much every
> > hypervisor's guest-side code gets touched
On Tue, Mar 04, 2025, Michael Kelley wrote:
> From: Sean Christopherson Sent: Wednesday, February 26,
> 2025 6:18 PM
> >
> > Register the Hyper-V timer callbacks or saving/restoring its PV sched_clock
>
> s/or/for/
>
> > if and only if the timer is ac