On Mon, 22 Dec 2014 16:06:04 +1100 Cyril Bur <cyril...@gmail.com> wrote:

> On POWER8 virtualised kernels the VTB register can be read to have a view of
> time that only increases while the guest is running. This will prevent guests
> from seeing time jump if a guest is paused for significant amounts of time.
>
> On POWER7 and below virtualised kernels stolen time is subtracted from
> sched_clock as a best effort approximation. This will not eliminate spurious
> warnings in the case of a suspended guest but may reduce the occurrence in the
> case of softlockups due to host over commit.
>
> Bare metal kernels should avoid reading the VTB as KVM does not restore sane
> values when not executing. sched_clock is returned in this case.
>
> --- a/arch/powerpc/kernel/time.c
> +++ b/arch/powerpc/kernel/time.c
> @@ -621,6 +621,30 @@ unsigned long long sched_clock(void)
>  	return mulhdu(get_tb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
>  }
>
> +unsigned long long running_clock(void)

Non-KVM kernels don't need this code.  Is there some appropriate
"#ifdef CONFIG_foo" we can wrap this in?

> +{
> +	/*
> +	 * Don't read the VTB as a host since KVM does not switch in host timebase
> +	 * into the VTB when it takes a guest off the CPU, reading the VTB would
> +	 * result in reading 'last switched out' guest VTB.
> +	 */
> +
> +	if (firmware_has_feature(FW_FEATURE_LPAR)) {
> +		if (cpu_has_feature(CPU_FTR_ARCH_207S))
> +			return mulhdu(get_vtb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
> +
> +		/* This is a next best approximation without a VTB. */
> +		return sched_clock() - cputime_to_nsecs(kcpustat_this_cpu->cpustat[CPUTIME_STEAL]);

Why is this result dependent on FW_FEATURE_LPAR?  It's all generic code.
In fact the kernel/sched/clock.c default implementation of
running_clock() could use this expression.  Would that be good or bad? :)

> +	}
> +
> +	/*
> +	 * On a host which doesn't do any virtualisation TB *should* equal VTB so
> +	 * it makes no difference anyway.
> +	 */
> +
> +	return sched_clock();
> +}
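
For readers following the three cases in the quoted hunk, here is a rough
userspace model of the selection logic. It is only a sketch and not kernel
code: struct clock_inputs, mulhdu_model(), sched_clock_model() and
running_clock_model() are hypothetical names, the two booleans stand in for
firmware_has_feature(FW_FEATURE_LPAR) and cpu_has_feature(CPU_FTR_ARCH_207S),
and mulhdu is assumed to return the high 64 bits of a 64x64-bit product.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the values the quoted running_clock() consults. */
struct clock_inputs {
	bool is_lpar_guest;      /* stands in for firmware_has_feature(FW_FEATURE_LPAR) */
	bool has_vtb;            /* stands in for cpu_has_feature(CPU_FTR_ARCH_207S) */
	uint64_t tb;             /* timebase ticks, as get_tb() would return */
	uint64_t vtb;            /* virtual timebase ticks, as get_vtb() would return */
	uint64_t boot_tb;        /* timebase value recorded at boot */
	uint64_t steal_ns;       /* accumulated steal time, already in nanoseconds */
	uint64_t tb_to_ns_scale;
	unsigned tb_to_ns_shift;
};

/* High 64 bits of a 64x64-bit unsigned product (assumed mulhdu behaviour). */
uint64_t mulhdu_model(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t lo_lo = a_lo * b_lo;
	uint64_t hi_lo = a_hi * b_lo;
	uint64_t lo_hi = a_lo * b_hi;
	uint64_t hi_hi = a_hi * b_hi;
	/* Sum the middle partial products plus the carry out of the low word. */
	uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

	return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* Same expression as the quoted sched_clock(), fed from the snapshot. */
uint64_t sched_clock_model(const struct clock_inputs *in)
{
	return mulhdu_model(in->tb - in->boot_tb, in->tb_to_ns_scale)
		<< in->tb_to_ns_shift;
}

/* Mirrors the three branches of the quoted running_clock(). */
uint64_t running_clock_model(const struct clock_inputs *in)
{
	if (in->is_lpar_guest) {
		/* Guest with a VTB: count only ticks while the guest ran. */
		if (in->has_vtb)
			return mulhdu_model(in->vtb - in->boot_tb,
					    in->tb_to_ns_scale)
				<< in->tb_to_ns_shift;

		/* Guest without a VTB: approximate by removing steal time. */
		return sched_clock_model(in) - in->steal_ns;
	}

	/* Host: plain sched_clock(). */
	return sched_clock_model(in);
}
```

With a scale of 1ULL << 63 and a shift of 1, an even tick count converts to the
same number of nanoseconds, which makes the branches easy to check by hand.
Dropping the is_lpar_guest test from the steal-time branch gives the shape of
the generic fallback asked about above.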