On Fri, 22 Jun 2018, Andy Lutomirski wrote: > On Fri, Jun 22, 2018 at 3:47 PM Thomas Gleixner <t...@linutronix.de> wrote: > > On Mon, 4 Jun 2018, Michael Rodin wrote: > > > > > The variable "vclocks_used" doesn't appear to be "read mostly". > > > Measurements of the access frequency with perf stat [1] and > > > perf report show, that approximately half of the accesses to > > > this variable are write accesses and happen in update_vsyscall() > > > in the file arch/x86/entry/vsyscall/vsyscall_gtod.c. > > > The measurements were done with the kernel 4.13.0-43-generic used by > > > ubuntu as well as with the stable kernel 4.16.7 with a custom config. > > > I've used "perf bench sched" and iperf3 as workloads. > > > > > > This was discovered during my master thesis in the CADOS project [2]. > > > > Nice find, but ... > > > > The point is that the content of that variable changes once in a blue moon, > > so the intent of marking it read_mostly is almost correct. > > I would propose a rather different fix. Add a an > arch_change_clocksource() function. Do: > > static inline void arch_change_clocksource(struct clocksource > *new_clocksource) { ... } > #define arch_change_clocksource arch_change_clocksource > > and > > #ifndef arch_change_clocksource > static inline void arch_change_clocksource(struct clocksource > *new_clocksource) {} > #endif > > in the generic header. In change_clocksource(), add a call to > arch_change_clocksource() right after tk_setup_internals(). In x86's > arch_change_clocksource, update vclocks_used. > > Now it's genuinely read_mostly, and we don't need to touch that > cacheline at all in the normal clock tick code. Everyone wins. > (vclocks_used is actually rather rarely read. It's only used to > prevent user code from accessing a never-used clocksource through the > vvar area, which is a hardening measure. It's only referenced from > the vvar fault handler code.)
Agreed. Thanks, tglx