On Thu, Jul 23, 2020 at 03:04:13PM -0400, Waiman Long wrote:
> On 7/23/20 2:47 PM, pet...@infradead.org wrote:
> > On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
> > > BTW, do you have any comment on my v2 lock holder cpu info qspinlock
> > > patch? I will have to update the patch to fix the reported 0-day test
> > > problem, but I want to collect other feedback before sending out v3.
> >
> > I want to say I hate it all, it adds instructions to a path we spend an
> > awful lot of time optimizing without really getting anything back for
> > it.
>
> It does add some extra instructions that may slow it down slightly, but I
> don't agree that it gives nothing back. The cpu lock holder information can
> be useful in analyzing crash dumps and in some debugging situations. I think
> it can be useful in RHEL for this reason. How about an x86 config option to
> allow distros to decide if they want to have it enabled? I will make sure
> that it will have no performance degradation if the option is not enabled.

Config knobs suck too; they create a maintenance burden (we get to make
sure all the permutations work/build/etc..) and effectively nobody uses
them, since world+dog uses what distros pick.

Anyway, instead of adding a second per-cpu variable, can you see how
horrible something like this is:

	unsigned char adds(unsigned char var, unsigned char val)
	{
		unsigned short sat = 0xff, tmp = var;

		asm ("addb	%[val], %b[var];"
		     "cmovc	%[sat], %[var];"
		     : [var] "+r" (tmp)
		     : [val] "ir" (val), [sat] "r" (sat)
		     );

		return tmp;
	}

Another thing to try is, instead of threading that lockval throughout
the thing, simply:

	#define _Q_LOCKED_VAL	this_cpu_read_stable(cpu_sat)

or combined with the above

	#define _Q_LOCKED_VAL	adds(this_cpu_read_stable(cpu_number), 2)

and see if the compiler really makes a mess of things.
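For comparison, here is a portable C sketch of what the adds() asm above appears to compute: an 8-bit add that clamps to 0xff on carry instead of wrapping. The name adds_portable is hypothetical and is not part of the patch or the kernel. Whether a compiler turns it into the same add/cmovc pair is not guaranteed.

```c
#include <stdint.h>

/*
 * Hypothetical portable equivalent of the adds() asm: add two 8-bit
 * values and saturate at 0xff when the sum does not fit in 8 bits
 * (the case where addb would set the carry flag).
 */
static inline uint8_t adds_portable(uint8_t var, uint8_t val)
{
	/* Widen first so the overflow is visible in the result. */
	unsigned int sum = (unsigned int)var + (unsigned int)val;

	return sum > 0xff ? 0xff : (uint8_t)sum;
}
```

With val == 2, as in the second _Q_LOCKED_VAL definition, an input of 0 gives 2 and any input of 253 or above gives 0xff, so the result is never 0 or 1.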