> On August 6, 2019 at 7:14 AM Michael Ellerman <m...@ellerman.id.au> wrote:
>
>
> Christopher M Riedl <c...@informatik.wtf> writes:
> >> On August 2, 2019 at 6:38 AM Michael Ellerman <m...@ellerman.id.au> wrote:
> >> "Christopher M. Riedl" <c...@informatik.wtf> writes:
> >>
> >> This leaves us with a double test of is_shared_processor() doesn't it?
> >
> > Yep, and that's no good. Hmm, executing the barrier() in the
> > non-shared-processor case probably hurts performance here?
>
> It's only a "compiler barrier", so it shouldn't generate any code.
>
> But it does have the effect of telling the compiler it can't optimise
> across that barrier, which can be important.
>
> In those spin loops all we're doing is checking lock->slock which is
> already marked volatile in the definition of arch_spinlock_t, so the
> extra barrier shouldn't really make any difference.
>
> But still the current code doesn't have a barrier() there, so we should
> make sure we don't introduce one as part of this refactor.
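Ah, that clears it up. To check my understanding: barrier() is just an
empty asm with a "memory" clobber, so it emits no instructions, and in
these loops the volatile slock already forces a re-load on every
iteration anyway. A minimal userspace sketch of my understanding -
made-up names, not the actual powerpc code:

/* The kernel's compiler barrier: generates no code, but tells the
 * compiler not to cache memory values in registers across it. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Simplified stand-in for powerpc's arch_spinlock_t; the real slock
 * field is likewise declared volatile. */
typedef struct {
	volatile unsigned int slock;
} arch_spinlock_t;

/* Hypothetical loop, just for illustration. Because slock is volatile,
 * the compiler must re-read it on each iteration even without a
 * barrier(); an extra barrier() would be redundant for this load, but
 * would still forbid other optimisations across it - hence the point
 * about not introducing one where the current code has none. */
static inline void spin_until_free(arch_spinlock_t *lock)
{
	while (lock->slock != 0)
		; /* spin */
}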
Thank you for taking the time to explain this. I have some more reading
to do about compiler barriers, it seems :)

> So I think you just want to change the call to spin_yield() above to
> splpar_spin_yield(), which avoids the double check, and also avoids the
> barrier() in the SPLPAR=n case.
>
> And then arch_spin_relax() calls spin_yield() etc.

I submitted a v3 before your reply with this change already - I figured
this is the best way to avoid the double check and maintain the legacy
behavior. (A rough sketch of the shape of that change is at the end of
this mail.)

> cheers
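For reference, the v3 change has roughly the following shape
(paraphrased, not a verbatim copy of the patch):

/* spin_yield() keeps the old behaviour for external callers: yield on
 * shared processors, plain compiler barrier otherwise. */
static inline void spin_yield(arch_spinlock_t *lock)
{
	if (is_shared_processor())
		splpar_spin_yield(lock);
	else
		barrier();
}

/* The spin loops test is_shared_processor() themselves and call
 * splpar_spin_yield() directly - so no double check, and no barrier()
 * in the SPLPAR=n case (where splpar_spin_yield() is an empty stub). */
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
	while (1) {
		if (likely(__arch_spin_trylock(lock) == 0))
			break;
		do {
			HMT_low();
			if (is_shared_processor())
				splpar_spin_yield(lock);
		} while (unlikely(lock->slock != 0));
		HMT_medium();
	}
}

/* And arch_spin_relax() still maps to spin_yield(), as before. */
#define arch_spin_relax(lock)	spin_yield(lock)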