On Fri, May 08, 2020 at 06:23:01PM -0400, Thor Lancelot Simon wrote:
> On Fri, May 08, 2020 at 10:03:06AM +0200, Kamil Rytarowski wrote:
> > On 08.05.2020 02:14, Thor Lancelot Simon wrote:
> > >
> > > Not without performance penalty for every atomic operation, unless you
> > > propose to do this by binary patch as is done in the kernel.
> >
> > There is atomic penalty, but it is the contract and design of this (C
> > and C++) feature. Atomics can be legitimately lock-free or non-lock-free
> > and this is a feature.
>
> Adding an extra test or, worse, an indirect branch before every atomic
> operation makes it far worse than it has to be. An uncontended locked
> transaction on the bus may cost nothing. An indirect branch followed
> by the same transaction consumes all sorts of microarchitectural resources
> will not just be slower but also impact the performance of _other_ code
> too. That's why the kernel binary-patches out lock prefixes instead of
> using indirection for atomics.

The indirection only applies to the first call. The magic is within rtld.
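
To make the "first call" point concrete, here is a minimal sketch of resolve-once dispatch in plain C. It is only an analogy for lazy binding, not rtld code: every name in it (fetch_add, fetch_add_slot, fetch_add_resolve, fetch_add_lockfree, resolver_runs) is hypothetical, the resolver assumes a single lock-free implementation instead of probing the CPU, and the slot update is not made thread-safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; this is an analogy, not rtld code. */
static int resolver_runs;   /* counts how often the slow path ran */

/* The implementation the resolver settles on (assumed lock-free here). */
static int fetch_add_lockfree(atomic_int *p, int v)
{
    return atomic_fetch_add(p, v);
}

static int fetch_add_resolve(atomic_int *p, int v);

/* Starts out pointing at the resolver, much like an unbound PLT slot. */
static int (*fetch_add_slot)(atomic_int *, int) = fetch_add_resolve;

/* Slow path: runs on the first call only, then rebinds the slot. */
static int fetch_add_resolve(atomic_int *p, int v)
{
    resolver_runs++;
    /* A real resolver would pick per CPU; this sketch assumes one choice. */
    fetch_add_slot = fetch_add_lockfree;
    return fetch_add_slot(p, v);
}

/* Public entry point: always calls through the slot. */
int fetch_add(atomic_int *p, int v)
{
    assert(p != NULL);
    return fetch_add_slot(p, v);
}
```

Calling fetch_add() twice on the same counter runs fetch_add_resolve() once; the second call goes straight to fetch_add_lockfree() through the rebound slot. In this sketch the later calls are still a load and jump through the slot, so what disappears after the first call is the resolution work, not the slot itself.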